Add pipeline tag, license, and improve model card documentation
#1
by nielsr HF Staff - opened
README.md
CHANGED
|
@@ -1,4 +1,6 @@
|
|
| 1 |
---
|
| 2 |
tags:
|
| 3 |
- UniCharacter
|
| 4 |
- customized-multimodal-role-play
|
|
@@ -6,14 +8,15 @@ tags:
|
|
| 6 |
- character-customization
|
| 7 |
- text-to-image
|
| 8 |
- image-generation
|
| 9 |
-
- arxiv:2605.08129
|
| 10 |
---
|
| 11 |
|
| 12 |
# UniCharacter
|
| 13 |
|
| 14 |
UniCharacter is a collection of character-specific checkpoints for **Customized Multimodal Role-Play (CMRP)**, introduced in the paper [Towards Customized Multimodal Role-Play](https://arxiv.org/abs/2605.08129).
|
| 15 |
|
| 16 |
-
|
| 17 |
|
| 18 |
## Repository Contents
|
| 19 |
|
|
@@ -21,26 +24,52 @@ This repository contains separate checkpoint folders for multiple characters. Ea
|
|
| 21 |
|
| 22 |
Available character folders include:
|
| 23 |
|
| 24 |
-
- `Adrien_Brody`
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
|
| 32 |
-
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
-
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
|
| 41 |
-
|
| 42 |
-
|
| 43 |
-
|
| 44 |
|
| 45 |
## Download
|
| 46 |
|
|
@@ -56,12 +85,6 @@ Download a single character checkpoint folder:
|
|
| 56 |
huggingface-cli download Tangc03/UniCharacter --include "Hermione/*" --local-dir UniCharacter
|
| 57 |
```
|
| 58 |
|
| 59 |
-
## Paper
|
| 60 |
-
|
| 61 |
-
This model is associated with the following paper:
|
| 62 |
-
|
| 63 |
-
- [Towards Customized Multimodal Role-Play](https://arxiv.org/abs/2605.08129)
|
| 64 |
-
|
| 65 |
## Citation
|
| 66 |
|
| 67 |
If you use UniCharacter, please cite:
|
|
@@ -69,8 +92,8 @@ If you use UniCharacter, please cite:
|
|
| 69 |
```bibtex
|
| 70 |
@article{tang2026towards,
|
| 71 |
title={Towards Customized Multimodal Role-Play},
|
| 72 |
-
author={Tang, Chao and Wu, Jianzong
|
| 73 |
journal={arXiv preprint arXiv:2605.08129},
|
| 74 |
year={2026}
|
| 75 |
}
|
| 76 |
-
```
|
|
|
|
| 1 |
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
pipeline_tag: any-to-any
|
| 4 |
tags:
|
| 5 |
- UniCharacter
|
| 6 |
- customized-multimodal-role-play
|
|
|
|
| 8 |
- character-customization
|
| 9 |
- text-to-image
|
| 10 |
- image-generation
|
|
|
|
| 11 |
---
|
| 12 |
|
| 13 |
# UniCharacter
|
| 14 |
|
| 15 |
UniCharacter is a collection of character-specific checkpoints for **Customized Multimodal Role-Play (CMRP)**, introduced in the paper [Towards Customized Multimodal Role-Play](https://arxiv.org/abs/2605.08129).
|
| 16 |
|
| 17 |
+
[**Project Page**](https://tangc03.github.io/UniCharacter.github.io/) | [**GitHub**](https://github.com/Tangc03/UniCharacter) | [**Paper**](https://arxiv.org/abs/2605.08129)
|
| 18 |
+
|
| 19 |
+
UniCharacter customizes a character's persona, dialogue style, and visual identity so that the character responds consistently across text and image generation. Built on a unified multimodal model, it uses a two-stage training framework consisting of Unified Supervised Finetuning (Unified-SFT) and character-specific Group Relative Policy Optimization (Character-GRPO).
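
For readers unfamiliar with group relative policy optimization, the "group relative" part can be illustrated with a minimal sketch. It assumes the common formulation in which each sampled response's reward is normalized by the mean and standard deviation of its own group. The function name `group_relative_advantages`, the reward values, and the choice of population standard deviation are illustrative assumptions; the character-specific reward design of Character-GRPO is not reproduced here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses sampled for the same prompt.

    Generic GRPO-style normalization (an assumption, not the paper's exact recipe):
    advantage_i = (reward_i - group_mean) / (group_std + eps)
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; eps avoids division by zero
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses to one prompt.
rewards = [0.2, 0.8, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses scoring above the group mean receive positive advantages and those below receive negative ones, so no separate value network is needed to provide a baseline.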
|
| 20 |
|
| 21 |
## Repository Contents
|
| 22 |
|
|
|
|
| 24 |
|
| 25 |
Available character folders include:
|
| 26 |
|
| 27 |
+
- `Adrien_Brody`, `Bo`, `Butin`, `Chandler`, `Coco`, `Furina`, `Gao_Qiqiang`, `Hermione`, `Ichihime`, `Joey`, `Leonardo`, `Mam`, `Miki_Nikaidou`, `Mydieu`, `Pikachu`, `Rin_Tohsaka`, `Saber`, `Will_In_Vietnam`, `Wukong`, `YuiYagi`
|
| 28 |
+
|
| 29 |
+
## Quick Usage Example
|
| 30 |
+
|
| 31 |
+
To use these checkpoints, please follow the installation instructions in the [official repository](https://github.com/Tangc03/UniCharacter). Below is an example of the unified inference interface:
|
| 32 |
+
|
| 33 |
+
```python
|
| 34 |
+
from inference import create_unicharacter_inference
|
| 35 |
+
from pathlib import Path
|
| 36 |
+
|
| 37 |
+
# Initialize the unified inference (modify paths according to your environment)
|
| 38 |
+
inference = create_unicharacter_inference(
|
| 39 |
+
model_path="models/BAGEL-7B-MoT",
|
| 40 |
+
checkpoint_path="<checkpoint_path>",
|
| 41 |
+
vit_checkpoint_path="<vit_checkpoint_path>",
|
| 42 |
+
max_mem_per_gpu="40GiB",
|
| 43 |
+
seed=42,
|
| 44 |
+
)
|
| 45 |
+
|
| 46 |
+
out_dir = Path("test_images/outputs")
|
| 47 |
+
out_dir.mkdir(parents=True, exist_ok=True)
|
| 48 |
+
|
| 49 |
+
# 1) Text-to-image generation (Role T2I)
|
| 50 |
+
res = inference.generate_image("Ichihime chasing a butterfly")
|
| 51 |
+
res["image"].save(out_dir / "t2i_ichihime.png")
|
| 52 |
+
|
| 53 |
+
# 2) Visual understanding / VQA
|
| 54 |
+
res = inference.visual_understanding(
|
| 55 |
+
"data/personalized_data/train/Mahjong Soul-Ichihime/1.png",
|
| 56 |
+
"What's the color of Ichihime's hair?",
|
| 57 |
+
)
|
| 58 |
+
print("VQA:", res["text"])
|
| 59 |
+
|
| 60 |
+
# 3) Knowledge QA
|
| 61 |
+
res = inference.knowledge_qa("When were you born?")
|
| 62 |
+
print("Knowledge QA:", res["text"])
|
| 63 |
+
|
| 64 |
+
# 4) Multimodal role-play
|
| 65 |
+
res = inference.role_play(
|
| 66 |
+
character_name="Ichihime",
|
| 67 |
+
description="",
|
| 68 |
+
opening="",
|
| 69 |
+
user_text="Hi, Ichihime. How are you?",
|
| 70 |
+
)
|
| 71 |
+
print("Role-play:", res["response"])
|
| 72 |
+
```
|
| 73 |
|
| 74 |
## Download
|
| 75 |
|
|
|
|
| 85 |
huggingface-cli download Tangc03/UniCharacter --include "Hermione/*" --local-dir UniCharacter
|
| 86 |
```
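
The same single-folder download can likely be done from Python. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function with `allow_patterns`, which plays the role of the CLI's `--include` flag. The helper names `build_download_kwargs` and `download_character` are hypothetical, and the character list is copied from the README above.

```python
# Character folders listed in the README.
CHARACTERS = [
    "Adrien_Brody", "Bo", "Butin", "Chandler", "Coco", "Furina", "Gao_Qiqiang",
    "Hermione", "Ichihime", "Joey", "Leonardo", "Mam", "Miki_Nikaidou", "Mydieu",
    "Pikachu", "Rin_Tohsaka", "Saber", "Will_In_Vietnam", "Wukong", "YuiYagi",
]


def build_download_kwargs(character, local_dir="UniCharacter"):
    """Build snapshot_download arguments for one character folder (hypothetical helper)."""
    if character not in CHARACTERS:
        raise ValueError(f"Unknown character folder: {character!r}")
    return {
        "repo_id": "Tangc03/UniCharacter",
        "allow_patterns": [f"{character}/*"],  # same filter as --include "<name>/*"
        "local_dir": local_dir,
    }


def download_character(character, local_dir="UniCharacter"):
    """Download one character folder; needs network access and huggingface_hub."""
    from huggingface_hub import snapshot_download  # imported lazily on purpose

    return snapshot_download(**build_download_kwargs(character, local_dir))


# Inspect the arguments without touching the network.
print(build_download_kwargs("Hermione"))
```

Calling `download_character("Hermione")` would then fetch only that folder into `UniCharacter/`. Validating the name first avoids a silent empty download when the folder name is misspelled.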
|
| 87 |
|
| 88 |
## Citation
|
| 89 |
|
| 90 |
If you use UniCharacter, please cite:
|
|
|
|
| 92 |
```bibtex
|
| 93 |
@article{tang2026towards,
|
| 94 |
title={Towards Customized Multimodal Role-Play},
|
| 95 |
+
author={Tang, Chao and Wu, Jianzong and Shi, Qingyu and Tian, Ye and Zhang, Aixi and Jiang, Hao and Zhang, Jiangning and Tong, Yunhai},
|
| 96 |
journal={arXiv preprint arXiv:2605.08129},
|
| 97 |
year={2026}
|
| 98 |
}
|
| 99 |
+
```
|