title: SkinTokens
emoji: ๐จ
colorFrom: purple
colorTo: purple
sdk: gradio
sdk_version: 6.14.0
python_version: '3.11'
app_file: demo.py
pinned: false
models:
- VAST-AI/SkinTokens
SkinTokens: A Learned Compact Representation for Unified Autoregressive Rigging
SkinTokens is a learned, compact, and discrete representation for skinning weights. Built on this representation, TokenRig is a unified autoregressive framework that models the entire rig, i.e., skeleton and skinning weights, as a single token sequence. Given an input 3D mesh, it generates a complete skeleton hierarchy and skin weights that can be directly imported into standard 3D pipelines for character animation and simulation.
SkinTokens is the successor to UniRig (SIGGRAPH '25). While UniRig uses separate stages for skeleton prediction and skinning, SkinTokens unifies both into a single autoregressive sequence via learned discrete skin tokens, yielding a 98% to 133% improvement in skinning accuracy and a 17% to 22% improvement in bone prediction over state-of-the-art baselines.
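To make concrete what a skeleton plus skinning weights buys a downstream pipeline, here is a minimal NumPy sketch of linear blend skinning, the standard way such weights are commonly consumed when deforming a mesh. It is a general illustration and not part of TokenRig; the function name and array layout are assumptions made for this example.

```python
import numpy as np


def linear_blend_skinning(vertices, weights, bone_transforms):
    """Deform vertices by blending per-bone rigid transforms.

    vertices:        (V, 3) rest-pose positions
    weights:         (V, B) skinning weights, each row summing to 1
    bone_transforms: (B, 4, 4) homogeneous transform per bone
    """
    ones = np.ones((len(vertices), 1))
    homogeneous = np.concatenate([vertices, ones], axis=1)  # (V, 4)
    # Position of every vertex as moved by every bone: (B, V, 4)
    per_bone = np.einsum("bij,vj->bvi", bone_transforms, homogeneous)
    # Weighted average over bones: (V, 4)
    blended = np.einsum("vb,bvi->vi", weights, per_bone)
    return blended[:, :3]


if __name__ == "__main__":
    rest = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    w = np.array([[0.5, 0.5], [1.0, 0.0]])
    identity = np.eye(4)
    lifted = np.eye(4)
    lifted[1, 3] = 2.0  # second bone translates by +2 along y
    print(linear_blend_skinning(rest, w, np.stack([identity, lifted])))
```

A vertex influenced equally by both bones moves halfway, which is why sparse, locally coherent weights matter for clean deformation.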
Overview
TokenRig takes a single 3D mesh as input and autoregressively produces a fully rigged asset (a coherent skeleton hierarchy plus dense per-vertex skinning weights) in a single unified sequence. The method has three stages:
- Learn SkinTokens: An FSQ-CVAE compresses sparse skinning weights into a compact discrete vocabulary.
- Unified Autoregressive Modeling: A Qwen3-0.6B-based Transformer generates the full rig (skeleton followed by SkinTokens) as one interleaved sequence.
- RL Refinement via GRPO: Tailored geometric and semantic rewards (volumetric joint coverage, bone-mesh containment, skinning sparsity, deformation smoothness) refine the model for out-of-distribution assets.
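The discretization idea behind the first stage can be illustrated with a minimal NumPy sketch of Finite Scalar Quantization in general. This is not the SkinTokens implementation: the level counts in `LEVELS` and all function names are hypothetical, and the levels are chosen odd so that rounding stays symmetric around zero. The actual FSQ-CVAE configuration is not stated here.

```python
import numpy as np

# Hypothetical per-dimension level counts; vocabulary size is their product (125).
LEVELS = np.array([5, 5, 5])


def fsq_quantize(z, levels=LEVELS):
    """Bound each latent dimension with tanh, then round to integer levels."""
    half = (levels - 1) / 2
    bounded = np.tanh(z) * half  # each value lies in [-half, half]
    return np.round(bounded).astype(np.int64)


def codes_to_index(codes, levels=LEVELS):
    """Map a vector of per-dimension codes to a single token id (mixed radix)."""
    half = (levels - 1) // 2
    digits = codes + half  # shift to 0 .. L-1
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    return (digits * basis).sum(axis=-1)


def index_to_codes(index, levels=LEVELS):
    """Invert codes_to_index."""
    half = (levels - 1) // 2
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    digits = (np.asarray(index)[..., None] // basis) % levels
    return digits - half


if __name__ == "__main__":
    latent = np.array([[0.0, 10.0, -10.0]])
    codes = fsq_quantize(latent)
    token = codes_to_index(codes)
    print(codes, token, index_to_codes(token))
```

Because the codebook is implicit in the rounding grid, there are no codebook vectors to learn; each token id simply names one grid cell of the bounded latent space.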
Qualitative comparison of skinning prediction. TokenRig produces clean, locally coherent influence maps that closely match the ground truth, while baselines suffer from "bleeding" artifacts across disconnected mesh parts.
See the project page for the full teaser video and additional qualitative comparisons (skeleton generation and impact of GRPO).
Installation
Prerequisites
- Hardware: An NVIDIA GPU with at least 14 GB of memory is required for inference.
- Software:
- Python >= 3.11
- CUDA Toolkit >= 12.1
- uv is recommended for managing dependencies.
Installation Steps
Clone the repo:
git clone https://github.com/VAST-AI-Research/SkinTokens.git
cd SkinTokens
Create a virtual environment and install PyTorch:
uv venv --python 3.11
source .venv/bin/activate
uv pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu128
Adjust the CUDA version in the PyTorch index URL to match your driver. See PyTorch Get Started for other CUDA versions.
Install dependencies:
uv pip install -r requirements.txt
Install flash-attn:
uv pip install flash-attn --no-build-isolation
Download pretrained models:
python download.py --model
This downloads the TokenRig and FSQ-CVAE checkpoints to experiments/, and the Qwen3-0.6B config to models/.
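Before running inference, it can help to confirm that the environment meets the prerequisites listed above. The following is a minimal sketch, not a script shipped with the repository: the function name `check_environment` and the exact messages are hypothetical, and the 14 GB threshold is compared against the total memory that PyTorch reports for device 0.

```python
import sys

MIN_PYTHON = (3, 11)
MIN_GPU_GB = 14  # from the prerequisites; total device memory, not free memory


def check_environment():
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        found = ".".join(map(str, sys.version_info[:3]))
        problems.append(f"Python >= 3.11 required, found {found}")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed in this environment")
        return problems
    if not torch.cuda.is_available():
        problems.append("no CUDA device is visible to PyTorch")
        return problems
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    if total_gb < MIN_GPU_GB:
        problems.append(f"GPU 0 has {total_gb:.1f} GB, at least {MIN_GPU_GB} GB needed")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment looks OK")
```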
Pretrained Models
We provide the following pretrained models on Hugging Face:
| Model | Description | Download |
|---|---|---|
| articulation_xl_quantization_256_token_4 | TokenRig autoregressive rigging model, trained on ArticulationXL 2.0 + VRoid Hub + ModelsResource and refined with GRPO (recommended) | Download |
| skin_vae_2_10_32768 | FSQ-CVAE (SkinTokens): skin-weight tokenizer used to encode and decode skinning weights | Download |
Usage
Hugging Face Space Demo
The easiest way to try TokenRig without any local setup is the hosted Hugging Face Space: upload a mesh and get a rigged result in the browser.
Gradio Demo (local)
python demo.py
Then open http://127.0.0.1:1024 in your browser.
Command Line
# Rig a single model
python demo.py --input examples/giraffe.glb --output results/giraffe.glb
# Rig with original texture and scale preserved
python demo.py --input examples/giraffe.glb --output results/giraffe.glb --use_transfer
# Skin a model using its existing skeleton
python demo.py --input examples/giraffe_skeleton.glb --output results/giraffe.glb --use_skeleton --use_transfer
# Batch process a directory
python demo.py --input examples/ --output results/ --use_transfer
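When rigging is driven from another Python script, the commands above can be assembled programmatically. This is a minimal sketch that only shells out to `demo.py` with the flags shown above; the helper names `build_rig_command` and `rig` are hypothetical, and the script is assumed to be run from the repository root.

```python
import subprocess
import sys


def build_rig_command(input_path, output_path, use_transfer=False, use_skeleton=False):
    """Build the argument list for one demo.py invocation."""
    cmd = [
        sys.executable, "demo.py",
        "--input", str(input_path),
        "--output", str(output_path),
    ]
    if use_skeleton:
        cmd.append("--use_skeleton")  # skin an existing skeleton
    if use_transfer:
        cmd.append("--use_transfer")  # keep original texture and scale
    return cmd


def rig(input_path, output_path, **flags):
    """Run demo.py and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_rig_command(input_path, output_path, **flags), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this sketch has no side effects.
    print(build_rig_command("examples/giraffe.glb", "results/giraffe.glb", use_transfer=True))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.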
Generation Parameters
| Parameter | Default | Description |
|---|---|---|
| --top_k | 5 | Top-k sampling |
| --top_p | 0.95 | Top-p (nucleus) sampling |
| --temperature | 1.0 | Sampling temperature |
| --repetition_penalty | 2.0 | Repetition penalty |
| --num_beams | 10 | Number of beams for beam search |
| --use_skeleton | False | Use existing skeleton (generate skin only) |
| --use_transfer | False | Transfer original texture and scale |
| --use_postprocess | False | Apply voxel-based skin postprocessing |
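For readers unfamiliar with the sampling knobs, the sketch below shows how temperature, top-k, and top-p are commonly applied to a vector of next-token logits. It is a generic NumPy illustration, not the code path used by `demo.py`; the function name `filter_logits` is hypothetical, and repetition penalty and beam search are left out for brevity.

```python
import numpy as np


def filter_logits(logits, top_k=5, top_p=0.95, temperature=1.0):
    """Return next-token probabilities after temperature, top-k and top-p."""
    scaled = np.asarray(logits, dtype=np.float64) / temperature

    # Top-k: mask everything below the k-th largest logit (ties may keep more).
    if top_k is not None and top_k < scaled.size:
        kth_largest = np.sort(scaled)[-top_k]
        scaled = np.where(scaled < kth_largest, -np.inf, scaled)

    # Softmax over the surviving logits.
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()

    # Top-p: keep the smallest high-probability set whose mass reaches top_p.
    order = np.argsort(-probs)
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1
    keep = order[:cutoff]

    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    probs = filter_logits([2.0, 1.0, 0.0, -1.0], top_k=2)
    print(probs, rng.choice(len(probs), p=probs))
```

Lower `top_k` or `top_p` values restrict sampling to fewer candidates, trading diversity for more conservative output.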
Troubleshooting
- Server fails to start: Make sure the http_proxy/https_proxy environment variables are unset or correctly configured.
- Blender export issues: Remove the glTF_not_exported node when importing results into Blender.
Acknowledgements
- UniRig: the predecessor to this work.
- Qwen3: the LLM architecture used by the TokenRig autoregressive backbone.
- Michelangelo: 3D shape encoder.
- 3DShape2VecSet: shape-representation backbone used by the FSQ-CVAE.
- FSQ: Finite Scalar Quantization, the discretization scheme behind SkinTokens.
- GRPO (DeepSeekMath): the policy-optimization method used for RL refinement.
- Tripo: the 3D generative studio from Tripo, a broader context for this line of work.
We sincerely appreciate the contributions of these excellent projects and their authors. We believe open source helps accelerate research, lower barriers to innovation, and make progress more accessible to the broader community.
License
This project is licensed under the MIT License.
Citation
If you find this work helpful, please consider citing our paper:
@article{zhang2026skintokens,
title = {SkinTokens: A Learned Compact Representation for Unified Autoregressive Rigging},
author = {Zhang, Jia-Peng and Pu, Cheng-Feng and Guo, Meng-Hao and Cao, Yan-Pei and Hu, Shi-Min},
journal = {arXiv preprint arXiv:2602.04805},
year = {2026}
}