Psi-Zero base for LeRobot — Unitree G1 humanoid VLA
This repository repackages the published Psi-Zero baseline (Wei et al. 2026 — arXiv:2603.12263) as a LeRobot-loadable snapshot, with the action head expanded to the state / action / chunk dimensions used by the ActGPT Unitree G1 recording schema.
The weights are bit-identical to the upstream baseline up to a key rename, a zero-padding extension of the action-expert projection layers, and a Xavier-initialised extension of action_proj_in.dec_pos along the chunk axis (no training has been done in this repo).
What this snapshot contains
| File | Contents |
|---|---|
| model.safetensors | merged state dict, ~6 GB; keys re-prefixed: `vlm_model.*` → `model.vlm_model.*` and `<action_header.*>` → `model.action_header.*` |
| config.json | PsiZeroConfig (max_state_dim=72, max_action_dim=91, chunk_size=30, vlm_model_name='Qwen/Qwen3-VL-2B-Instruct') |
| train_config.json | copy of config.json (LeRobot resume path) |
| README.md / LICENSE | this file + Apache-2.0 text |
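The key re-prefixing listed above can be pictured with a minimal sketch. It assumes both upstream state dicts simply gain a `model.` prefix and that the action-header keys already start with `action_header.`; the helper names and toy keys are hypothetical and are not taken from the conversion script.

```python
def add_prefix(state_dict, prefix="model."):
    """Return a copy of state_dict with prefix prepended to every key."""
    return {prefix + key: value for key, value in state_dict.items()}


def merge_checkpoints(vlm_state, action_state, prefix="model."):
    """Merge two state dicts into one, re-prefixing every key."""
    merged = add_prefix(vlm_state, prefix)
    for key, value in add_prefix(action_state, prefix).items():
        if key in merged:
            # Two source files should never produce the same target key.
            raise KeyError(f"duplicate key after re-prefixing: {key}")
        merged[key] = value
    return merged


# Toy stand-ins for the two upstream files (values would be tensors in practice).
vlm_state = {"vlm_model.layers.0.weight": 1.0}
action_state = {"action_header.obs_proj.weight": 2.0}
merged = merge_checkpoints(vlm_state, action_state)
```

Raising on a duplicate key is a design choice of the sketch: a silent overwrite would hide a mismatch between the two source files.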
Lineage
Qwen/Qwen3-VL-2B-Instruct (Alibaba; the base VLM)
↓ fine-tuned on EgoDex 200k + HE 30k via FAST tokenizer
USC-PSI-Lab/psi-model
:: psi0/pre.fast.1by1.2601091803.ckpt.ego200k.he30k/ (Stage-1 VLM, 4.3 GB)
:: psi0/postpre.1by1.pad36.2601131206.ckpt.he30k/ (Stage-2 action expert, 1.9 GB)
↓ extend action header 36/36/16 → 72/91/30
↓ re-key for LeRobot vlm_model.* → model.vlm_model.*
ActGPT/psi0_base (this repo)
What is different from the upstream Psi-Zero release
The upstream USC-PSI-Lab/psi-model ships the VLM and the action head as
two separate .safetensors files at the dimensions used by the paper's
G1 post-training run: odim=36, action_dim=36, action_chunk_size=16.
For fine-tuning on the ActGPT Unitree G1 recordings we need to load these
weights into a model with larger dimensions:
| Dimension | Upstream baseline | This snapshot | What changed |
|---|---|---|---|
| odim (state input) | 36 | 72 | obs_proj._obs_proc.1.weight left-padded zero columns (36 → 72) |
| action_dim | 36 | 91 | action_proj_in.ac_proj.0.{w,b} zero-padded both axes (36 → 91); action_proj_in.ac_proj.2.weight and action_proj_out.linear.{w,b} zero-padded the action axis |
| action_chunk_size | 16 | 30 | action_proj_in.dec_pos xavier-extended on the chunk axis (16 → 30) |
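To see why zero-padding a linear layer leaves the original dimensions untouched, here is a small pure-Python sketch on toy sizes. It assumes the original entries keep their leading positions and the zeros are appended after them; which side the real conversion pads on is not shown here. The helper names are hypothetical.

```python
def pad_columns(weight, new_in):
    """Append zero columns so a (out, in) weight accepts new_in inputs."""
    return [row + [0.0] * (new_in - len(row)) for row in weight]


def pad_rows(weight, bias, new_out):
    """Append zero rows (and zero biases) so the layer emits new_out outputs."""
    in_dim = len(weight[0])
    extra = new_out - len(weight)
    return weight + [[0.0] * in_dim for _ in range(extra)], bias + [0.0] * extra


def linear(weight, bias, x):
    """Plain y = W x + b on Python lists."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


# Toy sizes standing in for 36 -> 72 (inputs) and 36 -> 91 (outputs).
weight = [[1.0, 2.0], [3.0, 4.0]]
bias = [0.5, -0.5]
x_small = [1.0, -1.0]
y_ref = linear(weight, bias, x_small)

# Wider input: the extra input entries hit zero columns and are ignored.
wide_in = pad_columns(weight, new_in=4)
y_in = linear(wide_in, bias, x_small + [9.0, 9.0])

# Wider output: the original outputs are unchanged, the new one is exactly 0.
wide_out, wide_bias = pad_rows(weight, bias, new_out=3)
y_out = linear(wide_out, wide_bias, x_small)
```

In this toy case `y_in` equals `y_ref`, and `y_out` equals `y_ref` followed by a zero, which is the sense in which the padding is parity-preserving.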
The extension is parity-preserving on the first 36 action / state
dimensions when the chunk size is unchanged (numerically verified in
actgpt-library/benchmark/psi0/RESULTS.md). Extending the chunk size
changes the action expert's attention context, so the output is no
longer identical on the overlapping positions. This is expected when
adapting to a different chunk length and is fine for fine-tuning, which
will train the newly added connections (zero-padded projections and
Xavier-initialised dec_pos entries).
No further training has been done on these weights — they are the upstream baseline, mechanically extended, ready for fine-tuning on a new task.
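The chunk-axis extension of action_proj_in.dec_pos can be sketched as follows. This is an illustration under assumptions, not the conversion code: it treats dec_pos as a 2-D table with the chunk axis first, keeps the existing rows, and draws the new rows from a Xavier (Glorot) uniform distribution whose bound is computed from the extended shape. The hidden width of 8 and the helper name are hypothetical.

```python
import math
import random


def xavier_extend_rows(table, new_rows, seed=0):
    """Keep existing rows and append Xavier-uniform rows up to new_rows."""
    old_rows, width = len(table), len(table[0])
    # Glorot uniform bound for a 2-D (rows, width) tensor of the new shape.
    bound = math.sqrt(6.0 / (new_rows + width))
    rng = random.Random(seed)
    fresh = [[rng.uniform(-bound, bound) for _ in range(width)]
             for _ in range(new_rows - old_rows)]
    # Copy the old rows so the input table is never mutated.
    return [row[:] for row in table] + fresh


# Toy table: 16 positions, hidden width 8 (the real width is not stated here).
old_table = [[float(i)] * 8 for i in range(16)]
new_table = xavier_extend_rows(old_table, new_rows=30)
```

Unlike the zero-padded projections, the appended rows are non-zero, which is consistent with the outputs changing once the chunk length grows.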
How to use (LeRobot fine-tuning)
from actgpt.policies.psi0 import PsiZeroConfig
import actgpt.policies  # registers psi0 with LeRobot's policy factory

policy_config = PsiZeroConfig(
    pretrained_path="ActGPT/psi0_base",
    max_state_dim=72,
    max_action_dim=91,
    chunk_size=30,
    n_action_steps=30,
    freeze_vlm=True,
    gradient_checkpointing=True,
)
Then drive lerobot-train (or the project's training/lerobot/scripts/finetune.py)
with this config as usual.
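Before launching a run, it can help to confirm that the dimensions passed to PsiZeroConfig match the snapshot. The sketch below is a hypothetical helper, not part of actgpt or LeRobot. It assumes config.json stores the fields under the same names as the PsiZeroConfig arguments, and that n_action_steps should not exceed chunk_size, as is common for chunked policies.

```python
EXPECTED = {"max_state_dim": 72, "max_action_dim": 91, "chunk_size": 30}


def check_dims(config, n_action_steps):
    """Return a list of human-readable problems; empty means the dims line up."""
    problems = []
    for key, want in EXPECTED.items():
        got = config.get(key)
        if got != want:
            problems.append(f"{key}: expected {want}, got {got}")
    # Assumed constraint: a policy cannot execute more steps than it predicts.
    if n_action_steps > config.get("chunk_size", 0):
        problems.append("n_action_steps exceeds chunk_size")
    return problems


# In practice `config` would come from json.load() on the snapshot's config.json.
snapshot_config = {"max_state_dim": 72, "max_action_dim": 91, "chunk_size": 30}
ok = check_dims(snapshot_config, n_action_steps=30)
bad = check_dims({"max_state_dim": 36, "max_action_dim": 91, "chunk_size": 30}, 30)
```

An empty list means the configuration agrees with the snapshot; any entries name the mismatching field.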
License
Apache-2.0. Same licence as both upstream sources:
- USC-PSI-Lab/psi-model (Apache-2.0)
- Qwen/Qwen3-VL-2B-Instruct (Apache-2.0)
If you use this snapshot, please cite the upstream Psi-Zero paper:
@misc{Wei2026psi0,
title={$\Psi_0$: An Open Foundation Model Towards Universal Humanoid Loco-Manipulation},
author={Songlin Wei and Hongyi Jing and Boqian Li and Zhenyu Zhao and Jiageng Mao and Zhenhao Ni and Sicheng He and Jie Liu and Xiawei Liu and Kaidi Kang and Sheng Zang and Weiduo Yuan and Marco Pavone and Di Huang and Yue Wang},
year={2026},
eprint={2603.12263},
archivePrefix={arXiv},
primaryClass={cs.RO},
}
And, if the EgoDex prior is relevant to your downstream analysis:
@inproceedings{Hoque2026egodex,
title={EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video},
author={Ryan Hoque and Peide Huang and David J. Yoon and Mouli Sivapurapu and Jian Zhang},
booktitle={ICLR 2026},
year={2026},
}