---
license: apache-2.0
tags:
- pi0.5
- openpi
- gguf
- quantized
- vla
base_model: zestcode5/ur3-5task-v1
---

# Pi 0.5 UR3 5-task - GGUF (Q8_0 LLM + Q4_K vision + Q5_K embed)

Quantized GGUF export of `zestcode5/ur3-5task-v1` (step 6000) for inference with the [OmniModel.cpp](https://github.com/) Pi 0.5 C++ runtime.

## Quantization

| Component        | Quant   | Bits per weight (bpw) |
|------------------|---------|-----------------------|
| Vision (SigLIP)  | Q4_K    | 4.55                  |
| Embedding        | Q5_K    | 5.50                  |
| PaliGemma LLM    | Q8_0    | 8.50                  |
| Action expert    | F16     | 16.0                  |
| **Total file**   | mixed   | **8.47**              |

File size: ~3.6 GB. Average bpw over vision + LLM: 7.38.

## Files

- `pi05.gguf`: unified GGUF (vision + projector + LLM + action expert + embedding + norm stats).
- `tokenizer.model`: SentencePiece tokenizer (PaliGemma).
- `norm_stats.json`: action/state mean/std/q01/q99 from the UR3 5-task dataset.

## Tasks

The base model was fine-tuned on `zestcode5/ur3-merged-5tasks-v1`. Calibration excluded the task "open the pot by removing its lid".

Supported task prompts:

- `pick up the pink cylinder and place it in the orange box`
- `pick up the white glass and put on a brown coaster`
- `Remove cup from nested cups`
- `Single-finger push to blue marker`

## Inference

CLI:

```
./bin/pi05 -m /path/to/this/dir -i frame.png -p "" -d CUDA -s 10
```

WebSocket policy server (with the modified `serve_policy.py`):

```
uv run scripts/serve_policy.py policy:gguf \
  --policy.dir=/path/to/this/dir \
  --policy.device=CUDA --policy.steps=10 --policy.action-dim=7 \
  --port=8000
```

The robot client uses OpenPI's `WebsocketClientPolicy` and sends an observation dict with the keys `observation.images.fixed`, `observation.images.cam_wrist`, `observation.state` (11-dim), and `prompt`.
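
The observation dict described above could be assembled roughly as follows. This is a minimal sketch, not the reference client: `build_observation` and `STATE_DIM` are hypothetical names introduced here, and the image format the server expects (for example, uint8 height x width x channel arrays) is an assumption that should be checked against the modified `serve_policy.py`.

```python
from typing import Any, Dict, Sequence

# State dimension stated in this card (11-dim observation.state).
STATE_DIM = 11


def build_observation(
    fixed_image: Any,
    wrist_image: Any,
    state: Sequence[float],
    prompt: str,
) -> Dict[str, Any]:
    """Hypothetical helper: pack one observation using the four keys listed above.

    Images and state are passed through unchanged; the exact array type and
    layout the server expects is an assumption left to the caller.
    """
    if len(state) != STATE_DIM:
        raise ValueError(f"expected {STATE_DIM}-dim state, got {len(state)}")
    return {
        "observation.images.fixed": fixed_image,
        "observation.images.cam_wrist": wrist_image,
        "observation.state": state,
        "prompt": prompt,
    }


# Sending the dict (assumes the openpi-client package is installed and the
# policy server from the command above is listening on localhost:8000):
#
#   from openpi_client import websocket_client_policy
#   client = websocket_client_policy.WebsocketClientPolicy(host="localhost", port=8000)
#   obs = build_observation(fixed_img, wrist_img, state,
#                           "pick up the white glass and put on a brown coaster")
#   result = client.infer(obs)  # the structure of the returned dict is server-defined
```

Validating the state length on the client side catches a mismatched robot interface before a request reaches the server, where the failure would be harder to diagnose.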