Instructions to use sasa2000/Alpamayo-R1-10B-Text-Only with libraries, inference providers, notebooks, and local apps.

Libraries

How to use sasa2000/Alpamayo-R1-10B-Text-Only with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="sasa2000/Alpamayo-R1-10B-Text-Only")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("sasa2000/Alpamayo-R1-10B-Text-Only")
model = AutoModelForCausalLM.from_pretrained("sasa2000/Alpamayo-R1-10B-Text-Only")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use sasa2000/Alpamayo-R1-10B-Text-Only with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "sasa2000/Alpamayo-R1-10B-Text-Only"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "sasa2000/Alpamayo-R1-10B-Text-Only",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
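
The same request can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send` are hypothetical, and it assumes a server started as above is listening on `http://localhost:8000`.

```python
import json
import urllib.request

MODEL_ID = "sasa2000/Alpamayo-R1-10B-Text-Only"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the parsed JSON body (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running, `send(build_chat_request("What is the capital of France?"))` should return the response dictionary. Passing `base_url="http://localhost:30000"` targets a server on port 30000 instead.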

Use Docker Model Runner

docker model run hf.co/sasa2000/Alpamayo-R1-10B-Text-Only

SGLang

How to use sasa2000/Alpamayo-R1-10B-Text-Only with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "sasa2000/Alpamayo-R1-10B-Text-Only" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "sasa2000/Alpamayo-R1-10B-Text-Only",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "sasa2000/Alpamayo-R1-10B-Text-Only" \
        --host 0.0.0.0 \
        --port 30000

Alpamayo-R1-10B Text-Only

This is a text-only extraction of nvidia/Alpamayo-R1-10B, also known as Alpamayo 1.

The original checkpoint is a vision-language-action model with:

- a Qwen3-VL/Cosmos-style VLM backbone,
- a vision tower,
- a diffusion/action expert, and
- trajectory/action projection modules.

This repository keeps only the language backbone from vlm.model.language_model.* plus vlm.lm_head.weight, and saves it as a standalone Hugging Face Qwen3ForCausalLM checkpoint.
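
The selection rule described above can be sketched as a pure function over tensor names. This is an illustration, not the script used to build this repository: the helper names `remap_key` and `split_keys` are hypothetical, and the target names (`model.*` and `lm_head.weight`) are an assumption based on the usual layout of a standalone causal LM checkpoint.

```python
SOURCE_BACKBONE_PREFIX = "vlm.model.language_model."
SOURCE_LM_HEAD = "vlm.lm_head.weight"


def remap_key(name):
    """Return the standalone tensor name, or None if the tensor is dropped."""
    if name == SOURCE_LM_HEAD:
        return "lm_head.weight"
    if name.startswith(SOURCE_BACKBONE_PREFIX):
        # Assumed target layout: the decoder weights live under "model.".
        return "model." + name[len(SOURCE_BACKBONE_PREFIX):]
    return None


def split_keys(names):
    """Partition source tensor names into a kept {old: new} map and a dropped list."""
    kept, dropped = {}, []
    for name in names:
        new_name = remap_key(name)
        if new_name is None:
            dropped.append(name)
        else:
            kept[name] = new_name
    return kept, dropped
```

Applied to the full list of source tensor names, `split_keys` would yield the kept and dropped sets; the tensor names used to exercise it should come from the source checkpoint itself.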

What Changed

- Source model: nvidia/Alpamayo-R1-10B
- Output architecture: Qwen3ForCausalLM
- Output model_type: qwen3
- Kept tensors: 399
- Dropped tensors: 767
- Output weights: 4 safetensors shards
- Removed components include vlm.model.visual.*, expert.*, action_in_proj.*, action_out_proj.*, and action_space.*

The source repository does not include tokenizer files. The tokenizer here is based on Qwen/Qwen3-VL-8B-Instruct and extended with Alpamayo placeholder special tokens until its length matches the model vocabulary size of 155697. For GGUF conversion compatibility, the tokenizer config stores the Alpamayo placeholder tokens in additional_special_tokens, and the BPE vocab.json and merges.txt files are included alongside tokenizer.json.
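
Padding a tokenizer up to a target vocabulary size can be sketched as follows. This is a hedged illustration only: the helper `placeholder_tokens` and the token template `<|alpamayo_placeholder_{}|>` are hypothetical, and the actual placeholder strings stored in this repository may differ.

```python
def placeholder_tokens(base_size, target_size, template="<|alpamayo_placeholder_{}|>"):
    """Return the placeholder token strings needed to grow a vocabulary.

    base_size is the length of the base tokenizer and target_size is the
    model vocabulary size; the template is a made-up naming scheme.
    """
    if target_size < base_size:
        raise ValueError("target_size must not be smaller than base_size")
    return [template.format(i) for i in range(target_size - base_size)]
```

With a loaded base tokenizer, the result could then be registered through `tokenizer.add_special_tokens({"additional_special_tokens": tokens})`, using `len(tokenizer)` as `base_size` and 155697 as `target_size`.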

Validation

Validated locally with:

- torch 2.12.1+cpu
- transformers 5.12.1
- safetensors 0.8.0

Checks performed:

- AutoConfig.from_pretrained(...) loads as Qwen3Config
- AutoTokenizer.from_pretrained(...) loads as Qwen2Tokenizer
- Tokenizer length is 155697
- AutoTokenizer.from_pretrained(...) loads without extra_special_tokens compatibility errors in current Transformers
- AutoModelForCausalLM.from_pretrained(...) loads as Qwen3ForCausalLM
- Forward pass succeeds on a short text prompt
- Output logits shape: (1, 10, 155697)
- No visual, vision, projector, language_model, expert, action_*, or vlm.* tensor names remain in the exported checkpoint
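
Two of these checks are easy to reproduce without loading the weights. The sketch below is an assumption about how such checks could be written, not the validation script used here; the helper names are hypothetical. It flags tensor names that still look like removed components, and it spells out the logits shape expected for a given batch size and sequence length.

```python
REMOVED_SUBSTRINGS = ("visual", "vision", "projector", "language_model", "expert")
REMOVED_PREFIXES = ("action_", "vlm.")
VOCAB_SIZE = 155697


def leftover_names(tensor_names):
    """Return the tensor names that still match a removed component pattern."""
    return [
        name
        for name in tensor_names
        if name.startswith(REMOVED_PREFIXES)
        or any(part in name for part in REMOVED_SUBSTRINGS)
    ]


def expected_logits_shape(batch_size, sequence_length, vocab_size=VOCAB_SIZE):
    """Logits of a causal LM have one vocabulary-sized row per input token."""
    return (batch_size, sequence_length, vocab_size)
```

For a sharded checkpoint, the tensor names can typically be read from the `weight_map` keys of `model.safetensors.index.json`; an empty result from `leftover_names` corresponds to the last check in the list above.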

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "path/to/alpamayo_r1_10b_text_only"

tokenizer = AutoTokenizer.from_pretrained(model_dir, fix_mistral_regex=True)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    torch_dtype="auto",
    device_map="auto",
)

inputs = tokenizer("Explain a safe driving decision at a busy intersection.", return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Limitations

This checkpoint is text-only. It does not include the original vision tower, robotics/action expert, diffusion trajectory decoder, multimodal processors, or trajectory decoding logic.

This is an unofficial derived checkpoint and is not released by NVIDIA.

License

The source model states that its weights are released under a non-commercial license. Use of this derived checkpoint must comply with the original model license and any applicable terms.

Safetensors

Model size: 8B params
Tensor type: BF16

Model tree for sasa2000/Alpamayo-R1-10B-Text-Only

Base model: nvidia/Alpamayo-R1-10B
Finetuned (1): this model