Instructions for using rednote-hilab/dots.mocr with libraries and local apps.

Libraries

How to use rednote-hilab/dots.mocr with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="rednote-hilab/dots.mocr", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
# Run inference and print the generated output
print(pipe(text=messages))

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("rednote-hilab/dots.mocr", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use rednote-hilab/dots.mocr with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "rednote-hilab/dots.mocr"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "rednote-hilab/dots.mocr",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
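
The same request can also be sent from Python. The sketch below uses only the standard library and mirrors the JSON body of the curl call above. The helper names `build_chat_request` and `post_chat_request` are hypothetical and chosen for illustration; the endpoint path, model name, prompt, and image URL are copied from the curl example.

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str, image_url: str) -> dict:
    """Build the same JSON body that the curl example sends."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def post_chat_request(base_url: str, body: dict, timeout: float = 60.0) -> dict:
    """POST the body to <base_url>/v1/chat/completions and decode the JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


body = build_chat_request(
    model="rednote-hilab/dots.mocr",
    prompt="Describe this image in one sentence.",
    image_url="https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)
print(json.dumps(body, indent=2))

# Only works while the vLLM server started above is running:
# reply = post_chat_request("http://localhost:8000", body)
```

Building the body separately from sending it makes it possible to inspect the payload without a running server.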

Use Docker

docker model run hf.co/rednote-hilab/dots.mocr

SGLang

How to use rednote-hilab/dots.mocr with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "rednote-hilab/dots.mocr" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "rednote-hilab/dots.mocr",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
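
Because the server is described as OpenAI-compatible, the reply should follow the chat-completion response shape, with the generated text under `choices[0].message.content`. The sketch below shows one way to pull that text out of a decoded response. The function name `extract_reply_text` is hypothetical, and `sample_response` is a hand-written placeholder in that shape, not real output from the model.

```python
from typing import Any, Dict


def extract_reply_text(response: Dict[str, Any]) -> str:
    """Return the assistant text from an OpenAI-style chat completion response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    content = choices[0].get("message", {}).get("content")
    if content is None:
        raise ValueError("first choice has no message content")
    return content


# Hand-written placeholder in the OpenAI chat-completion shape (not real server output).
sample_response = {
    "object": "chat.completion",
    "model": "rednote-hilab/dots.mocr",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "<generated text goes here>"},
            "finish_reason": "stop",
        }
    ],
}

print(extract_reply_text(sample_response))
```

Raising on a missing `choices` list or missing content surfaces malformed or error responses early instead of passing `None` downstream.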

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "rednote-hilab/dots.mocr" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner

How to use rednote-hilab/dots.mocr with Docker Model Runner:

```
docker model run hf.co/rednote-hilab/dots.mocr
```

[DRAFT] fix: transformers 5.x compat (cache_position + kwargs naming)

by emanuelevivoli - opened 10 days ago

base: refs/heads/main ← from: refs/pr/6


modeling_dots_ocr.py +11 -3


@@ -80,7 +80,7 @@ class DotsOCRForCausalLM(Qwen2ForCausalLM):
         return_dict: Optional[bool] = None,
         use_cache: Optional[bool] = None,
         logits_to_keep: int = 0,
-        **loss_kwargs,
+        **kwargs,
     ) -> Union[Tuple, CausalLMOutputWithPast]:
         return_dict = return_dict if return_dict is not None else self.config.use_return_dict
         assert len(input_ids) >= 1, f"empty input_ids {input_ids.shape=} will cause gradnorm nan"
@@ -99,7 +99,7 @@ class DotsOCRForCausalLM(Qwen2ForCausalLM):
             output_hidden_states=output_hidden_states,
             # return_dict=return_dict,
             logits_to_keep=logits_to_keep,
-            **loss_kwargs,
+            **kwargs,
         )
         return outputs
@@ -125,7 +125,15 @@ class DotsOCRForCausalLM(Qwen2ForCausalLM):
             **kwargs,
         )
-        if cache_position[0] == 0:
+        # Pass pixel_values only on the first generation step (prefill).
+        # Compatible with both transformers 4.x (cache_position available)
+        # and 5.x (cache_position removed, use past_key_values instead).
+        is_prefill = (
+            (cache_position is not None and cache_position[0] == 0)
+            or past_key_values is None
+            or (hasattr(past_key_values, "get_seq_length") and past_key_values.get_seq_length() == 0)
+        )
+        if is_prefill:
             model_inputs["pixel_values"] = pixel_values
         return model_inputs
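
The prefill check in the last hunk can be exercised on its own, without loading the model. The following is a minimal sketch, not the PR's code path: `FakeCache` is a hypothetical stand-in for a cache object that exposes `get_seq_length()`, plain Python lists stand in for the `cache_position` tensor, and `is_prefill_step` is an illustrative name for the boolean expression taken from the patch.

```python
from typing import Optional, Sequence


class FakeCache:
    """Hypothetical stand-in for a KV cache object exposing get_seq_length()."""

    def __init__(self, seq_length: int) -> None:
        self._seq_length = seq_length

    def get_seq_length(self) -> int:
        return self._seq_length


def is_prefill_step(
    cache_position: Optional[Sequence[int]],
    past_key_values: Optional[object],
) -> bool:
    """Mirror the three-way check from the patch, outside the model class."""
    return bool(
        # 4.x-style signal: the first cache position is 0 on the first step.
        (cache_position is not None and cache_position[0] == 0)
        # No cache object at all: nothing has been processed yet.
        or past_key_values is None
        # 5.x-style signal: a cache exists but holds no tokens yet.
        or (
            hasattr(past_key_values, "get_seq_length")
            and past_key_values.get_seq_length() == 0
        )
    )


cases = [
    ("4.x-style prefill", [0, 1, 2], FakeCache(0)),
    ("4.x-style decode", [3], FakeCache(3)),
    ("no cache_position, empty cache", None, FakeCache(0)),
    ("no cache_position, filled cache", None, FakeCache(3)),
    ("no cache_position, no cache", None, None),
]
for label, cache_position, past_key_values in cases:
    print(f"{label}: {is_prefill_step(cache_position, past_key_values)}")
```

In this sketch only the two cases with a non-empty cache evaluate to `False`, which corresponds to the decode steps where the patch skips `pixel_values`.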