RuntimeError: weight lm_head.weight does not exist error when deploying to HF inference endpoint.

#29

by cheeseburgerhere - opened Jul 21, 2025

Discussion

cheeseburgerhere • Jul 21, 2025 (edited Jul 21, 2025)

When I try to deploy this model on an L4 GPU (AWS) through Hugging Face Inference Endpoints, it fails every time with a missing lm_head.weight error.
Container: Text Generation Inference
Task: Image-Text-to-Text
Error Message:
Error when initializing model

Traceback (most recent call last):
  File "/usr/src/.venv/bin/text-generation-server", line 10, in <module>
    sys.exit(app())
  File "/usr/src/.venv/lib/python3.11/site-packages/typer/main.py", line 323, in __call__
    return get_command(self)(*args, **kwargs)
  File "/usr/src/.venv/lib/python3.11/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
  File "/usr/src/.venv/lib/python3.11/site-packages/typer/core.py", line 740, in main
    return _main(
  File "/usr/src/.venv/lib/python3.11/site-packages/typer/core.py", line 195, in _main
    rv = self.invoke(ctx)
  File "/usr/src/.venv/lib/python3.11/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/src/.venv/lib/python3.11/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/src/.venv/lib/python3.11/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "/usr/src/.venv/lib/python3.11/site-packages/typer/main.py", line 698, in wrapper
    return callback(**use_params)
  File "/usr/src/server/text_generation_server/cli.py", line 119, in serve
    server.serve(
  File "/usr/src/server/text_generation_server/server.py", line 313, in serve
    asyncio.run(
  File "/root/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
  File "/root/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "/root/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/asyncio/base_events.py", line 641, in run_until_complete
    self.run_forever()
  File "/root/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/asyncio/base_events.py", line 608, in run_forever
    self._run_once()
  File "/root/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
    handle._run()
  File "/root/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/asyncio/events.py", line 84, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/src/server/text_generation_server/server.py", line 266, in serve_inner
    model = get_model_with_lora_adapters(
  File "/usr/src/server/text_generation_server/models/__init__.py", line 1816, in get_model_with_lora_adapters
    model = get_model(
  File "/usr/src/server/text_generation_server/models/__init__.py", line 1545, in get_model
    return VlmCausalLM(
  File "/usr/src/server/text_generation_server/models/vlm_causal_lm.py", line 720, in __init__
    super().__init__(
  File "/usr/src/server/text_generation_server/models/flash_causal_lm.py", line 1269, in __init__
    model = model_class(prefix, config, weights)
  File "/usr/src/server/text_generation_server/models/custom_modeling/qwen2_5_vl.py", line 830, in __init__
    self.lm_head = SpeculativeHead.load(
  File "/usr/src/server/text_generation_server/layers/speculative.py", line 40, in load
    lm_head = TensorParallelHead.load(config, prefix, weights)
  File "/usr/src/server/text_generation_server/layers/tensor_parallel.py", line 66, in load
    weight = weights.get_tensor(f"{prefix}.weight")
  File "/usr/src/server/text_generation_server/utils/weights.py", line 213, in get_tensor
    filename, tensor_name = self.get_filename(tensor_name)
  File "/usr/src/server/text_generation_server/utils/weights.py", line 192, in get_filename
    raise RuntimeError(f"weight {tensor_name} does not exist")
RuntimeError: weight lm_head.weight does not exist
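
For anyone trying to narrow this down: the last frames show the loader asking for a tensor named lm_head.weight and raising because no checkpoint file lists it. A rough way to see which tensor names a checkpoint does contain is sketched below. It assumes a sharded safetensors checkpoint with a `model.safetensors.index.json` file holding a `weight_map`; the helper names (`load_weight_map`, `find_tensor`, `head_candidates`) are hypothetical, `find_tensor` only imitates the failing lookup and is not TGI's actual code, and `example_map` is made-up data, not the contents of this repository. One possible, unconfirmed explanation is that the checkpoint ties its output head to the token embeddings and so stores no separate lm_head.weight.

```python
import json


def load_weight_map(index_path):
    """Return the tensor-name -> shard-file mapping from a safetensors index file."""
    with open(index_path, encoding="utf-8") as f:
        return json.load(f)["weight_map"]


def find_tensor(weight_map, tensor_name):
    """Imitate the failing lookup: return the shard holding tensor_name or raise."""
    if tensor_name not in weight_map:
        raise RuntimeError(f"weight {tensor_name} does not exist")
    return weight_map[tensor_name]


def head_candidates(weight_map):
    """List tensor names that look like an output head or a token embedding."""
    return sorted(n for n in weight_map if "lm_head" in n or "embed_tokens" in n)


# Illustrative, made-up index contents (NOT read from the real repository).
example_map = {
    "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
    "model.norm.weight": "model-00002-of-00002.safetensors",
}

# Show which head/embedding tensors exist, then reproduce the error message.
print(head_candidates(example_map))
try:
    find_tensor(example_map, "lm_head.weight")
except RuntimeError as err:
    print(err)
```

To check a real checkpoint, pass the path of a downloaded index file to `load_weight_map` and feed the result to `head_candidates`. A single-file checkpoint has no index file, so this sketch would not apply to it as written.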
