Instructions to use nanonets/Nanonets-OCR-s with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use nanonets/Nanonets-OCR-s with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="nanonets/Nanonets-OCR-s")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("nanonets/Nanonets-OCR-s")
model = AutoModelForImageTextToText.from_pretrained("nanonets/Nanonets-OCR-s")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
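# Render the chat template and tokenize text and image into model-ready tensors
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate a reply, then decode only the newly generated tokens (skip the prompt)
outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))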

Local Apps

vLLM

How to use nanonets/Nanonets-OCR-s with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "nanonets/Nanonets-OCR-s"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nanonets/Nanonets-OCR-s",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
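
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on `localhost:8000`. The helper names `build_chat_payload` and `chat` are hypothetical and not part of vLLM; the payload mirrors the curl body, and the response is read using the OpenAI-style `choices[0].message.content` layout.

```python
import json
import urllib.request


def build_chat_payload(image_url, prompt, model="nanonets/Nanonets-OCR-s"):
    """Build the same JSON body as the curl example (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(payload, base_url="http://localhost:8000", timeout=120):
    """POST the payload to the OpenAI-compatible endpoint and return the reply text.

    base_url is an assumption: it matches the default port used in the curl call.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat(build_chat_payload(url, "Describe this image in one sentence."))` should return the model's reply as a string.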

Use Docker

docker model run hf.co/nanonets/Nanonets-OCR-s

SGLang

How to use nanonets/Nanonets-OCR-s with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "nanonets/Nanonets-OCR-s" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nanonets/Nanonets-OCR-s",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "nanonets/Nanonets-OCR-s" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nanonets/Nanonets-OCR-s",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use nanonets/Nanonets-OCR-s with Docker Model Runner:
```
docker model run hf.co/nanonets/Nanonets-OCR-s
```

Optimizing Input for Faster Rendering with NanoNet

#34

by FastaLaPasta - opened Sep 25, 2025

Discussion

FastaLaPasta

Sep 25, 2025

Hi, I'm wondering if there are ways to optimize the input I send to the model in order to achieve faster rendering times. Are there any recommended preprocessing steps, input formats, or parameters (like resolution, token length, or batch size) that could help improve inference speed?
Thanks :)

Souvik3333

Nanonets org Sep 25, 2025

Hello, I'm assuming you are using vLLM for deployment.

Generally, increasing the resolution increases both processing time and accuracy. However, I have seen accuracy drop once the width goes above 3000. You can even decrease the resolution if your document is not complex or does not contain dense text.
Increasing the batch size will increase the model's overall throughput but will not reduce the processing time of a single request. If you are using a large GPU, you should always try to use as large a batch size as possible.
By token length, I assume you mean generation length? Dense text takes more time because these models are auto-regressive (they generate one token at a time). If you want to convert your document fully to markdown, there is nothing you can do about this: dense documents will take more time than documents with less text.
Using better GPUs will make the computation faster. Alternatively, you can try https://docstrange.nanonets.com. We offer free processing of 10k docs per month.
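
As an illustration of the resolution and batching points above, here is a rough sketch, not an official recipe. It assumes the 3000 figure refers to image width in pixels, and that sending several requests at the same time lets the server batch them. The names `capped_size` and `run_concurrently` are hypothetical, and `send_request` stands for whatever function you already use to call the server.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: the "3000" mentioned in the reply is a width in pixels.
MAX_WIDTH = 3000


def capped_size(width, height, max_width=MAX_WIDTH):
    """Return (width, height) scaled so width <= max_width, keeping the aspect ratio."""
    if width <= max_width:
        return width, height
    scale = max_width / width
    return max_width, max(1, round(height * scale))


def run_concurrently(send_request, pages, max_workers=8):
    """Send all pages in parallel threads; results come back in input order.

    This raises overall throughput but does not make any single request faster.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send_request, pages))
```

The size returned by `capped_size` could then be passed to an image library's resize call (for example Pillow's `Image.resize`) before the page is uploaded. A good value for `max_workers` depends on the GPU and is something to tune rather than a recommendation.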
