RuntimeError: Failed to import transformers.generation.utils because of the following error: register_pytree_node() got an unexpected keyword argument 'flatten_with_keys_fn'

by sreesdas - opened Mar 21, 2025

Discussion

sreesdas

Mar 21, 2025

Traceback (most recent call last):
  File "/Users/sree/miniconda3/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 1968, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sree/miniconda3/lib/python3.12/importlib/__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 995, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "/Users/sree/miniconda3/lib/python3.12/site-packages/transformers/generation/utils.py", line 30, in <module>
    from transformers.generation.candidate_generator import AssistantVocabTranslatorCache
  File "/Users/sree/miniconda3/lib/python3.12/site-packages/transformers/generation/candidate_generator.py", line 29, in <module>
    from ..cache_utils import DynamicCache
  File "/Users/sree/miniconda3/lib/python3.12/site-packages/transformers/cache_utils.py", line 589, in <module>
    torch.utils._pytree.register_pytree_node(
TypeError: register_pytree_node() got an unexpected keyword argument 'flatten_with_keys_fn'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/sree/miniconda3/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 1968, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sree/miniconda3/lib/python3.12/importlib/__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 995, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "/Users/sree/miniconda3/lib/python3.12/site-packages/transformers/models/auto/processing_auto.py", line 32, in <module>
    from .auto_factory import _LazyAutoMapping
  File "/Users/sree/miniconda3/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 40, in <module>
    from ...generation import GenerationMixin
  File "<frozen importlib._bootstrap>", line 1412, in _handle_fromlist
  File "/Users/sree/miniconda3/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 1956, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/sree/miniconda3/lib/python3.12/site-packages/transformers/utils/import_utils.py", line 1970, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
register_pytree_node() got an unexpected keyword argument 'flatten_with_keys_fn'

The above exception was the direct cause of the following exception:
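
The TypeError at the bottom of the first traceback means the installed torch exposes a `register_pytree_node` that does not take the `flatten_with_keys_fn` keyword which transformers' `cache_utils.py` passes. The most likely cause is a torch build that is older than this transformers release expects, though the post does not confirm it. Below is a minimal diagnostic sketch for checking that in the same environment. The helper names `accepts_keyword` and `check_torch_pytree` are hypothetical and not part of either library, and the check only inspects the function signature.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with a keyword argument `name`."""
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        # Some callables expose no introspectable signature.
        return False
    param = params.get(name)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if param is not None and param.kind in keyword_kinds:
        return True
    # Otherwise only a **kwargs catch-all would accept the name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def check_torch_pytree():
    """Report whether the installed torch accepts the keyword from the traceback."""
    try:
        import torch
        from torch.utils import _pytree  # private module, as used in the traceback
    except ImportError:
        return "torch is not importable in this environment"
    register = getattr(_pytree, "register_pytree_node", None)
    if register is None:
        return f"torch {torch.__version__}: no register_pytree_node found"
    ok = accepts_keyword(register, "flatten_with_keys_fn")
    verdict = "accepts" if ok else "rejects"
    return f"torch {torch.__version__}: register_pytree_node {verdict} flatten_with_keys_fn"


if __name__ == "__main__":
    print(check_torch_pytree())
```

If the report says "rejects", the torch and transformers versions in that environment are mismatched. The usual remedy is to upgrade torch or install a transformers release that matches the installed torch.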

sreesdas changed discussion title from "RuntimeError: Failed to import transformers.generation.utils because of the following error" to "RuntimeError: Failed to import transformers.generation.utils because of the following error: register_pytree_node() got an unexpected keyword argument 'flatten_with_keys_fn'" on Mar 21, 2025

sreesdas changed discussion status to closed Mar 21, 2025
