Instructions to use google/diffusiongemma-26B-A4B-it with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use google/diffusiongemma-26B-A4B-it with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="google/diffusiongemma-26B-A4B-it")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("google/diffusiongemma-26B-A4B-it")
model = AutoModelForMultimodalLM.from_pretrained("google/diffusiongemma-26B-A4B-it")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Notebooks
Google Colab
Kaggle
Local Apps Settings

vLLM

How to use google/diffusiongemma-26B-A4B-it with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "google/diffusiongemma-26B-A4B-it"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "google/diffusiongemma-26B-A4B-it",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker

docker model run hf.co/google/diffusiongemma-26B-A4B-it

SGLang

How to use google/diffusiongemma-26B-A4B-it with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "google/diffusiongemma-26B-A4B-it" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "google/diffusiongemma-26B-A4B-it",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "google/diffusiongemma-26B-A4B-it" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "google/diffusiongemma-26B-A4B-it",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use google/diffusiongemma-26B-A4B-it with Docker Model Runner:
```
docker model run hf.co/google/diffusiongemma-26B-A4B-it
```

Runtime error and Module not found error

by harishreddy09876543 - opened about 21 hours ago

Discussion

harishreddy09876543

about 21 hours ago

/usr/local/lib/python3.12/dist-packages/huggingface_hub/utils/_auth.py:112: UserWarning:
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
warnings.warn(
config.json: 100%
3.47k/3.47k [00:00<00:00, 373kB/s]

RuntimeError Traceback (most recent call last)
/usr/local/lib/python3.12/dist-packages/transformers/utils/import_utils.py in getattr(self, name)
2261 try:
-> 2262 module = self._get_module(self._class_to_module[name])
2263 value = getattr(module, name)

27 frames
RuntimeError: operator torchvision::nms does not exist

The above exception was the direct cause of the following exception:

ModuleNotFoundError Traceback (most recent call last)
/usr/local/lib/python3.12/dist-packages/transformers/utils/import_utils.py in getattr(self, name)
2348 ) from e
2349 else:
-> 2350 raise ModuleNotFoundError(
2351 f"Could not import module '{name}'. Are this object's requirements defined correctly?"
2352 ) from e

ModuleNotFoundError: Could not import module 'Gemma4VisionConfig'. Are this object's requirements defined correctly?

NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.

JaeHyuns

about 19 hours ago

%pip uninstall -y torch torchvision torchaudio
%pip install --no-cache-dir torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
%pip install -U transformers accelerate

feng197906

about 13 hours ago

dssfdsfdsfdsffdsfdsf

SuperbEmphasis

about 10 hours ago

Anyone running VLLM, as of yesterday you can use the gemma tagged containers:
https://hub.docker.com/layers/vllm/vllm-openai/gemma-cu129/images/sha256-e24d3d8e367ebc2a6ccfc8cf3cecee858a30a6c66b5b1642d85363cd745e001e

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment