SVG Generator

Generative Neural Models for Scalable Vector Graphics


Example outputs, by prompt: "a simple lighthouse icon"; "A minimal rocket icon"; "A cute cat face icon"; "A magic potion bottle with bubbles, sparkles, cork, and label"; "A cozy cabin in the woods with a chimney, pine trees, and a crescent moon"; "A steampunk hot air balloon with gears, ropes, basket, and brass details".

This repository contains the published artifacts for a two-stage text-to-SVG pipeline. Stage 1 uses Z-Image with an SVG-domain LoRA to generate a raster illustration. Stage 2 converts the raster to SVG, either with vtracer or with the included flow-matching vectorizer.

Code, training scripts, and end-to-end inference are available in the GitHub repository: https://github.com/JosefKuchar/svg-generator

Quick Start

Clone the project repository and install dependencies:

git clone https://github.com/JosefKuchar/svg-generator.git
cd svg-generator
uv sync

For faster transformer attention on compatible CUDA systems, the project README contains an optional FlashAttention wheel install command. For the default SVG conversion path, install vtracer and make sure it is available on PATH.

Run text-to-SVG inference:

uv run python infer.py "a simple lighthouse icon" \
  --output-svg lighthouse.svg \
  --output-png lighthouse.png \
  --preview-png lighthouse-preview.png

The default prompt prefix is:

SVG illustration with white background.

It can be changed with --prompt-prefix.
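
As a rough illustration of how the prefix and the user prompt might be combined, here is a minimal sketch. The helper name build_prompt is hypothetical and not taken from infer.py; the single-space join is an assumption based on the prompt string used in the Direct LoRA Loading example.

```python
# Hypothetical helper: the real infer.py may combine the strings differently.
DEFAULT_PREFIX = "SVG illustration with white background."


def build_prompt(prompt, prefix=DEFAULT_PREFIX):
    """Join the prefix and the user prompt with a single space.

    An empty prefix returns the user prompt unchanged.
    """
    prefix = prefix.strip()
    prompt = prompt.strip()
    return f"{prefix} {prompt}" if prefix else prompt


full_prompt = build_prompt("a simple lighthouse icon")
```

Passing an empty prefix in this sketch leaves the prompt untouched, which is one plausible way to disable the prefix entirely.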

Inference Options

Stage 1 can use either Z-Image Base or Turbo:

# Default: Tongyi-MAI/Z-Image, 50 steps, guidance 4.0
uv run python infer.py "A minimal rocket icon" --z-image base

# Faster: Tongyi-MAI/Z-Image-Turbo, 9 steps, guidance 0.0
uv run python infer.py "A minimal rocket icon" --z-image turbo
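
The two presets above can be summarized as a small lookup table. This is a sketch only: the names Stage1Preset, PRESETS, and resolve_preset are hypothetical and do not describe how infer.py is actually organized; the model IDs, step counts, and guidance values are the ones listed in the comments above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage1Preset:
    """Sampling settings for one stage-1 variant (hypothetical container)."""
    model_id: str
    num_inference_steps: int
    guidance_scale: float


# Values copied from the documented defaults for --z-image base / turbo.
PRESETS = {
    "base": Stage1Preset("Tongyi-MAI/Z-Image", 50, 4.0),
    "turbo": Stage1Preset("Tongyi-MAI/Z-Image-Turbo", 9, 0.0),
}


def resolve_preset(name):
    """Return the preset for a --z-image value, or raise on an unknown name."""
    try:
        return PRESETS[name]
    except KeyError:
        raise ValueError(f"unknown --z-image value: {name!r}") from None


preset = resolve_preset("turbo")
```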

The default stage-2 vectorizer is vtracer:

uv run python infer.py "A cute cat face icon" \
  --vectorizer vtracer \
  --output-svg cat.svg

The neural flow-matching vectorizer can be selected explicitly:

uv run python infer.py "A cute cat face icon" \
  --vectorizer flow-matching \
  --flow-steps 50 \
  --flow-cfg-scale 1.0 \
  --max-segments 256 \
  --output-svg cat.svg
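
To give a feel for what a step count such as --flow-steps controls, the sketch below integrates a flow ODE with fixed-step Euler updates. Everything here is an assumption for illustration: the solver used by the included vectorizer is not documented in this card, and the toy velocity field stands in for the trained network. The names euler_sample and toy_velocity are hypothetical.

```python
def euler_sample(velocity, x0, steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity(x, t)
        # One Euler update: move along the predicted velocity for dt.
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy stand-in for a learned model: a constant velocity that carries the
# starting point `noise` to `target` along a straight line.
noise = [0.0, 0.0]
target = [1.0, 2.0]


def toy_velocity(x, t):
    return [b - a for a, b in zip(noise, target)]


result = euler_sample(toy_velocity, noise, steps=50)
```

With a learned, non-constant velocity field, more steps generally track the ODE more closely at the cost of more model evaluations.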

Direct LoRA Loading

The Z-Image SVG LoRA can be loaded with the standard Diffusers ZImagePipeline:

import torch
from diffusers import ZImagePipeline

repo_id = "JosefKuchar/svg-generator"
base_model = "Tongyi-MAI/Z-Image"

pipe = ZImagePipeline.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
)
pipe.load_lora_weights(repo_id, weight_name="zimage-svg-lora.safetensors")
pipe.to("cuda")

prompt = (
    "SVG illustration with white background. "
    "A simple icon of a mountain cabin surrounded by pine trees."
)

image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=50,
    guidance_scale=4.0,
).images[0]
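
The snippet above stops at a raster image. One way to finish the pipeline by hand is to save the image and call the vtracer command-line tool on it. The sketch below assumes vtracer is installed and on PATH and uses only its --input and --output flags; the helper names vtracer_command and vectorize are hypothetical, and infer.py may invoke vtracer differently.

```python
import shutil
import subprocess
from pathlib import Path


def vtracer_command(png_path, svg_path, executable="vtracer"):
    """Build the argument list for a vtracer CLI call (no shell involved)."""
    return [executable, "--input", str(png_path), "--output", str(svg_path)]


def vectorize(png_path, svg_path):
    """Run vtracer on an existing PNG; raise if the binary is not on PATH."""
    if shutil.which("vtracer") is None:
        raise FileNotFoundError("vtracer not found on PATH")
    subprocess.run(vtracer_command(png_path, svg_path), check=True)
    return Path(svg_path)
```

For example, after image.save("cabin.png"), calling vectorize("cabin.png", "cabin.svg") would write the SVG next to the PNG.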

Artifacts

zimage-svg-lora.safetensors - LoRA adapter for Z-Image.
flow-matching/config.json - configuration for the neural vectorizer.
flow-matching/model.safetensors - flow-matching vectorizer weights.
assets/*.svg - example outputs shown above.

The flow-matching vectorizer is conditioned on facebook/dinov3-vits16-pretrain-lvd1689m. The DINOv3 encoder is loaded separately from its original Hugging Face repository and is not stored here.

Intended Use

Use these artifacts as an end-to-end text-to-SVG pipeline for icon-like and illustration-style prompts. The recommended path is to run infer.py from the GitHub repository, which applies the Z-Image LoRA, saves the intermediate raster image, and vectorizes it into SVG.

The model is best suited for clean drawings with simple shapes, flat colors, white backgrounds, and compact compositions. It is not intended as a general SVG editor or a replacement for manual vector design work. The LoRA itself only generates raster images; SVG output is produced by the second-stage vectorizer.
