Supertonic 3
How to use mlx-community/supertonic-3 with MLX:
# Download the model from the Hub
pip install huggingface_hub[hf_xet]
huggingface-cli download --local-dir supertonic-3 mlx-community/supertonic-3
Part of the Supertonic 3 MLX collection.
Apple MLX graph-runtime conversion of Supertone/supertonic-3, a compact multilingual TTS model distributed by upstream as ONNX assets.
| Property | Value |
|---|---|
| Format | JSON graph topology + NPZ initializers |
| Runtime | ailuntx/supertonic-mlx |
| Official code | supertone-inc/supertonic |
| Sample rate | 44.1 kHz |
| HF Space | mlx-community/supertonic-3 |
| Hardware | Runs on HF Linux CPU fallback; Apple Silicon recommended locally |
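
As a rough illustration of the storage format named in the table, the sketch below writes and reads back a toy graph stored as JSON topology plus NPZ initializers. The file names (`toy_graph.json`, `toy_weights.npz`) and the JSON schema (`nodes`, `op`, `inputs`, `outputs`) are assumptions made for this example only; the real layout is defined by `mlx_manifest.json` and the ailuntx/supertonic-mlx runtime.

```python
import json
import tempfile
from pathlib import Path

import numpy as np


def save_toy_graph(directory: Path) -> None:
    """Write a hypothetical graph: JSON for topology, NPZ for initializers."""
    topology = {
        "inputs": ["x"],
        "outputs": ["y"],
        "nodes": [
            {"op": "MatMul", "inputs": ["x", "w"], "outputs": ["h"]},
            {"op": "Add", "inputs": ["h", "b"], "outputs": ["y"]},
        ],
    }
    (directory / "toy_graph.json").write_text(json.dumps(topology, indent=2))
    # Initializers (constant tensors) are stored by name in a single NPZ archive.
    np.savez(
        directory / "toy_weights.npz",
        w=np.eye(2, dtype=np.float32),
        b=np.ones(2, dtype=np.float32),
    )


def load_toy_graph(directory: Path):
    """Read the topology and the initializers back into plain Python objects."""
    topology = json.loads((directory / "toy_graph.json").read_text())
    with np.load(directory / "toy_weights.npz") as archive:
        initializers = {name: archive[name] for name in archive.files}
    return topology, initializers


with tempfile.TemporaryDirectory() as tmp:
    save_toy_graph(Path(tmp))
    topology, initializers = load_toy_graph(Path(tmp))

print([node["op"] for node in topology["nodes"]])
print(sorted(initializers))
```

Keeping topology and tensors in separate files means the graph structure stays human-readable while the weights stay in a compact binary container.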
hf download mlx-community/supertonic-3 --local-dir ./models/supertonic-3
git clone https://github.com/ailuntx/supertonic-mlx.git
cd supertonic-mlx
python -m venv .venv
.venv/bin/pip install mlx soundfile numpy
.venv/bin/python scripts/infer_mlx.py \
--model ./models/supertonic-3 \
--text "Supertonic 3 is running with MLX." \
--lang en \
--voice M1 \
--total-step 8 \
--output output.wav
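
To call the inference script from Python, for example to synthesize several sentences in a loop, one option is to build the same argument list and hand it to `subprocess.run`. The helper names below (`build_infer_command`, `synthesize`) are hypothetical, and the sketch uses only the flags and paths shown in the command above, assuming it is run from the `supertonic-mlx` checkout.

```python
import subprocess


def build_infer_command(
    text,
    output,
    model="./models/supertonic-3",
    lang="en",
    voice="M1",
    total_step=8,
    python=".venv/bin/python",
):
    """Return the argument list for scripts/infer_mlx.py (flags as in the README)."""
    return [
        python,
        "scripts/infer_mlx.py",
        "--model", str(model),
        "--text", text,
        "--lang", lang,
        "--voice", voice,
        "--total-step", str(total_step),
        "--output", str(output),
    ]


def synthesize(text, output, **kwargs):
    """Run the script; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_infer_command(text, output, **kwargs), check=True)


# Building the command has no side effects, so it is safe to inspect first.
command = build_infer_command("Supertonic 3 is running with MLX.", "output.wav")
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems when the text contains spaces or punctuation.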
supertonic-3/
├── README.md
├── mlx_manifest.json
├── graphs/
├── weights/
└── voice_styles/
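
A quick way to confirm that a download completed is to compare the local directory with the layout above. The sketch below assumes exactly the five top-level entries shown in the tree; `missing_entries` is a hypothetical helper and not part of the runtime.

```python
import tempfile
from pathlib import Path

# Top-level entries taken from the directory layout in this README.
EXPECTED_FILES = ("README.md", "mlx_manifest.json")
EXPECTED_DIRS = ("graphs", "weights", "voice_styles")


def missing_entries(model_dir):
    """Return the expected top-level files and directories that are absent."""
    root = Path(model_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


# Demo on a deliberately incomplete temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "README.md").write_text("placeholder")
    (root / "graphs").mkdir()
    demo_missing = missing_entries(root)

print(demo_missing)  # ['mlx_manifest.json', 'weights/', 'voice_styles/']
```

For a real check, call `missing_entries("./models/supertonic-3")` and expect an empty list.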
| Component | Source | MLX handling |
|---|---|---|
| ONNX graphs | Supertone/supertonic-3 | graph topology exported to JSON |
| initializers | official ONNX assets | saved as NPZ arrays |
| runtime ops | Supertonic ONNX subset | implemented in ailuntx/supertonic-mlx with MLX arrays |
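
To make the graph-runtime idea in the table concrete, here is a minimal sketch of an interpreter that walks an ordered node list and dispatches each op to an array function. NumPy is used as a stand-in so the example runs anywhere. The node schema and the three-op registry are assumptions for illustration and do not reflect the actual op set or internals of ailuntx/supertonic-mlx.

```python
import numpy as np

# Hypothetical op registry: op name -> function over arrays.
OPS = {
    "MatMul": lambda a, b: a @ b,
    "Add": lambda a, b: a + b,
    "Relu": lambda a: np.maximum(a, 0),
}


def run_graph(nodes, initializers, feeds, output_names):
    """Evaluate nodes in order, storing every intermediate tensor by name."""
    values = {**initializers, **feeds}
    for node in nodes:
        op = node["op"]
        if op not in OPS:
            raise NotImplementedError(f"unsupported op: {op}")
        args = [values[name] for name in node["inputs"]]
        # Each toy op produces exactly one output tensor.
        values[node["outputs"][0]] = OPS[op](*args)
    return [values[name] for name in output_names]


nodes = [
    {"op": "MatMul", "inputs": ["x", "w"], "outputs": ["h"]},
    {"op": "Add", "inputs": ["h", "b"], "outputs": ["z"]},
    {"op": "Relu", "inputs": ["z"], "outputs": ["y"]},
]
initializers = {
    "w": np.array([[1.0, -1.0], [0.0, 2.0]], dtype=np.float32),
    "b": np.array([0.5, -4.0], dtype=np.float32),
}
feeds = {"x": np.array([[1.0, 1.0]], dtype=np.float32)}

(y,) = run_graph(nodes, initializers, feeds, ["y"])
print(y)  # [[1.5 0. ]]
```

Supporting only the subset of ops that the exported graphs actually use keeps such a runtime small, which matches the "Supertonic ONNX subset" wording in the table.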
The MLX graph runtime has been checked against ONNX Runtime on the official assets; per-stage maximum absolute errors are around 1e-5. The HF Space API has also returned audio successfully, and its status output reports measured wall-clock time.
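
The per-stage comparison described above comes down to a maximum absolute difference between two output arrays. A minimal sketch with NumPy follows, using synthetic arrays rather than real model outputs; the `1e-4` tolerance is an illustrative assumption, not the project's threshold.

```python
import numpy as np


def max_abs_error(reference, candidate):
    """Largest element-wise absolute difference between two same-shaped arrays."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    return float(np.max(np.abs(reference - candidate)))


# Synthetic stand-ins for one stage's ONNX Runtime and MLX outputs.
rng = np.random.default_rng(0)
reference = rng.standard_normal((4, 16)).astype(np.float32)
candidate = reference + np.float32(1e-5)

error = max_abs_error(reference, candidate)
print(f"max abs error: {error:.2e}")
assert error < 1e-4, "stage output drifted beyond the illustrative tolerance"
```

Casting to float64 before subtracting keeps the comparison itself from adding rounding error on top of the difference being measured.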
The model license follows the upstream Supertonic 3 model card (OpenRAIL).
@misc{supertonic-mlx,
title = {supertonic-mlx: Apple MLX port of Supertonic 3},
author = {ailuntx},
year = {2026},
url = {https://github.com/ailuntx/supertonic-mlx},
}
@article{kim2025supertonic,
title = {SupertonicTTS: Towards Highly Efficient and Streamlined Text-to-Speech System},
author = {Kim, Hyeongju and Yang, Jinhyeok and Yu, Yechan and Ji, Seunghun and Morton, Jacob and Bous, Frederik and Byun, Joon and Lee, Juheon},
journal = {arXiv preprint arXiv:2503.23108},
year = {2025},
url = {https://arxiv.org/abs/2503.23108},
}
Base model: Supertone/supertonic-3