Chatterbox Multilingual TTS — 8-bit Quantized
8-bit quantized port of ResembleAI's Chatterbox Multilingual TTS, reducing memory footprint while preserving the original 23-language coverage.
This repository contains only the quantized weights plus auxiliary text-processing files. Model weights, architecture, and training are entirely ResembleAI's work — all credit for the underlying model goes to the Chatterbox team. This port adds only 8-bit quantization and bundles per-language text-processing helpers.
What's included
| File / Directory | Role |
|---|---|
| model.safetensors | 8-bit quantized model weights (1.33 GB) |
| config.json | Model config |
| tokenizer.json / tokenizer_config.json / vocab.txt | Tokenizer |
| Cangjie5_TC.json | Traditional Chinese Cangjie input-method dictionary (Chinese text preprocessing) |
| russian_stress_dict.json.gz | Russian word-stress dictionary (stress mark insertion for better pronunciation) |
| HebrewDiacritization.mlmodelc / .mlpackage | Core ML model for adding nikud (Hebrew vowel marks) so Hebrew text renders pronounceably |
The auxiliary files cover languages where the written script doesn't fully specify pronunciation. Load them alongside the main model to get quality comparable to the float-precision original for Chinese, Russian, and Hebrew.
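As a rough illustration of loading one of these auxiliaries, the sketch below reads a gzip-compressed JSON file such as russian_stress_dict.json.gz using only the Python standard library. The helper name `load_gzip_json` is hypothetical, and the internal structure of the dictionary is not documented here, so inspect the loaded object before relying on any particular layout.

```python
import gzip
import json
from pathlib import Path
from typing import Any, Union


def load_gzip_json(path: Union[str, Path]) -> Any:
    """Read a gzip-compressed, UTF-8 encoded JSON file and return the parsed object.

    Assumption: the file is plain JSON wrapped in gzip, as the
    ".json.gz" suffix suggests. The shape of the parsed data is unknown.
    """
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)


# Example (path is illustrative; point it at your local copy of the file):
# stress_dict = load_gzip_json("russian_stress_dict.json.gz")
# print(type(stress_dict))
```

Text mode with an explicit UTF-8 encoding avoids platform-dependent decoding of Cyrillic entries.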
Languages
23 languages, matching the ResembleAI original: Arabic, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Norwegian, Polish, Portuguese, Russian, Spanish, Swahili, Swedish, Turkish.
Model details
- Base parameter count: ~0.3B (matches ResembleAI original)
- Quantization: 8-bit weights
- Format: Safetensors (tensor dtypes: F32 for scales/biases, U32 for packed int8 weights)
- Features preserved from base: zero-shot voice cloning, emotion exaggeration, alignment-informed inference
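To check the tensor dtypes listed above without loading any weights, one option is to read the Safetensors header directly. The sketch below relies only on the published Safetensors layout (an 8-byte little-endian header length followed by a JSON header); the function names are illustrative and not part of any Chatterbox API.

```python
import json
import struct
from collections import Counter
from typing import Dict


def read_safetensors_header(path: str) -> Dict[str, dict]:
    """Return the per-tensor entries from a Safetensors file header."""
    with open(path, "rb") as fh:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", fh.read(8))
        header = json.loads(fh.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def count_dtypes(header: Dict[str, dict]) -> Counter:
    """Count how many tensors use each dtype string (e.g. "F32", "U32")."""
    return Counter(entry["dtype"] for entry in header.values())


# Example (path is illustrative):
# print(count_dtypes(read_safetensors_header("model.safetensors")))
```

Because only the header is read, this stays fast even for the full 1.33 GB file.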
Targets
Mixed. The main model weights are framework-agnostic Safetensors and, together with the tokenizer artifacts, are usable anywhere Chatterbox is. The bundled HebrewDiacritization.mlmodelc / .mlpackage is a Core ML model intended for on-device use on Apple platforms (iOS 17+, macOS 14+). If you're running on another platform, swap in your preferred Hebrew nikud source.
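A minimal sketch of the platform decision described above, for a Python host: it checks whether the machine is macOS 14 or later before attempting to use the Core ML diacritizer, and otherwise signals that a different nikud source is needed. The function names are hypothetical, and the check covers macOS only (it does not attempt to detect iOS).

```python
import platform


def coreml_diacritizer_supported(system: str, mac_version: str) -> bool:
    """Return True if the given platform meets the macOS 14+ requirement.

    `system` is a value like platform.system() returns ("Darwin" on macOS);
    `mac_version` is a dotted release string such as "14.2.1".
    """
    if system != "Darwin" or not mac_version:
        return False
    major = int(mac_version.split(".")[0])
    return major >= 14


def current_platform_supported() -> bool:
    """Apply the check to the machine this code is running on."""
    return coreml_diacritizer_supported(platform.system(), platform.mac_ver()[0])


# Example:
# if not current_platform_supported():
#     ...  # fall back to another Hebrew nikud source of your choice
```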
Quantization
Quantized from the original float checkpoint. Accuracy vs. the float baseline depends on your workload — run audio-quality A/B checks against the ResembleAI original if exact parity matters for your use case.
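For intuition about what 8-bit weights with F32 scales and biases packed into U32 words can look like, here is a small sketch of one common scheme: four 8-bit values per 32-bit word, dequantized as `scale * q + bias`. The byte order, the grouping, and the affine formula are all assumptions for illustration; this card does not specify the exact layout used in model.safetensors.

```python
from typing import List, Sequence


def unpack_u32_to_bytes(words: Sequence[int]) -> List[int]:
    """Split each 32-bit word into four 8-bit values.

    Assumption: the lowest-order byte holds the first value.
    """
    values: List[int] = []
    for word in words:
        for shift in (0, 8, 16, 24):
            values.append((word >> shift) & 0xFF)
    return values


def dequantize_group(q: Sequence[int], scale: float, bias: float) -> List[float]:
    """Affine dequantization for one group sharing a scale and a bias (assumed form)."""
    return [scale * v + bias for v in q]


# One packed word holding the values 1, 2, 3, 4:
q_values = unpack_u32_to_bytes([0x04030201])
weights = dequantize_group(q_values, scale=0.5, bias=-1.0)
```

With these inputs, `q_values` is `[1, 2, 3, 4]` and `weights` is `[-0.5, 0.0, 0.5, 1.0]`. The rounding that happens when float weights are mapped onto 256 levels per group is the source of any quality difference from the float baseline.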
License & Attribution
This port inherits the MIT license from ResembleAI. See the original Chatterbox model card for terms.
The model weights, architecture, and training are ResembleAI's work. This repository provides only 8-bit quantization and bundled text-processing auxiliaries. Please cite and credit ResembleAI for any use of the underlying model.
Links
- Original model: https://huggingface.co/ResembleAI/chatterbox
- License: MIT