Chatterbox Multilingual TTS — 8-bit Quantized

8-bit quantized port of ResembleAI's Chatterbox Multilingual TTS, reducing memory footprint while preserving the original 23-language coverage.

This repository contains the quantized weights, the model config and tokenizer files, and auxiliary text-processing files. The model weights, architecture, and training are entirely ResembleAI's work, and all credit for the underlying model goes to the Chatterbox team. This port adds only 8-bit quantization and bundles per-language text-processing helpers.

What's included

File / Directory	Role
`model.safetensors`	8-bit quantized model weights (1.33 GB)
`config.json`	Model config
`tokenizer.json` / `tokenizer_config.json` / `vocab.txt`	Tokenizer
`Cangjie5_TC.json`	Traditional Chinese Cangjie input-method dictionary (Chinese text preprocessing)
`russian_stress_dict.json.gz`	Russian word-stress dictionary (stress mark insertion for better pronunciation)
`HebrewDiacritization.mlmodelc` / `.mlpackage`	Core ML model that adds nikud (Hebrew vowel marks) so that Hebrew text can be pronounced correctly

The auxiliary files cover languages where the written script doesn't fully specify pronunciation. Load them alongside the main model to get quality comparable to the float-precision original for Chinese, Russian, and Hebrew.
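As an illustration of how one of these auxiliary files might be consumed, the sketch below reads a gzip-compressed JSON dictionary and applies it word by word. The schema of `russian_stress_dict.json.gz` is not documented here, so the flat word-to-stressed-form mapping and the helper names `load_stress_dict` and `add_stress` are assumptions. Inspect the actual file before relying on this.

```python
import gzip
import json
import re


def load_stress_dict(path):
    """Read a gzip-compressed JSON file into a dict.

    Assumption: the file is a flat mapping of word -> stressed word.
    """
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def add_stress(text, stress_dict):
    """Replace each word found in the dictionary; leave everything else as-is.

    The lookup is case-insensitive, and the dictionary's own casing is
    returned for words that match.
    """
    def repl(match):
        word = match.group(0)
        return stress_dict.get(word.lower(), word)

    # \w+ matches Unicode word characters, including Cyrillic letters.
    return re.sub(r"\w+", repl, text)
```

A call would look like `add_stress(text, load_stress_dict("russian_stress_dict.json.gz"))`, with the result passed on to the tokenizer.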

Languages

23 languages, matching the ResembleAI original: Arabic, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Norwegian, Polish, Portuguese, Russian, Spanish, Swahili, Swedish, Turkish.

Model details

Base parameter count: ~0.3B (matches ResembleAI original)
Quantization: 8-bit weights
Format: Safetensors (tensor dtypes: F32 for scales/biases, U32 for packed int8 weights)
Features preserved from base: zero-shot voice cloning, emotion exaggeration, alignment-informed inference
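The tensor dtypes listed above can be checked without loading any weights, because a Safetensors file starts with an 8-byte little-endian header length followed by a JSON header describing every tensor. The following is a minimal sketch using only the standard library; the helper names `read_safetensors_header` and `dtype_counts` are illustrative and not part of any Chatterbox API.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit size of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # Optional free-form metadata entry; it is not a tensor.
    header.pop("__metadata__", None)
    return header


def dtype_counts(header):
    """Count tensors per dtype string, e.g. 'F32' or 'U32'."""
    return Counter(entry["dtype"] for entry in header.values())
```

Running `dtype_counts(read_safetensors_header("model.safetensors"))` should show how many tensors are stored as each dtype.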

Targets

Mixed. The main model weights are stored as Safetensors and the tokenizer files as JSON and plain text, so both are framework-agnostic and usable anywhere Chatterbox is. The bundled HebrewDiacritization.mlmodelc / .mlpackage is a Core ML model intended for on-device Apple platforms (iOS 17+, macOS 14+). On any other platform, substitute your preferred Hebrew nikud source.

Quantization

Quantized from the original float checkpoint. Accuracy vs. the float baseline depends on your workload — run audio-quality A/B checks against the ResembleAI original if exact parity matters for your use case.
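This card does not state the exact quantization scheme, so the following is only a generic sketch of why 8-bit storage introduces rounding error. It shows affine quantization with one scale and one bias per group of weights, plus packing four 8-bit codes into a 32-bit word. The min/max calibration, the group layout, and the lowest-byte-first packing order are all assumptions, not a description of how these weights were produced.

```python
def quantize_group(values):
    """Affine 8-bit quantization of one group: value ≈ scale * code + bias."""
    lo, hi = min(values), max(values)
    # Fall back to 1.0 so a constant group does not produce a zero scale.
    scale = ((hi - lo) / 255) or 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate float values from 8-bit codes."""
    return [scale * c + bias for c in codes]


def pack_u32(codes):
    """Pack four 8-bit codes per 32-bit word, lowest byte first (assumed order)."""
    assert len(codes) % 4 == 0
    return [
        codes[i] | (codes[i + 1] << 8) | (codes[i + 2] << 16) | (codes[i + 3] << 24)
        for i in range(0, len(codes), 4)
    ]


def unpack_u32(words):
    """Inverse of pack_u32: split each 32-bit word back into four 8-bit codes."""
    return [(w >> shift) & 0xFF for w in words for shift in (0, 8, 16, 24)]
```

Under this scheme each reconstructed weight differs from the original by at most half a scale step. How that per-weight error shows up in synthesized audio depends on the model and the input, which is why listening tests against the float baseline are the practical check.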

License & Attribution

This port inherits the MIT license from ResembleAI. See the original Chatterbox model card for terms.

The model weights, architecture, and training are ResembleAI's work. This repository provides only 8-bit quantization and bundled text-processing auxiliaries. Please cite and credit ResembleAI for any use of the underlying model.

Model tree

Base model: ResembleAI/chatterbox
Quantized (this model): smdesai/Chatterbox-Multilingual-TTS-8bit