NeuTTS Nano GGUF
GGUF conversions of Neuphonic's NeuTTS Nano, a compact (~228 M params, ~117 M active) on-device TTS LM with instant voice cloning.
The stack is split so each runtime owns the part it's good at:
- LLM-part: text and reference codes to speech token IDs, stock llama arch, runs in llama.cpp.
- Codec-part: speech token IDs to/from 24 kHz PCM via NeuCodec, runs in codec.cpp.
Files
LLM-part (llama arch, vocab 194256, RoPE θ=500000 + linear scaling x32)
neutts-nano-<quant>.gguf
| File | Size |
|---|---|
| neutts-nano-f32.gguf | 925 MB |
| neutts-nano-f16.gguf | 467 MB |
| neutts-nano-bf16.gguf | 467 MB |
| neutts-nano-q8_0.gguf | 253 MB |
| neutts-nano-q6_k.gguf | 245 MB |
| neutts-nano-q5_k_m.gguf | 217 MB |
| neutts-nano-q5_k_s.gguf | 214 MB |
| neutts-nano-q5_1.gguf | 216 MB |
| neutts-nano-q5_0.gguf | 209 MB |
| neutts-nano-q4_k_m.gguf | 210 MB |
| neutts-nano-q4_k_s.gguf | 205 MB |
| neutts-nano-q4_1.gguf | 202 MB |
| neutts-nano-q4_0.gguf | 194 MB |
| neutts-nano-q3_k_l.gguf | 200 MB |
| neutts-nano-q3_k_m.gguf | 196 MB |
| neutts-nano-q3_k_s.gguf | 190 MB |
| neutts-nano-q2_k.gguf | 190 MB |
Codec-part (NeuCodec, neucodec arch, 65536 codebook, 24 kHz, 80 Hz token rate)
codec[-<quant>].gguf
| File | Size |
|---|---|
| codec-f32.gguf | 928 MB |
| codec-f16.gguf | 465 MB |
| codec-q8_0.gguf | 326 MB |
| codec-q5_k_m.gguf | 270 MB |
| codec-q4_k_m.gguf | 252 MB |
Prompt format
The reference flow (from neuphonic/neutts):
user: Convert the text to speech:<|TEXT_PROMPT_START|>{ref_phones} {input_phones}<|TEXT_PROMPT_END|>
assistant:<|SPEECH_GENERATION_START|>{ref_codes_as_speech_tokens}
- {ref_phones} / {input_phones}: IPA-phonemized text (espeak-ng / phonemizer).
- {ref_codes_as_speech_tokens}: concatenation of <|speech_N|> strings for each NeuCodec token of a reference clip.
- Generation continues with more <|speech_N|> tokens until <|SPEECH_GENERATION_END|>.
Speech-token vocab range: <|speech_0|> = id 128262 … <|speech_65535|> = id 193797.
Stop token: <|SPEECH_GENERATION_END|> = id 128261.
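The prompt template and ID ranges above can be sketched in a few lines of Python. This is a minimal illustration, not the upstream implementation: the helper names (`codes_to_speech_tokens`, `build_prompt`, `token_ids_to_codes`) are hypothetical, and the newline between the two turns is copied from the template as printed here, so check it against neuphonic/neutts before relying on it.

```python
# Constants taken from the ranges listed above.
SPEECH_BASE_ID = 128262   # id of <|speech_0|>
CODEBOOK_SIZE = 65536     # <|speech_0|> .. <|speech_65535|>
STOP_ID = 128261          # id of <|SPEECH_GENERATION_END|>


def codes_to_speech_tokens(codes):
    """Render NeuCodec codes as a concatenation of <|speech_N|> strings."""
    return "".join(f"<|speech_{c}|>" for c in codes)


def build_prompt(ref_phones, input_phones, ref_codes):
    """Fill the prompt template (hypothetical helper; whitespace is an assumption)."""
    return (
        "user: Convert the text to speech:"
        f"<|TEXT_PROMPT_START|>{ref_phones} {input_phones}<|TEXT_PROMPT_END|>\n"
        f"assistant:<|SPEECH_GENERATION_START|>{codes_to_speech_tokens(ref_codes)}"
    )


def token_ids_to_codes(token_ids):
    """Map generated token ids back to codec codes, stopping at the stop token."""
    codes = []
    for tid in token_ids:
        if tid == STOP_ID:
            break
        # Ignore any id outside the speech-token range.
        if SPEECH_BASE_ID <= tid < SPEECH_BASE_ID + CODEBOOK_SIZE:
            codes.append(tid - SPEECH_BASE_ID)
    return codes
```

The codes returned by `token_ids_to_codes` are what the codec-part would decode to PCM; the text of `build_prompt` still has to be tokenized by the LLM-part's own tokenizer.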
Notes
- Reference encoding (audio → speech tokens) needs the codec encode path.
- Phonemization isn't part of the GGUF; apply espeak-ng (or equivalent) on the host before tokenizing prompts. Raw English text will be off-distribution.
- Source weights: neuphonic/neutts-nano (LLM) and neuphonic/neucodec (codec).
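As a rough sketch of host-side phonemization, the snippet below shells out to the espeak-ng command-line tool. It assumes `espeak-ng` is installed and on `PATH`; the function names are hypothetical, and the `en-us` voice is an assumption. The upstream pipeline may use the phonemizer library with different stress and punctuation settings, so the IPA produced here may not match what the model was trained on exactly.

```python
import subprocess


def espeak_ipa_command(text, voice="en-us"):
    """Build the espeak-ng invocation (hypothetical helper).

    -q     suppress audio output
    --ipa  print phonemes as IPA on stdout
    -v     select the voice/language (en-us is an assumption)
    """
    return ["espeak-ng", "-q", "--ipa", "-v", voice, text]


def phonemize(text, voice="en-us"):
    """Run espeak-ng and return the IPA output as a single line."""
    result = subprocess.run(
        espeak_ipa_command(text, voice),
        capture_output=True,
        text=True,
        check=True,  # raise if espeak-ng exits with an error
    )
    # espeak-ng may print several lines; join them with spaces.
    lines = (line.strip() for line in result.stdout.splitlines())
    return " ".join(line for line in lines if line)
```

The result of `phonemize` would fill `{ref_phones}` and `{input_phones}` in the prompt template.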