NeuTTS Nano GGUF

GGUF conversions of Neuphonic's NeuTTS Nano, a compact (~228 M params, ~117 M active) on-device TTS LM with instant voice cloning.

The stack is split so each runtime owns the part it's good at:

  • LLM-part: maps phonemized text plus reference codes to speech token IDs; stock llama arch, runs in llama.cpp.
  • Codec-part: converts speech token IDs to/from 24 kHz PCM via NeuCodec; runs in codec.cpp.

Files

LLM-part (llama arch, vocab 194256, RoPE θ=500000 + linear scaling x32)

neutts-nano-<quant>.gguf

File                       Size
neutts-nano-f32.gguf       925 MB
neutts-nano-f16.gguf       467 MB
neutts-nano-bf16.gguf      467 MB
neutts-nano-q8_0.gguf      253 MB
neutts-nano-q6_k.gguf      245 MB
neutts-nano-q5_k_m.gguf    217 MB
neutts-nano-q5_k_s.gguf    214 MB
neutts-nano-q5_1.gguf      216 MB
neutts-nano-q5_0.gguf      209 MB
neutts-nano-q4_k_m.gguf    210 MB
neutts-nano-q4_k_s.gguf    205 MB
neutts-nano-q4_1.gguf      202 MB
neutts-nano-q4_0.gguf      194 MB
neutts-nano-q3_k_l.gguf    200 MB
neutts-nano-q3_k_m.gguf    196 MB
neutts-nano-q3_k_s.gguf    190 MB
neutts-nano-q2_k.gguf      190 MB

Codec-part (NeuCodec, neucodec arch, 65536 codebook, 24 kHz, 50 Hz token rate)

codec[-<quant>].gguf

File                 Size
codec-f32.gguf       928 MB
codec-f16.gguf       465 MB
codec-q8_0.gguf      326 MB
codec-q5_k_m.gguf    270 MB
codec-q4_k_m.gguf    252 MB

Prompt format

The reference flow (from neuphonic/neutts):

user: Convert the text to speech:<|TEXT_PROMPT_START|>{ref_phones} {input_phones}<|TEXT_PROMPT_END|>
assistant:<|SPEECH_GENERATION_START|>{ref_codes_as_speech_tokens}
  • {ref_phones} / {input_phones}: IPA-phonemized text (espeak-ng / phonemizer).
  • {ref_codes_as_speech_tokens}: concatenation of <|speech_N|> strings, one for each NeuCodec token of a reference clip.
  • Generation continues with more <|speech_N|> tokens until <|SPEECH_GENERATION_END|>.
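As a rough illustration of the template above, the following Python sketch assembles the prompt string from already-phonemized text and a list of reference codec codes. The function name build_prompt and its parameters are hypothetical, and the newline between the user and assistant turns is an assumption taken from the two-line layout shown above; check it against the upstream neuphonic/neutts code before relying on it.

```python
from typing import Sequence


def build_prompt(ref_phones: str, input_phones: str, ref_codes: Sequence[int]) -> str:
    """Fill the prompt template with phonemes and reference speech tokens.

    ref_codes are NeuCodec codebook indices (0..65535) of the reference clip.
    The name and signature are illustrative, not part of any library.
    """
    for code in ref_codes:
        if not 0 <= code <= 65535:
            raise ValueError(f"codec code out of range: {code}")

    # Each codec code becomes one <|speech_N|> token string.
    speech_tokens = "".join(f"<|speech_{code}|>" for code in ref_codes)

    # Assumption: the two turns are separated by a single newline.
    return (
        "user: Convert the text to speech:"
        f"<|TEXT_PROMPT_START|>{ref_phones} {input_phones}<|TEXT_PROMPT_END|>\n"
        f"assistant:<|SPEECH_GENERATION_START|>{speech_tokens}"
    )
```

The resulting string is what gets tokenized and passed to the LLM-part; the model is then expected to continue the assistant turn with further speech tokens.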

Speech-token vocab range: <|speech_0|> = id 128262 through <|speech_65535|> = id 193797. Stop token: <|SPEECH_GENERATION_END|> = id 128261.
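Because the speech tokens occupy one contiguous ID range, converting between vocab IDs and codec codes is a fixed offset. The sketch below uses only the IDs stated above; the helper names are hypothetical, and silently skipping non-speech IDs in extract_codes is one possible policy, not necessarily what the reference implementation does.

```python
from typing import Iterable, List

SPEECH_ID_BASE = 128262   # id of <|speech_0|>
CODEBOOK_SIZE = 65536     # <|speech_0|> .. <|speech_65535|>
STOP_ID = 128261          # id of <|SPEECH_GENERATION_END|>


def code_to_token_id(code: int) -> int:
    """Map a codec code (0..65535) to its vocab ID."""
    if not 0 <= code < CODEBOOK_SIZE:
        raise ValueError(f"codec code out of range: {code}")
    return SPEECH_ID_BASE + code


def token_id_to_code(token_id: int) -> int:
    """Map a speech-token vocab ID back to its codec code."""
    code = token_id - SPEECH_ID_BASE
    if not 0 <= code < CODEBOOK_SIZE:
        raise ValueError(f"not a speech token id: {token_id}")
    return code


def extract_codes(generated_ids: Iterable[int]) -> List[int]:
    """Collect codec codes from generated IDs, stopping at the stop token.

    Assumption: IDs outside the speech range are skipped rather than
    treated as errors.
    """
    codes = []
    for token_id in generated_ids:
        if token_id == STOP_ID:
            break
        if SPEECH_ID_BASE <= token_id < SPEECH_ID_BASE + CODEBOOK_SIZE:
            codes.append(token_id - SPEECH_ID_BASE)
    return codes
```

The list returned by extract_codes is what would be handed to the codec decode path to produce 24 kHz PCM.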

Notes

  • Reference encoding (audio → speech tokens) needs the codec encode path.
  • Phonemization isn't part of the GGUF: apply espeak-ng (or equivalent) on the host before tokenizing prompts; raw English text will be off-distribution.
  • Source weights: neuphonic/neutts-nano (LLM) and neuphonic/neucodec (codec).
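
For the host-side phonemization step mentioned in the notes, one possible approach is to call the espeak-ng command-line tool directly. This is a minimal sketch that assumes espeak-ng is installed and on PATH and that the en-us voice is appropriate; the function names are hypothetical. The upstream project may phonemize through the phonemizer package with different options (stress marks, punctuation handling), so the output here is not guaranteed to match the training distribution exactly.

```python
import subprocess
from typing import List


def espeak_ipa_command(text: str, voice: str = "en-us") -> List[str]:
    """Build the espeak-ng invocation: -q (no audio), --ipa (IPA output), -v (voice)."""
    return ["espeak-ng", "-q", "--ipa", "-v", voice, text]


def phonemize(text: str, voice: str = "en-us") -> str:
    """Run espeak-ng and return its IPA output with whitespace collapsed.

    Assumption: espeak-ng is available on PATH; raises CalledProcessError
    if the command fails.
    """
    result = subprocess.run(
        espeak_ipa_command(text, voice),
        capture_output=True,
        text=True,
        check=True,
    )
    # espeak-ng may emit several lines; join them with single spaces.
    return " ".join(result.stdout.split())
```

The same function would be applied to both the reference transcript and the input text to obtain {ref_phones} and {input_phones}.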