MOSS-TTS-Nano GGUF (codec only)

GGUF conversion of the MOSS-Audio-Tokenizer-Nano codec used by OpenMOSS-Team/MOSS-TTS-Nano, runnable via codec.cpp.

โš ๏ธ The LLM-part is not converted yet โ€” see Status.

Files

Codec-part (MOSS-Audio-Tokenizer-Nano, moss_audio arch, 16 RVQ codebooks × 1024, 48 kHz stereo)

codec[-<quant>].gguf

File                 Size
codec-f32.gguf       84 MB
codec-f16.gguf       42 MB
codec-q8_0.gguf      24 MB
codec-q5_k_m.gguf    17 MB
codec-q4_k_m.gguf    15 MB
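
As a rough guide to the trade-off between the quantizations, the sketch below computes each file's size relative to the f32 file, using the sizes listed above. The "bits per weight" figure is only an estimate: it assumes the file size is dominated by weights stored at 32 bits in the f32 file, and the listed sizes are rounded. The function names are illustrative and not part of codec.cpp.

```python
# Sizes in MB, copied from the file list above.
SIZES_MB = {
    "codec-f32.gguf": 84,
    "codec-f16.gguf": 42,
    "codec-q8_0.gguf": 24,
    "codec-q5_k_m.gguf": 17,
    "codec-q4_k_m.gguf": 15,
}


def relative_sizes(sizes, baseline="codec-f32.gguf"):
    """Return each file's size as a fraction of the baseline file."""
    base = sizes[baseline]
    return {name: size / base for name, size in sizes.items()}


def approx_bits_per_weight(sizes, baseline="codec-f32.gguf", baseline_bits=32):
    """Estimate average bits per weight, ASSUMING the file is almost all weights."""
    ratios = relative_sizes(sizes, baseline)
    return {name: baseline_bits * ratio for name, ratio in ratios.items()}


if __name__ == "__main__":
    ratios = relative_sizes(SIZES_MB)
    bits = approx_bits_per_weight(SIZES_MB)
    for name in SIZES_MB:
        print(f"{name:20s} {ratios[name]:6.1%} of f32, ~{bits[name]:.1f} bits/weight")
```

The estimates for the quantized files come out above the nominal bit widths, which is expected if some tensors and the metadata are kept at higher precision.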

Status

The MOSS-TTS-Nano LLM-part is a custom MossTTSNanoForCausalLM architecture that stock llama.cpp doesn't load:

  • GPT-2 backbone with RoPE (position_embedding_type: "rope", rope_base: 10000). llama.cpp's gpt2 arch only handles learned absolute positions, not RoPE
  • 16 RVQ codebooks (1024 entries each) emitted per audio frame
  • Global + local transformer: the GPT-2 global produces a hidden state per timestep and a 1-layer local transformer expands it into the 16 codebook predictions
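
To illustrate why RoPE cannot be swapped for learned absolute positions, here is a minimal pure-Python sketch of rotary position embeddings with base 10000. It rotates interleaved pairs of a query or key vector by position-dependent angles. The interleaved pairing is an assumption; the actual MOSS-TTS-Nano implementation may pair dimensions differently, and the function names are illustrative.

```python
import math


def rope_angles(position, head_dim, rope_base=10000.0):
    """Rotation angle for each dimension pair at a given position."""
    return [position * rope_base ** (-2.0 * i / head_dim)
            for i in range(head_dim // 2)]


def apply_rope(vec, position, rope_base=10000.0):
    """Rotate pairs (vec[2i], vec[2i+1]) by their position-dependent angle.

    ASSUMPTION: interleaved pairing; some implementations pair the first
    half of the vector with the second half instead.
    """
    if len(vec) % 2 != 0:
        raise ValueError("head dimension must be even")
    out = []
    for i, theta in enumerate(rope_angles(position, len(vec), rope_base)):
        x, y = vec[2 * i], vec[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    q = [0.5, -1.0, 2.0, 0.25]
    k = [1.5, 0.5, -0.75, 1.0]
    # Attention scores depend only on the relative offset (2 in both cases).
    print(dot(apply_rope(q, 3), apply_rope(k, 1)))
    print(dot(apply_rope(q, 7), apply_rope(k, 5)))
```

Because the position is applied inside attention to queries and keys rather than added to the token embeddings, a graph built for learned absolute positions has no place to apply it.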

A working LLM-part GGUF would need a custom moss_tts_nano architecture inside llama.cpp (model loader, graph builder for global + local transformer, multi-codebook output handling). That's substantial work, comparable to or larger than implementing Chatterbox-T3.
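
The multi-codebook output is the part that differs most from a standard causal LM, which emits one token per step. The toy sketch below shows only the shapes involved: one hidden state per timestep is expanded into 16 logit vectors of 1024 entries, and one code is picked from each. The random linear heads are a hypothetical stand-in for the 1-layer local transformer, the hidden size of 8 is a toy value, and any conditioning of later codebooks on earlier ones is not modeled.

```python
import random

NUM_CODEBOOKS = 16    # RVQ codebooks per audio frame (from the list above)
CODEBOOK_SIZE = 1024  # entries per codebook (from the list above)
HIDDEN_DIM = 8        # toy value, NOT the real model width


def make_toy_heads(seed=0):
    """Hypothetical stand-in for the local transformer: one random
    linear head per codebook, shape [16][1024][HIDDEN_DIM]."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1.0, 1.0) for _ in range(HIDDEN_DIM)]
             for _ in range(CODEBOOK_SIZE)]
            for _ in range(NUM_CODEBOOKS)]


def expand_hidden_state(hidden, heads):
    """Map one global hidden state to 16 logit vectors of length 1024."""
    return [[sum(w * h for w, h in zip(row, hidden)) for row in head]
            for head in heads]


def greedy_frame(logits_per_codebook):
    """Pick the argmax entry of each codebook: one frame of 16 codes."""
    return [max(range(len(logits)), key=logits.__getitem__)
            for logits in logits_per_codebook]


def decode_frames(hidden_states, heads):
    """Turn a sequence of hidden states into a [timesteps][16] code grid."""
    return [greedy_frame(expand_hidden_state(h, heads)) for h in hidden_states]


if __name__ == "__main__":
    rng = random.Random(1)
    hidden_states = [[rng.uniform(-1.0, 1.0) for _ in range(HIDDEN_DIM)]
                     for _ in range(3)]
    frames = decode_frames(hidden_states, make_toy_heads())
    print(len(frames), "frames x", len(frames[0]), "codes")
```

A real implementation would also have to feed each emitted frame of 16 codes back into the global transformer as the next input, which stock llama.cpp sampling does not provide for.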

For now this repo only ships the codec; the LLM-part will follow if/when the architecture lands in llama.cpp.

Notes
