MOSS-TTS-Nano GGUF (codec only)
GGUF conversion of the MOSS-Audio-Tokenizer-Nano codec used by OpenMOSS-Team/MOSS-TTS-Nano, runnable via codec.cpp.
⚠️ The LLM-part is not converted yet; see Status.
Files
Codec-part (MOSS-Audio-Tokenizer-Nano, moss_audio arch, 16 RVQ codebooks × 1024, 48 kHz stereo)
codec[-<quant>].gguf
| File | Size |
|---|---|
| codec-f32.gguf | 84 MB |
| codec-f16.gguf | 42 MB |
| codec-q8_0.gguf | 24 MB |
| codec-q5_k_m.gguf | 17 MB |
| codec-q4_k_m.gguf | 15 MB |
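As a rough illustration of choosing between these files, the sketch below picks the largest listed file that fits a size budget and optionally fetches it with `hf_hub_download` from `huggingface_hub`. The helper names `pick_codec_file` and `download_codec` are hypothetical, and the sketch assumes the files sit at the root of the `hans00/MOSS-TTS-Nano-GGUF` repo.

```python
# Sizes in MB, copied from the table above.
CODEC_FILES = {
    "codec-f32.gguf": 84,
    "codec-f16.gguf": 42,
    "codec-q8_0.gguf": 24,
    "codec-q5_k_m.gguf": 17,
    "codec-q4_k_m.gguf": 15,
}


def pick_codec_file(budget_mb: float) -> str:
    """Return the largest listed file whose size does not exceed budget_mb."""
    fitting = {name: size for name, size in CODEC_FILES.items() if size <= budget_mb}
    if not fitting:
        raise ValueError(f"no codec file fits in {budget_mb} MB")
    return max(fitting, key=fitting.get)


def download_codec(filename: str, repo_id: str = "hans00/MOSS-TTS-Nano-GGUF") -> str:
    """Download one file from the Hub and return its local path.

    Assumption: the file lives at the repo root under exactly this name.
    """
    # Imported lazily so the selection helper works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)


if __name__ == "__main__":
    # Selection only; call download_codec(...) on the result to fetch the file.
    print(pick_codec_file(30))
```

Lower-bit files trade reconstruction quality for size, so a budget-based pick like this is only a starting point; listening tests on your own material are the real check.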
Status
The MOSS-TTS-Nano LLM-part is a custom MossTTSNanoForCausalLM architecture that stock llama.cpp doesn't load:
- GPT-2 backbone with RoPE (`position_embedding_type: "rope"`, `rope_base: 10000`); llama.cpp's `gpt2` arch only handles learned absolute positions, not RoPE
- 16 RVQ codebooks (1024 entries each) emitted per audio frame
- Global + local transformer: the GPT-2 global produces a hidden state per timestep and a 1-layer local transformer expands it into the 16 codebook predictions
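To make the RoPE point concrete, here is a minimal pure-Python sketch of rotary position embedding with base 10000. It is a generic illustration, not the MOSS-TTS-Nano implementation: the function name `rope_rotate` is hypothetical, and pairing adjacent dimensions is an assumption (some implementations pair the first half of the vector with the second half instead).

```python
import math

ROPE_BASE = 10000.0  # matches rope_base in the config quoted above


def rope_rotate(vec, position, base=ROPE_BASE):
    """Rotate adjacent pairs (x0, x1), (x2, x3), ... by position-dependent angles."""
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even dimension")
    out = []
    for i in range(0, dim, 2):
        # Pair k = i / 2 uses frequency base^(-2k / dim) = base^(-i / dim).
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    q = [0.5, -1.0, 2.0, 0.25]
    k = [1.5, 0.5, -0.75, 1.0]
    # Attention scores depend only on the offset between positions (2 in both cases).
    print(dot(rope_rotate(q, 5), rope_rotate(k, 3)))
    print(dot(rope_rotate(q, 12), rope_rotate(k, 10)))
```

Because the position enters through a rotation applied to queries and keys rather than through a learned embedding table, a loader written for learned absolute positions has no place to apply it, which is why the stock `gpt2` graph does not fit.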
A working LLM-part GGUF would need a custom moss_tts_nano architecture inside llama.cpp (model loader, graph builder for global + local transformer, multi-codebook output handling). That's substantial work, comparable to or larger than implementing Chatterbox-T3.
For now this repo only ships the codec; the LLM-part will follow if/when the architecture lands in llama.cpp.
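For orientation, each audio frame is represented by 16 indices, one per RVQ codebook, each in the range 0 to 1023. The sketch below shows the generic residual-vector-quantization lookup, where a frame's latent is the sum of one entry per codebook. It is a toy under stated assumptions: the codebooks are random, `DIM` is a made-up width, and the real MOSS-Audio-Tokenizer-Nano may apply extra projections or a different combination rule.

```python
import random

NUM_CODEBOOKS = 16    # from the model description
CODEBOOK_SIZE = 1024  # entries per codebook, from the model description
DIM = 8               # hypothetical latent width, kept small for the demo


def make_codebooks(seed=0):
    """Build random stand-in codebooks: [codebook][entry][dim]."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in range(CODEBOOK_SIZE)]
        for _ in range(NUM_CODEBOOKS)
    ]


def rvq_decode_frame(codes, codebooks):
    """Sum the selected entry from each codebook to get one frame latent."""
    if len(codes) != len(codebooks):
        raise ValueError("need exactly one index per codebook")
    latent = [0.0] * len(codebooks[0][0])
    for index, book in zip(codes, codebooks):
        if not 0 <= index < len(book):
            raise ValueError(f"index {index} out of range")
        for d, value in enumerate(book[index]):
            latent[d] += value
    return latent


if __name__ == "__main__":
    books = make_codebooks()
    frame_codes = [i * 7 % CODEBOOK_SIZE for i in range(NUM_CODEBOOKS)]
    print(len(rvq_decode_frame(frame_codes, books)))
```

An LLM-part port would have to emit all 16 indices per timestep before a frame like this can be handed to the codec, which is the multi-codebook output handling mentioned above.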
Notes
- Source weights: OpenMOSS-Team/MOSS-Audio-Tokenizer-Nano
- Full upstream Python pipeline: OpenMOSS-Team/MOSS-TTS-Nano
- Base model for hans00/MOSS-TTS-Nano-GGUF: OpenMOSS-Team/MOSS-TTS-Nano-100M