owensong (Owen Song)

liked a model about 4 hours ago

remixerdec/Inflect-Nano-v1-GGUF

Text-to-Speech • 3.47M • Updated about 7 hours ago • 1

updated a model about 8 hours ago

owensong/Inflect-Nano-v1

Text-to-Speech • Updated about 8 hours ago • 128

New activity in owensong/Inflect-Nano-v1 about 13 hours ago

Architecture

5

#1 opened 2 days ago by

yukiarimo

New activity in Luigi/Inflect-Nano-v1-ONNX 1 day ago

From the creator of Inflect: Thank you!

❤️ 1

#1 opened 1 day ago by

owensong

liked a Space 1 day ago

Inflect-Nano-v1 ONNX Demo

🗣

1

Tiny 4.63M-param English TTS, ONNX on CPU

liked a model 1 day ago

Luigi/Inflect-Nano-v1-ONNX

Text-to-Speech • Updated 1 day ago • 3

reacted to their post with 🔥 2 days ago

Post

6050

I just released Inflect-Nano-v1, an ultra-small 4.63M-parameter text-to-speech model.

The main idea is simple: instead of only making the acoustic model tiny and relying on a larger external vocoder, Inflect-Nano-v1 keeps the complete text-to-waveform stack under 5M parameters.

Quick facts:
- 4.63M total inference parameters
- 3.46M acoustic model
- 1.17M vocoder
- 24 kHz audio
- English-only
- Single male voice
- Runs locally with a simple PyTorch inference script
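
As a rough sanity check on the numbers above, here is a minimal sketch of the parameter arithmetic. The three constants come from the list; the helper names and the bytes-per-parameter figures (4 for fp32, 2 for fp16) are illustrative assumptions, not part of the released inference script, and real file sizes will differ with format and metadata.

```python
# Figures from the "Quick facts" list; everything else is illustrative.
ACOUSTIC_PARAMS = 3_460_000   # 3.46M acoustic model
VOCODER_PARAMS = 1_170_000    # 1.17M vocoder
SAMPLE_RATE_HZ = 24_000       # 24 kHz output audio


def total_params() -> int:
    """Parameters needed at inference time (acoustic model + vocoder)."""
    return ACOUSTIC_PARAMS + VOCODER_PARAMS


def weight_size_mb(params: int, bytes_per_param: int) -> float:
    """Approximate raw weight size in megabytes (10^6 bytes), ignoring file overhead."""
    return params * bytes_per_param / 1e6


def samples_for(seconds: float) -> int:
    """Number of waveform samples the vocoder must emit for a clip of this length."""
    return round(seconds * SAMPLE_RATE_HZ)


if __name__ == "__main__":
    n = total_params()
    print(f"total parameters: {n / 1e6:.2f}M")
    print(f"fp32 weights: ~{weight_size_mb(n, 4):.2f} MB")
    print(f"fp16 weights: ~{weight_size_mb(n, 2):.2f} MB")
    print(f"samples in a 2.5 s clip: {samples_for(2.5)}")
```

Under these assumptions the two components add up to the stated 4.63M, which works out to roughly 18.5 MB of raw fp32 weights.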

Why I made it:
Most modern TTS models are much larger, and even many “small TTS” projects depend on a separate vocoder. I wanted to see how far a complete tiny TTS stack could be pushed while still producing usable speech.

It is not SOTA, and I am not trying to claim it competes with large TTS systems. The interesting part is the size-to-functionality ratio.

What works:
It can generate arbitrary English speech locally, and the model is small enough to be interesting for:

- local voice assistants
- embedded/edge experiments
- browser or WASM-style TTS exploration
- efficient inference research
- tiny-model baselines

Limitations:
The quality is still limited. It can sound robotic and stumble on difficult unseen text, and the vocoder is still a clear bottleneck. Long or unusual prompts are less reliable.

So I would frame this as a research/demo release, not a production TTS engine.

I’d love feedback from people interested in:
- tiny speech models
- vocoders
- local TTS
- efficient inference
- embedded speech synthesis
- improving small-model generalization

If people find it useful, I’m interested in putting more training budget into a stronger v2.

Model page:
owensong/Inflect-Nano-v1
