TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment



A unified speech-language model that synchronizes speech and text into a single, cohesive stream via 1:1 alignment.


Text-Acoustic Dual-Alignment Large Language Model

By pairing a novel tokenizer with a matching architectural design, TADA achieves high-fidelity speech synthesis and generation at a fraction of the computational cost of traditional models.
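To make the 1:1 alignment idea concrete, here is a minimal illustrative sketch (not the actual TADA API; the token names and the helper function are hypothetical): each text token is paired with the acoustic token covering the same time span, and the pairs are interleaved into a single stream that one language model can process autoregressively.

```python
# Illustrative sketch of 1:1 text-acoustic alignment.
# NOTE: toy tokens and function names are hypothetical, not TADA's real API.

def interleave_1to1(text_tokens, speech_tokens):
    """Interleave aligned text and acoustic tokens into one stream.

    A strict 1:1 alignment means both sequences have the same length,
    so every text token is immediately followed by its acoustic partner.
    """
    assert len(text_tokens) == len(speech_tokens), "1:1 alignment needs equal lengths"
    stream = []
    for t, s in zip(text_tokens, speech_tokens):
        stream.append(t)  # text token for this time span
        stream.append(s)  # acoustic code for the same span
    return stream

text = ["HE", "LLO"]            # toy text tokens
speech = ["<a17>", "<a42>"]     # toy acoustic codes from the codec
print(interleave_1to1(text, speech))
# ['HE', '<a17>', 'LLO', '<a42>']
```

Because the two modalities share one stream, the model needs no separate cross-attention path between a text branch and a speech branch, which is where the computational savings come from.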

⭐️ arxiv: https://arxiv.org/abs/2602.23068
⭐️ demo: https://huggingface.co/spaces/HumeAI/tada
⭐️ github: https://github.com/HumeAI/tada
⭐️ blog post:
