Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer • Paper • 2605.30940 • Published 5 days ago • 32
Comprehensive Benchmarking of Long-Form Speech Generation in Diverse Scenarios • Paper • 2605.28618 • Published 7 days ago • 27
SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue • Paper • 2605.30993 • Published 5 days ago • 52
TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching • Paper • 2502.12572 • Published Feb 18, 2025 • 2
ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting • Paper • 2504.20630 • Published Apr 29, 2025 • 9
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis • Paper • 2312.10741 • Published Dec 17, 2023 • 1
GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks • Paper • 2409.13832 • Published Sep 20, 2024 • 1
Versatile Framework for Song Generation with Prompt-based Control • Paper • 2504.19062 • Published Apr 27, 2025 • 6
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control • Paper • 2409.15977 • Published Sep 24, 2024 • 2