Yu Zhang

AaronZ345

https://aaronz345.github.io

AI & ML interests

Multi-Modal Generative AI (Spatial Audio/Music/Singing/Speech).

Recent Activity

authored a paper 2 days ago

ALIVE: Animate Your World with Lifelike Audio-Video Generation

authored a paper 2 days ago

SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue

authored a paper 2 days ago

Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer

upvoted 3 papers 2 days ago

Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer

Paper • 2605.30940 • Published 5 days ago • 32

Comprehensive Benchmarking of Long-Form Speech Generation in Diverse Scenarios

Paper • 2605.28618 • Published 7 days ago • 27

SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue

Paper • 2605.30993 • Published 5 days ago • 52

upvoted a paper 9 months ago

ASAudio: A Survey of Advanced Spatial Audio Research

Paper • 2508.10924 • Published Aug 8, 2025 • 1

upvoted 7 papers about 1 year ago

Robust Singing Voice Transcription Serves Synthesis

Paper • 2405.09940 • Published May 16, 2024 • 1

TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching

Paper • 2502.12572 • Published Feb 18, 2025 • 2

ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting

Paper • 2504.20630 • Published Apr 29, 2025 • 9

StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis

Paper • 2312.10741 • Published Dec 17, 2023 • 1

GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks

Paper • 2409.13832 • Published Sep 20, 2024 • 1

Versatile Framework for Song Generation with Prompt-based Control

Paper • 2504.19062 • Published Apr 27, 2025 • 6

TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control

Paper • 2409.15977 • Published Sep 24, 2024 • 2