VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
Paper: arXiv:2501.13106
Auden is an open research initiative for audio and multimodal understanding. We publish reproducible code, curated datasets, model checkpoints, and interactive demos to enable transparent evaluation and strong, reusable baselines.