SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation Paper • 2507.09862 • Published Jul 14, 2025
Group Think: Multiple Concurrent Reasoning Agents Collaborating at Token Level Granularity Paper • 2505.11107 • Published May 16, 2025
CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation Paper • 2501.16609 • Published Jan 28, 2025
Fixing Imbalanced Attention to Mitigate In-Context Hallucination of Large Vision-Language Model Paper • 2501.12206 • Published Jan 21, 2025