BRDFusion: Physics Meets Generation for Urban Scene Inverse Rendering Paper • 2606.17049 • Published 4 days ago • 27
LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV Paper • 2605.26244 • Published 25 days ago • 38
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding Paper • 2605.27365 • Published 24 days ago • 143
SkillOpt: Executive Strategy for Self-Evolving Agent Skills Paper • 2605.23904 • Published 28 days ago • 238
MemLens: Benchmarking Multimodal Long-Term Memory in Large Vision-Language Models Paper • 2605.14906 • Published May 14 • 78
ARIS: Autonomous Research via Adversarial Multi-Agent Collaboration Paper • 2605.03042 • Published May 4 • 135