iVGR: Internalizing Visually Grounded Reasoning for MLLMs with Reinforcement Learning Paper • 2605.31096 • Published 20 days ago • 7
Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer Paper • 2605.30940 • Published 20 days ago • 37
PRISM: A Multi-Dimensional Benchmark for Evaluating LLM Peer Reviewers Paper • 2605.26730 • Published 22 days ago • 16
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 22 days ago • 423
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards Paper • 2605.21467 • Published 29 days ago • 204
OmniHumanoid: Streaming Cross-Embodiment Video Generation with Paired-Free Adaptation Paper • 2605.12038 • Published May 12 • 4
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 327
ArtHOI: Taming Foundation Models for Monocular 4D Reconstruction of Hand-Articulated-Object Interactions Paper • 2603.25791 • Published Mar 26 • 7
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published Mar 30 • 343