Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models Paper • 2606.03988 • Published 11 days ago • 117
Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs Paper • 2605.30611 • Published 17 days ago • 192
CM-EVS: Sparse Panoramic RGB-D-Pose Data for Complete Scene Coverage Paper • 2605.15597 • Published about 1 month ago • 11
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining Paper • 2605.14747 • Published May 14 • 145
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published May 12 • 195
GridProbe: Posterior-Probing for Adaptive Test-Time Compute in Long-Video VLMs Paper • 2605.10762 • Published May 11 • 3
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 506
Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning Paper • 2603.04597 • Published Mar 4 • 211
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 266
Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs Paper • 2602.10388 • Published Feb 11 • 245
TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents Paper • 2602.07274 • Published Feb 6 • 210