When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning Paper • 2602.08236 • Published Feb 9 • 9
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning Paper • 2602.08234 • Published Feb 9 • 72
Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning Paper • 2602.10090 • Published Feb 10 • 51
GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL Paper • 2602.22190 • Published 26 days ago • 16
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios Paper • 2602.23166 • Published 25 days ago • 44
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 11 days ago • 63
SimpleOCR: Rendering Visualized Questions to Teach MLLMs to Read Paper • 2602.22426 • Published 25 days ago
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published 5 days ago • 121
Reliable and Responsible Foundation Models: A Comprehensive Survey Paper • 2602.08145 • Published Feb 4 • 8
MedVerse: Efficient and Reliable Medical Reasoning via DAG-Structured Parallel Execution Paper • 2602.07529 • Published Feb 7
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought Paper • 2511.02779 • Published Nov 4, 2025 • 60
DSGym: A Holistic Framework for Evaluating and Training Data Science Agents Paper • 2601.16344 • Published Jan 22 • 12