LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling Paper • 2606.18023 • Published 2 days ago • 118
FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents Paper • 2606.12087 • Published 8 days ago • 73
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis Paper • 2505.16834 • Published May 22, 2025
From Trial-and-Error to Improvement: A Systematic Analysis of LLM Exploration Mechanisms in RLVR Paper • 2508.07534 • Published Aug 11, 2025 • 2
ClawGym: A Scalable Framework for Building Effective Claw Agents Paper • 2604.26904 • Published Apr 29 • 52