Toward Generalist Autonomous Research via Hypothesis-Tree Refinement
Paper • 2606.11926 • Published • 117
Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation
AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents