AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research
Abstract
AgentCPM-Report presents a lightweight local solution for deep research report generation using a Writing As Reasoning Policy framework and multi-stage agentic training to enhance small models' reasoning and outline evolution capabilities.
Generating deep research reports requires large-scale information acquisition and the synthesis of insight-driven analysis, posing a significant challenge for current language models. Most existing approaches follow a plan-then-write paradigm, whose performance heavily depends on the quality of the initial outline. However, constructing a comprehensive outline itself demands strong reasoning ability, causing current deep research systems to rely almost exclusively on closed-source or online large models. This reliance raises practical barriers to deployment and introduces safety and privacy concerns for user-authored data. In this work, we present AgentCPM-Report, a lightweight yet high-performing local solution composed of a framework that mirrors the human writing process and an 8B-parameter deep research agent. Our framework uses a Writing As Reasoning Policy (WARP), which enables models to dynamically revise outlines during report generation. Under this policy, the agent alternates between Evidence-Based Drafting and Reasoning-Driven Deepening, jointly supporting information acquisition, knowledge refinement, and iterative outline evolution. To effectively equip small models with this capability, we introduce a Multi-Stage Agentic Training strategy, consisting of cold-start, atomic skill RL, and holistic pipeline RL. Experiments on DeepResearch Bench, DeepConsult, and DeepResearch Gym demonstrate that AgentCPM-Report outperforms leading closed-source systems, with substantial gains in Insight.
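The abstract describes WARP only at a high level. As a rough, non-authoritative illustration of the control flow it implies, the Python sketch below interleaves Evidence-Based Drafting with Reasoning-Driven Deepening and lets the outline grow while the report is being written. The helper functions (`search`, `draft_section`, `deepen_outline`) are hypothetical stubs, not the authors' actual prompts, tools, or training setup.

```python
# Hypothetical sketch of a Writing-As-Reasoning Policy (WARP) style loop:
# draft a section from retrieved evidence, then let a reasoning step propose
# new sections, so the outline evolves instead of being fixed up front.
from dataclasses import dataclass, field


@dataclass
class Outline:
    sections: list[str]                                     # current section titles, in order
    drafts: dict[str, str] = field(default_factory=dict)    # section title -> drafted text


def search(query: str) -> list[str]:
    """Placeholder retrieval call (e.g., a web search or a local RAG index)."""
    return [f"evidence snippet for: {query}"]


def draft_section(title: str, evidence: list[str]) -> str:
    """Placeholder LLM call: Evidence-Based Drafting of one section."""
    return f"[{title}] synthesized from {len(evidence)} evidence snippets."


def deepen_outline(outline: Outline, title: str, draft: str) -> list[str]:
    """Placeholder LLM call: Reasoning-Driven Deepening.
    Inspects the fresh draft and proposes new subsections when gaps are found."""
    return []  # e.g., [f"{title}: open questions"]


def warp_report(topic: str, initial_sections: list[str], max_rounds: int = 40) -> str:
    outline = Outline(sections=list(initial_sections))
    queue = list(outline.sections)
    rounds = 0
    while queue and rounds < max_rounds:
        title = queue.pop(0)
        evidence = search(f"{topic}: {title}")                    # information acquisition
        outline.drafts[title] = draft_section(title, evidence)    # drafting
        for new_title in deepen_outline(outline, title, outline.drafts[title]):
            if new_title not in outline.sections:                 # outline evolution
                outline.sections.insert(outline.sections.index(title) + 1, new_title)
                queue.append(new_title)
        rounds += 1
    return "\n\n".join(outline.drafts[s] for s in outline.sections if s in outline.drafts)


if __name__ == "__main__":
    print(warp_report("open-ended deep research", ["Background", "Methods", "Findings"]))
```

In the actual system, the drafting and deepening steps would be agentic LLM calls trained with the paper's multi-stage pipeline (cold-start, atomic skill RL, holistic pipeline RL); the sketch only shows how outline revision can be interleaved with writing.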
Community
AgentCPM-Report is an open-source large language model agent jointly developed by THUNLP, RUCBM at Renmin University of China, and ModelBest. Built on the MiniCPM4.1 8B-parameter base model, it takes user instructions as input and autonomously generates long-form reports. Its highlights:
- Maximum efficiency from a small model: through an average of 40 rounds of deep retrieval and nearly 100 rounds of chain-of-thought reasoning, it mines and reorganizes information comprehensively, enabling an on-device model to produce logically rigorous, insight-rich reports on the scale of ten thousand words, and matching top closed-source systems on deep research tasks at only 8B parameters.
- Physically isolated, locally secure: designed for high-privacy scenarios, it supports fully offline local deployment, eliminating the risk of cloud-side data leakage. Built on our UltraRAG framework, it can efficiently mount and understand your local private knowledge base, turning confidential core data into high-value professional decision reports without the data ever leaving your environment (a minimal local-inference sketch follows the links below).
GitHub:https://github.com/OpenBMB/AgentCPM
Huggingface:https://huggingface.co/openbmb/AgentCPM-Report
ModelScope:https://modelscope.cn/models/OpenBMB/AgentCPM-Report
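For readers who want to try the released checkpoint locally, here is a minimal, hedged sketch using Hugging Face Transformers. It assumes the model behaves as a standard chat-style causal LM under the `openbmb/AgentCPM-Report` ID linked above; the full agent pipeline (multi-round retrieval, the reasoning loop, UltraRAG knowledge-base mounting) lives in the AgentCPM framework and is not reproduced by this snippet.

```python
# Minimal local-inference sketch for AgentCPM-Report with Hugging Face Transformers.
# Assumption: the checkpoint exposes a standard causal-LM chat template; the exact
# prompt format and whether trust_remote_code is needed may differ for this model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openbmb/AgentCPM-Report"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)

messages = [
    {"role": "user", "content": "Write a research report on on-device LLM agents."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

This only demonstrates raw generation; for the private-knowledge-base use case described above, the model would be served behind the project's UltraRAG retrieval framework rather than called directly.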
This is an automated message from the Librarian Bot. The following papers, recommended by the Semantic Scholar API, are similar to this one:
- Step-DeepResearch Technical Report (2025)
- FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents (2026)
- DeepSynth-Eval: Objectively Evaluating Information Consolidation in Deep Survey Writing (2026)
- Agentic Proposing: Enhancing Large Language Model Reasoning via Compositional Skill Synthesis (2026)
- IDRBench: Interactive Deep Research Benchmark (2026)
- IntentRL: Training Proactive User-intent Agents for Open-ended Deep Research via Reinforcement Learning (2026)
- Orchestrating Specialized Agents for Trustworthy Enterprise RAG (2026)
Models citing this paper: 2
Datasets citing this paper: 0
Spaces citing this paper: 0