Papers
arxiv:2604.19859

DR-Venus: Towards Frontier Edge-Scale Deep Research Agents with Only 10K Open Data

Published on Apr 21 · Submitted by Sunhao Dai on Apr 23
#3 Paper of the day
Abstract

DR-Venus-4B is a 4-billion-parameter deep research agent trained entirely on open data. It combines agentic supervised fine-tuning with reinforcement learning using turn-level rewards, achieving superior performance on research benchmarks while retaining edge-scale deployment advantages.

AI-generated summary

Edge-scale deep research agents based on small language models are attractive for real-world deployment due to their advantages in cost, latency, and privacy. In this work, we study how to train a strong small deep research agent with limited open data by improving both data quality and data utilization. We present DR-Venus, a frontier 4B deep research agent for edge-scale deployment, built entirely on open data. Our training recipe consists of two stages. In the first stage, we use agentic supervised fine-tuning (SFT) to establish basic agentic capability, combining strict data cleaning with resampling of long-horizon trajectories to improve data quality and utilization. In the second stage, we apply agentic reinforcement learning (RL) to further improve execution reliability on long-horizon deep research tasks. To make RL effective for small agents in this setting, we build on IGPO and design turn-level rewards based on information gain and format-aware regularization, thereby increasing supervision density and sharpening turn-level credit assignment. Trained entirely on roughly 10K open-data examples, DR-Venus-4B significantly outperforms prior agentic models under 9B parameters on multiple deep research benchmarks, while also narrowing the gap to much larger 30B-class systems. Our further analysis shows that 4B agents already possess surprisingly strong performance potential, highlighting both the deployment promise of small models and the value of test-time scaling in this setting. We release our models, code, and key recipes to support reproducible research on edge-scale deep research agents.

Community

Paper submitter

Key insights:

  1. We explore how to build strong edge-scale deep research agents with small language models in limited open-data settings, focusing on both data quality and data utilization.

  2. We introduce DR-Venus, a 4B deep research agent trained entirely on roughly 10K open-data examples. The training recipe combines agentic supervised fine-tuning with strict data cleaning and long-horizon trajectory resampling, followed by agentic reinforcement learning to improve reliability on complex research tasks.

  3. To make RL more effective for small agents, we design turn-level rewards based on information gain and format-aware regularization, improving supervision density and credit assignment across multi-step agent execution.

  4. Experiments show that DR-Venus-4B-RL establishes a new frontier among small deep research agents and consistently outperforms prior agentic systems at similar scales. Despite its compact 4B size, DR-Venus substantially narrows the gap to much larger 30B-class agents. Pass@K analysis further reveals that the capability ceiling of small deep research agents is surprisingly high, suggesting that test-time scaling can be an especially effective way to unlock the potential of edge-scale reasoning models.
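The turn-level reward in point 3 can be made concrete with a minimal sketch. This is illustrative only: the paper's actual IGPO-based estimator and format rules are not reproduced here, so the entropy-difference information-gain term, the `<tool_call>` format convention, and the weights `alpha`/`beta` are all assumptions for illustration.

```python
import re

def information_gain(prev_entropy: float, new_entropy: float) -> float:
    """Reward the reduction in uncertainty after a turn.
    (Assumed proxy: simple entropy difference, not the paper's estimator.)"""
    return prev_entropy - new_entropy

def format_penalty(turn_text: str) -> float:
    """Penalize turns that break an expected tool-call format.
    (Hypothetical convention: one <tool_call>...</tool_call> block per turn.)"""
    ok = re.search(r"<tool_call>.*</tool_call>", turn_text, re.S)
    return 0.0 if ok else -0.5

def turn_reward(prev_entropy: float, new_entropy: float, turn_text: str,
                alpha: float = 1.0, beta: float = 1.0) -> float:
    """Dense per-turn reward: weighted information gain plus a
    format-aware regularization term, giving credit at every turn
    rather than only at the final answer."""
    return (alpha * information_gain(prev_entropy, new_entropy)
            + beta * format_penalty(turn_text))
```

The point of the design is supervision density: a sparse outcome reward gives one signal per trajectory, while a per-turn reward like this assigns credit to each step of a long-horizon rollout.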
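The Pass@K analysis in point 4 presumably uses the standard unbiased pass@k estimator (probability that at least one of k samples, drawn from n attempts of which c are correct, succeeds); a minimal sketch of that standard formula, not code from the paper's repository:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k).
    n: total samples drawn, c: number of correct samples, k: budget."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so any k-subset
        # must contain at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A high pass@k at large k relative to pass@1 is what supports the claim that small agents have a high capability ceiling reachable via test-time scaling.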

GitHub: https://github.com/inclusionAI/DR-Venus
SFT code: https://github.com/inclusionAI/DR-Venus/tree/master/SFT
RL code: https://github.com/inclusionAI/DR-Venus/tree/master/RL
Inference code: https://github.com/inclusionAI/DR-Venus/tree/master/Inference
SFT model: https://huggingface.co/inclusionAI/DR-Venus-4B-SFT
RL model: https://huggingface.co/inclusionAI/DR-Venus-4B-RL
Collection: https://huggingface.co/collections/inclusionAI/dr-venus




Get this paper in your agent:

hf papers read 2604.19859
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

