Yifan Zeng

yokey

https://xhmy.github.io/

AI & ML interests

Large Language Model, Agentic AI, Deep Learning

Recent Activity

authored a paper 14 days ago

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs

Organizations

None yet

upvoted a paper 13 days ago

Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs

Paper • 2605.30611 • Published 19 days ago • 193

upvoted 2 papers 14 days ago

Speculative Pipeline Decoding: Higher-Accuracy and Zero-Bubble Speculation via Pipeline Parallelism

Paper • 2605.30852 • Published 18 days ago • 10

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs

Paper • 2605.24202 • Published 25 days ago • 17

upvoted a paper 28 days ago

MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning

Paper • 2605.14212 • Published May 14 • 18

upvoted a paper about 1 month ago

EVOCHAMBER: Test-Time Co-evolution of Multi-Agent System at Individual, Team, and Population Scales

Paper • 2605.11136 • Published May 11 • 11

upvoted a paper 3 months ago

EVA: Efficient Reinforcement Learning for End-to-End Video Agent

Paper • 2603.22918 • Published Mar 24 • 44

upvoted a paper 7 months ago

General Agentic Memory Via Deep Research

Paper • 2511.18423 • Published Nov 23, 2025 • 171

upvoted a paper 9 months ago

Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

Paper • 2509.25541 • Published Sep 29, 2025 • 142

upvoted 4 papers 10 months ago

upvoted 5 papers about 1 year ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 265

WHEN TO ACT, WHEN TO WAIT: Modeling Structural Trajectories for Intent Triggerability in Task-Oriented Dialogue

Paper • 2506.01881 • Published Jun 2, 2025 • 6

Table-R1: Inference-Time Scaling for Table Reasoning

Paper • 2505.23621 • Published May 29, 2025 • 93

VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

Paper • 2504.05118 • Published Apr 7, 2025 • 26

R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model

Paper • 2503.05132 • Published Mar 7, 2025 • 57

upvoted 2 papers over 1 year ago

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12, 2025 • 77

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25, 2025 • 75

upvoted an article over 1 year ago

Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset

Article • sdiazlor • Feb 10, 2025 • 60