Penghui Qi
QPHutu
AI & ML interests
None yet
Organizations
LLM Agent
DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research
Paper • 2505.19253 • Published • 34
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents
Paper • 2505.20411 • Published • 96
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers
Paper • 2505.21497 • Published • 109
Agentic Reinforced Policy Optimization
Paper • 2507.19849 • Published • 161
LLM Self-Play
models 0
None public yet
datasets 0
None public yet