Alex Ngai
alexngai
AI & ML interests
None yet
Recent Activity
commented on a paper 10 days ago
Toward Training Superintelligent Software Agents through Self-Play SWE-RL
updated a collection 10 days ago
Self-Improving Agents
upvoted a paper 23 days ago
Towards a Science of Scaling Agent Systems
Organizations
RL Agents
Memory/Search/Retrieval/RAG
- Search-o1: Agentic Search-Enhanced Large Reasoning Models
  Paper • 2501.05366 • Published • 102
- Chain-of-Retrieval Augmented Generation
  Paper • 2501.14342 • Published • 58
- MemOS: A Memory OS for AI System
  Paper • 2507.03724 • Published • 157
- AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
  Paper • 2508.16153 • Published • 160
Automated Research
- ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
  Paper • 2404.07738 • Published • 2
- The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
  Paper • 2408.06292 • Published • 127
- Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback
  Paper • 2501.03916 • Published • 16
- TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
  Paper • 2412.14161 • Published • 51
General LLM
Code LLMs
Self-Improving Agents
- Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning
  Paper • 2410.22304 • Published • 18
- OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization
  Paper • 2410.19609 • Published • 18
- Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
  Paper • 2411.00412 • Published • 10
- Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning
  Paper • 2410.02052 • Published • 9
Codegen Benchmarks
Agent Eval
Autonomous Research
- Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
  Paper • 2409.04109 • Published • 48
- Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents
  Paper • 2410.13185 • Published • 5
- ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
  Paper • 2404.07738 • Published • 2
- DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?
  Paper • 2409.07703 • Published • 67
Self-Critique
Test-Time Compute/Optimal Scaling
- Scaling LLM Inference with Optimized Sample Compute Allocation
  Paper • 2410.22480 • Published
- Test-time Computing: from System-1 Thinking to System-2 Thinking
  Paper • 2501.02497 • Published • 45
- Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective
  Paper • 2412.14135 • Published
- Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
  Paper • 2501.04682 • Published • 99
Automated SWE
Multi-Agent
- Agents Thinking Fast and Slow: A Talker-Reasoner Architecture
  Paper • 2410.08328 • Published
- SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks
  Paper • 2305.17390 • Published • 3
- SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
  Paper • 2501.13200 • Published • 69
- Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems
  Paper • 2502.11098 • Published • 13
Automated ML
Latent Reasoning