LatentSkill: From In-Context Textual Skills to In-Weight Latent Skills for LLM Agents Paper • 2606.06087 • Published 11 days ago • 62
SWE-Explore: Benchmarking How Coding Agents Explore Repositories Paper • 2606.07297 • Published 10 days ago • 110
SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories Paper • 2606.01311 • Published 15 days ago • 36
π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows Paper • 2605.14678 • Published 27 days ago • 105
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Paper • 2604.12627 • Published Apr 14 • 101
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published Apr 14 • 110
AgentSocialBench: Evaluating Privacy Risks in Human-Centered Agentic Social Networks Paper • 2604.01487 • Published Apr 1 • 10
SLEA-RL: Step-Level Experience Augmented Reinforcement Learning for Multi-Turn Agentic Training Paper • 2603.18079 • Published Mar 18 • 1