π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows • Paper • 2605.14678 • Published 9 days ago • 102
PEFT Method Comparison • Explore PEFT method trade-offs with interactive Pareto plots