Economy of Minds: Emerging Multi-Agent Intelligence with Economic Interactions • Paper • 2606.02859 • Published 6 days ago • 8
Benchmarks are Not Enough: RAMP for Runtime Assessing of Agentic Models in Production Systems • Paper • 2605.27492 • Published 12 days ago • 24
Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning • Paper • 2606.04923 • Published 4 days ago • 37
FashionChameleon: Towards Real-Time and Interactive Human-Garment Video Customization • Paper • 2605.15824 • Published 23 days ago • 64
MolmoAct2: Action Reasoning Models for Real-world Deployment • Paper • 2605.02881 • Published May 4 • 348
🔍 Interpretability & Analysis of LMs • Collection • Outstanding research in LM interpretability and evaluation, summarized • 136 items • Updated 11 days ago • 119