Mid-Training with Self-Generated Data Improves Reinforcement Learning in Language Models Paper • 2605.08472 • Published 29 days ago • 5
FAMA: Failure-Aware Meta-Agentic Framework for Open-Source LLMs in Interactive Tool Use Environments Paper • 2604.25135 • Published Apr 28 • 12