Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty Paper • 2603.15500 • Published 5 days ago • 11
Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization Paper • 2602.23008 • Published 23 days ago • 35