Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding? Paper • 2606.08063 • Published 6 days ago • 20
Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning Paper • 2606.10968 • Published 3 days ago • 41
Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models Paper • 2505.18536 • Published May 24, 2025 • 18