Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought Paper • 2603.22847 • Published 4 days ago • 21
AudioVisual-Caption/ASID-Captioner-7B Image-Text-to-Text • 9B • Updated 16 days ago • 188 • 5
AudioVisual-Caption/ASID-Captioner-3B Image-Text-to-Text • 5B • Updated 16 days ago • 2.64k • 37
Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions Paper • 2602.13013 • Published Feb 13 • 54
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization Paper • 2510.08540 • Published Oct 9, 2025 • 110