DyCo-RL: Dynamic Cross-Modal Coordination for Visual Reasoning • Paper • 2606.08035 • Published 7 days ago • 15
TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation • Paper • 2603.19039 • Published Mar 19 • 51
When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding • Paper • 2506.05551 • Published Jun 5, 2025 • 5