CPPO: Contrastive Perception for Vision Language Policy Optimization Paper • 2601.00501 • Published 7 days ago
From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model Paper • 2512.05277 • Published Dec 4, 2025
CASP: Compression of Large Multimodal Models Based on Attention Sparsity Paper • 2503.05936 • Published Mar 7, 2025
GOLD: Generalized Knowledge Distillation via Out-of-Distribution-Guided Language Data Generation Paper • 2403.19754 • Published Mar 28, 2024
Spatial Reasoning with Vision-Language Models in Ego-Centric Multi-View Scenes Paper • 2509.06266 • Published Sep 8, 2025