Seeing Culture: A Benchmark for Visual Reasoning and Grounding Paper • 2509.16517 • Published Sep 20, 2025 • 3
Can Large Language Models Understand, Reason About, and Generate Code-Switched Text? Paper • 2601.07153 • Published Jan 12
M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG Paper • 2512.05959 • Published Dec 5, 2025
LinguDistill: Recovering Linguistic Ability in Vision-Language Models via Selective Cross-Modal Distillation Paper • 2604.00829 • Published 5 days ago • 5
Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability Paper • 2506.01789 • Published Jun 2, 2025 • 15
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines Paper • 2410.12705 • Published Oct 16, 2024 • 32
Towards Efficient and Robust VQA-NLE Data Generation with Large Vision-Language Models Paper • 2409.14785 • Published Sep 23, 2024