ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model
Paper • 2603.22281 • Published • 16
Structural Graph Probing of Vision-Language Models
Ref-Adv: Exploring MLLM Visual Reasoning in Referring Expression Tasks