view article Article NEO-unify: Building Native Multimodal Unified Models End to End 2 days ago • 65
Scaling Spatial Intelligence with Multimodal Foundation Models Paper • 2511.13719 • Published Nov 17, 2025 • 47
ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning Paper • 2510.27492 • Published Oct 30, 2025 • 86
The Quest for Generalizable Motion Generation: Data, Model, and Evaluation Paper • 2510.26794 • Published Oct 30, 2025 • 27
FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published Dec 17, 2024 • 75
Aligning Multimodal LLM with Human Preference: A Survey Paper • 2503.14504 • Published Mar 18, 2025 • 26
WebSailor: Navigating Super-human Reasoning for Web Agent Paper • 2507.02592 • Published Jul 3, 2025 • 124
MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding Paper • 2507.12463 • Published Jul 16, 2025 • 27
AnyI2V: Animating Any Conditional Image with Motion Control Paper • 2507.02857 • Published Jul 3, 2025 • 13