X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding Paper • 2606.02482 • Published 6 days ago • 33
CogOmniControl: Reasoning-Driven Controllable Video Generation via Creative Intent Cognition Paper • 2605.19995 • Published 19 days ago • 34
From Web to Pixels: Bringing Agentic Search into Visual Perception Paper • 2605.12497 • Published 26 days ago • 14
VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning Paper • 2510.08555 • Published Oct 9, 2025 • 65
ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents Paper • 2507.22827 • Published Oct 20, 2025 • 100