MIDAS: Multimodal Interactive Digital-human Synthesis via Real-time Autoregressive Video Generation Paper • 2508.19320 • Published Aug 26, 2025 • 29
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Dec 31, 2025 • 557
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control +2 Feb 4, 2025 • 191
Physical AI Collection Collection of open, commercial-grade datasets for physical AI developers • 30 items • Updated 7 days ago • 126