SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data Article • Published Jun 3 • 289
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published Nov 2025 • 93
Don't Blind Your VLA: Aligning Visual Representations for OOD Generalization Paper • 2510.25616 • Published Oct 29 • 96
π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models Paper • 2510.25889 • Published Oct 29 • 64
Exploring Conditions for Diffusion models in Robotic Control Paper • 2510.15510 • Published Oct 17 • 39
ReCode: Unify Plan and Action for Universal Granularity Control Paper • 2510.23564 • Published Oct 27 • 120
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published Oct 27 • 174