Article Waypoint-1.5: Higher-Fidelity Interactive Worlds for Everyday GPUs +3 10 days ago • 28
LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 21 days ago • 143
Article How I contributed a new model to the Transformers library using Codex 19 days ago • 46
Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models Paper • 2603.18002 • Published Mar 18 • 13
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning Paper • 2602.12099 • Published Feb 12 • 61
Article Nemotron ColEmbed V2: Raising the Bar for Multimodal Retrieval with ViDoRe V3’s Top Model Feb 4 • 28
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation Paper • 2601.22153 • Published Jan 29 • 74