jina-embeddings-v5-omni: Text-Geometry-Preserving Multimodal Embeddings via Frozen-Tower Composition Paper • 2605.08384 • Published May 8 • 11
MolmoAct2: Action Reasoning Models for Real-world Deployment Paper • 2605.02881 • Published May 4 • 350
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation Paper • 2604.28196 • Published Apr 30 • 73
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published Apr 29 • 108
MultiWorld: Scalable Multi-Agent Multi-View Video World Models Paper • 2604.18564 • Published Apr 20 • 46
Waypoint-1.5: Higher-Fidelity Interactive Worlds for Everyday GPUs Article • lapp0, LouisCastricato, ScottieFox, shahbuland, xAesthetics +3 • Apr 9 • 30
Gaia2: Benchmarking LLM Agents on Dynamic and Asynchronous Environments Paper • 2602.11964 • Published Feb 12 • 13
Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders Paper • 2412.09586 • Published Dec 12, 2024 • 6
ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation Paper • 2510.04290 • Published Oct 5, 2025 • 21
story writing favourites Collection Models I personally liked for generating stories in the past. Not a recommendation, most of these are outdated. • 19 items • Updated 4 days ago • 112