OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper • 2604.11804 • Published 11 days ago • 70
Article Welcome Gemma 4: Frontier multimodal intelligence on device • 22 days ago • 876
Article Granite 4.0 3B Vision: Compact Multimodal Intelligence for Enterprise Documents • 23 days ago • 34
$OneMillion-Bench: How Far are Language Agents from Human Experts? Paper • 2603.07980 • Published Mar 9 • 27
ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use Paper • 2504.07981 • Published Apr 4, 2025 • 5
Article The First Healthcare Robotics Dataset and Foundational Physical AI Models for Healthcare Robotics • Mar 16 • 29
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published Mar 12 • 65
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 15 items • Updated 3 days ago • 274