SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published 18 days ago • 191
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 27 days ago • 166
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration Paper • 2605.20025 • Published 11 days ago • 185
MMSkills: Towards Multimodal Skills for General Visual Agents Paper • 2605.13527 • Published 16 days ago • 118
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published 23 days ago • 111
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation Paper • 2605.18739 • Published 12 days ago • 111
Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos Paper • 2605.18233 • Published 12 days ago • 91
UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors Paper • 2605.00658 • Published 29 days ago • 84
Lance: Unified Multimodal Modeling by Multi-Task Synergy Paper • 2605.18678 • Published 12 days ago • 76
PiD: Fast and High-Resolution Latent Decoding with Pixel Diffusion Paper • 2605.23902 • Published 8 days ago • 41
CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era Paper • 2503.12329 • Published Mar 16, 2025 • 28