
Collections


Collections including paper arxiv:2605.13527

Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution

Paper • 2605.15301 • Published 7 days ago • 20
MMSkills: Towards Multimodal Skills for General Visual Agents

Paper • 2605.13527 • Published 7 days ago • 114
Agentic Discovery of Neural Architectures: AIRA-Compose and AIRA-Design

Paper • 2605.15871 • Published 6 days ago • 13
Look Before You Leap: Autonomous Exploration for LLM Agents

Paper • 2605.16143 • Published 6 days ago • 7

MMSkills: Towards Multimodal Skills for General Visual Agents

Paper • 2605.13527 • Published 7 days ago • 114

about 3 hours ago

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

Paper • 2603.25746 • Published Mar 26 • 155
TAPS: Task Aware Proposal Distributions for Speculative Sampling

Paper • 2603.27027 • Published Mar 27 • 144
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

Paper • 2603.25716 • Published Mar 26 • 156
LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published Mar 29 • 147

about 23 hours ago

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published 7 days ago • 104
MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory

Paper • 2605.15128 • Published 7 days ago • 60
Orchard: An Open-Source Agentic Modeling Framework

Paper • 2605.15040 • Published 7 days ago • 18
MMSkills: Towards Multimodal Skills for General Visual Agents

Paper • 2605.13527 • Published 7 days ago • 114

ComputerUseAgent

Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification

Paper • 2603.26648 • Published Mar 27 • 43
OpenGame: Open Agentic Coding for Games

Paper • 2604.18394 • Published Apr 20 • 81
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents

Paper • 2603.24440 • Published Mar 25 • 98
InteractWeb-Bench: Can Multimodal Agent Escape Blind Execution in Interactive Website Generation?

Paper • 2604.27419 • Published 21 days ago • 13

about 16 hours ago

PersonalAlign: Hierarchical Implicit Intent Alignment for Personalized GUI Agent with Long-Term User-Centric Records

Paper • 2601.09636 • Published Jan 14 • 8
LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark

Paper • 2504.13805 • Published Apr 18, 2025 • 11
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents

Paper • 2604.11784 • Published Apr 13 • 143
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience

Paper • 2603.24533 • Published Mar 25 • 47

