LARY: A Latent Action Representation Yielding Benchmark for Generalizable Vision-to-Action Alignment Paper • 2604.11689 • Published 9 days ago • 19
LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 23 days ago • 144
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development Paper • 2601.11077 • Published Jan 16 • 67
Scaling Embeddings Outperforms Scaling Experts in Language Models Paper • 2601.21204 • Published Jan 29 • 102
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience Paper • 2601.15876 • Published Jan 22 • 92