OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs Paper • 2510.10689 • Published Oct 12 • 46
FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction Paper • 2508.11987 • Published Aug 16 • 71
Efficient Agents: Building Effective Agents While Reducing Cost Paper • 2508.02694 • Published Jul 24 • 85
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published Aug 4 • 132
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models Paper • 2310.00746 • Published Oct 1, 2023 • 1