MineExplorer: Evaluating Open-World Exploration of MLLM Agents in Minecraft Paper • 2605.30931 • Published 8 days ago • 11
sentence-transformers/all-mpnet-base-v2 Sentence Similarity • 0.1B • Updated Aug 19, 2025 • 35.3M • • 1.3k
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation Paper • 2605.25874 • Published 12 days ago • 102
IndusAgent: Reinforcing Open-Vocabulary Industrial Anomaly Detection with Agentic Tools Paper • 2605.20682 • Published 17 days ago • 83
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 25 days ago • 195
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 24 days ago • 270
Efficient Pre-Training with Token Superposition Paper • 2605.06546 • Published about 1 month ago • 46
Mean Mode Screaming: Mean--Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published about 1 month ago • 233
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published about 1 month ago • 112