Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published 11 days ago • 179
How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing Paper • 2602.01851 • Published 20 days ago • 16
MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy Paper • 2508.05592 • Published Aug 7, 2025 • 6
MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy Paper • 2508.05592 • Published Aug 7, 2025 • 6