ODA-Scored Collection ODA-Scored-Data by implemented multiple data scores. • 2 items • Updated 2 days ago • 3
Unlocking Data Value in Finance: A Study on Distillation and Difficulty-Aware Training Paper • 2603.07223 • Published 17 days ago • 13
MMFineReason Collection High-quality STEM reasoning dataset for Multimodal LLM post-training. • 8 items • Updated 22 days ago • 22
MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods Paper • 2601.21821 • Published Jan 29 • 61
Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility Paper • 2601.17027 • Published Jan 17 • 42
ChartVerse: Scaling Chart Reasoning via Reliable Programmatic Synthesis from Scratch Paper • 2601.13606 • Published Jan 20 • 11
ODA-Mixture Collection High-quality mixture datasets for post-training covering multiple domains. • 7 items • Updated Jan 17 • 5
ODA-Math Collection High-quality mathematical datasets for post training. • 5 items • Updated Jan 17 • 1
Closing the Data Loop: Using OpenDataArena to Engineer Superior Training Datasets Paper • 2601.09733 • Published Dec 30, 2025 • 9
Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience Paper • 2512.17260 • Published Dec 19, 2025 • 52
OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value Paper • 2512.14051 • Published Dec 16, 2025 • 47
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights Paper • 2512.01816 • Published Dec 1, 2025 • 94
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning Paper • 2510.04081 • Published Oct 5, 2025 • 23
ScaleDiff: Scaling Difficult Problems for Advanced Mathematical Reasoning Paper • 2509.21070 • Published Sep 25, 2025 • 9
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing Paper • 2509.22186 • Published Sep 26, 2025 • 153
Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning Paper • 2507.17512 • Published Jul 23, 2025 • 37
REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at Once Paper • 2507.10541 • Published Jul 14, 2025 • 30