DPO
RLLab/allenai-Dolci-Instruct-DPO-Length-Filtered • Viewer • Updated Mar 1 • 146k • 17
RLLab/olmo-3-7b-it-sft • Text Generation • 7B • Updated Dec 18, 2025 • 422
allenai/Dolci-Instruct-SFT-No-Tools • Viewer • Updated Jan 5 • 1.92M • 179 • 4
RLLab/gemma-3-4b-text-sft • Text Generation • 4B • Updated Feb 28 • 74
RL-Dataset
open-r1/DAPO-Math-17k-Processed • Viewer • Updated Nov 10, 2025 • 34.8k • 4.99k • 62
DigitalLearningGmbH/MATH-lighteval • Viewer • Updated Jan 15, 2025 • 25k • 31.5k • 65
POLARIS-Project/Polaris-Dataset-53K • Viewer • Updated Jun 18, 2025 • 53.3k • 934 • 35
RLLab/math-rl • Viewer • Updated Nov 25, 2025 • 57.5k • 8