Logics-MLLM/Logics-STEM-SFT-Dataset-Open-1.6M Viewer • Updated 8 days ago • 1.07M • 2.63k • 15
WebOrganizer/TopicClassifier-NoURL Text Classification • 0.1B • Updated Feb 19, 2025 • 52.3k • 14
SmolLM3 pretraining datasets Collection datasets used in SmolLM3 pretraining • 15 items • Updated Aug 12, 2025 • 45
The Smol Training Playbook 📚 The secrets to building world-class LLMs • 2.93k
Standard-format-preference-dataset Collection We collect the open-source datasets and process them into the standard format. • 14 items • Updated May 8, 2024 • 26