Article • How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day • Dec 8, 2025 • 52
Article • Transformers v5: Simple model definitions powering the AI ecosystem • Dec 1, 2025 • 293
Black-Box On-Policy Distillation of Large Language Models Paper • 2511.10643 • Published Nov 13, 2025 • 51
Article • The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix • Nov 3, 2025 • 57
The Smol Training Playbook 📚 • The secrets to building world-class LLMs • 2.97k