Unlocking On-Policy Distillation for Any Model Family (Running, 81): Improve model performance by transferring knowledge between different model families
The Smol Training Playbook (Runtime error, Featured, 2.95k): The secrets to building world-class LLMs