Data-Efficient Autoregressive-to-Diffusion Language Models via On-Policy Distillation • Paper 2606.06712 • Published 6 days ago