MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head • Paper • 2601.07832 • Published 21 days ago • 51
DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs • Paper • 2601.03559 • Published 27 days ago • 13
nvidia/nemotron-speech-streaming-en-0.6b • Automatic Speech Recognition • Updated 4 days ago • 10.5k • 458
💧 LFM2 • Collection • LFM2 is a new generation of hybrid models, designed for on-device deployment. • 27 items • Updated about 16 hours ago • 137