Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models Paper • 2606.11167 • Published 14 days ago • 5
VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models Paper • 2606.16140 • Published 9 days ago • 110
DFlash Collection Block Diffusion for Flash Speculative Decoding • 22 items • Updated 9 days ago • 132
view article Article **ColBERT-Zero: To Pre-train Or Not To Pre-train ColBERT models?** lightonai • Feb 19 • 22
view article Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 909
Quantized Qwen3.5 Collection Verified models. Compatible with Transformers v5.3 and vLLM v0.16.1rc1 (nightly). Under evaluation. • 9 items • Updated Mar 12 • 9
NextCoder Collection NextCoder family of code-editing LMs developed with Selective Knowledge Transfer and its training data. • 6 items • Updated Jul 9, 2025 • 79
view article Article SmolLM3: smol, multilingual, long-context reasoner +21 eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 780
Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks Paper • 2505.16901 • Published May 22, 2025 • 48
MiniMax-M1 Collection MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated Apr 15 • 119
view article Article Learn the Hugging Face Kernel Hub in 5 Minutes +5 drbh, danieldk, Narsil, pcuenq, pagezyhf, merve, reach-vb • Jun 12, 2025 • 164
view article Article You could have designed state of the art positional encoding FL33TW00D-HF • Nov 25, 2024 • 487
view article Article Vision Language Models (Better, faster, stronger) +3 merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 614