Qwen/Qwen3.5-397B-A17B Image-Text-to-Text • 403B • Updated about 20 hours ago • 390k • 965
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 Text Generation • 32B • Updated 3 days ago • 853k • 643
The Smol Training Playbook 📚 The secrets to building world-class LLMs • Featured • Running on CPU Upgrade • 3.01k
Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 28 items • Updated 6 days ago • 119