Cerebras REAP Collection: Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 19 items • Updated 7 days ago • 66
VTP Collection: Towards Scalable Pre-training of Visual Tokenizers for Generation • 4 items • Updated 11 days ago • 39
Teacher Logits Collection: Logits captured from large models to act as the teacher for distillation • 3 items • Updated 11 days ago • 7