Cerebras REAP Collection Sparse MoE models compressed with the REAP (Router-weighted Expert Activation Pruning) method • 19 items • Updated 7 days ago • 65
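The card names the pruning criterion but not the procedure, so the following is a minimal sketch under assumed definitions: each expert is scored by its router weight times its activation magnitude, averaged over calibration tokens, and the lowest-scoring experts are dropped. The function names, shapes, and top-k selection are illustrative, not Cerebras's implementation.

```python
# Hypothetical sketch of router-weighted expert pruning (not the official REAP code).
import torch

def expert_saliency(router_logits: torch.Tensor, expert_outputs: torch.Tensor) -> torch.Tensor:
    """Score each expert by router weight times activation magnitude.

    router_logits:  (tokens, num_experts) raw router scores
    expert_outputs: (tokens, num_experts, hidden) per-expert outputs for the same tokens
    """
    gates = torch.softmax(router_logits, dim=-1)   # router weight per token/expert
    act_norm = expert_outputs.norm(dim=-1)         # activation magnitude per token/expert
    return (gates * act_norm).mean(dim=0)          # average saliency per expert

def prune_experts(saliency: torch.Tensor, keep: int) -> torch.Tensor:
    """Indices of the `keep` highest-saliency experts; the rest would be removed."""
    return torch.topk(saliency, k=keep).indices

# Example: 1024 calibration tokens, 8 experts, hidden size 64 -> keep 6 experts.
tokens, n_experts, hidden = 1024, 8, 64
logits = torch.randn(tokens, n_experts)
outputs = torch.randn(tokens, n_experts, hidden)
kept = prune_experts(expert_saliency(logits, outputs), keep=6)
```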
VTP Collection Towards Scalable Pre-training of Visual Tokenizers for Generation • 4 items • Updated 10 days ago • 39
Teacher Logits Collection Logits captured from large models to act as the teacher for distillation • 3 items • Updated 11 days ago • 7
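For context, the standard way captured teacher logits are used is a KL loss between temperature-softened teacher and student distributions. The sketch below assumes that common recipe; the temperature, weighting, and names are illustrative choices, not details taken from this collection.

```python
# Minimal sketch of logit distillation: the student matches the teacher's
# softened distribution via KL divergence.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    """KL(teacher || student) over temperature-softened logits."""
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    # Scale by t^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)

# Example with random logits for a batch of 4 over a 32k vocabulary.
student = torch.randn(4, 32000, requires_grad=True)
teacher = torch.randn(4, 32000)
loss = distillation_loss(student, teacher)
loss.backward()
```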
Ministral 3 Collection Mistral Ministral 3: new multimodal models in Base, Instruct, and Reasoning variants, available in 3B, 8B, and 14B sizes. • 36 items • Updated 2 days ago • 25
Ministral 3 Collection A collection of edge models with Base, Instruct, and Reasoning variants in three sizes (3B, 8B, and 14B), all with vision capabilities. • 9 items • Updated 24 days ago • 133
Trinity Collection Arcee AI models in the Trinity family • 8 items • Updated 15 days ago • 21
Olmo 3 Pre-training Collection All artifacts related to Olmo 3 pre-training • 10 items • Updated 3 days ago • 31
BERT Hash Nano Models Collection Set of BERT models with a modified embedding layer • 4 items • Updated 4 days ago • 9
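The card only says the embedding layer is modified; one plausible reading of "hash" is a hashing-trick embedding, where token ids are hashed into a small shared table and the looked-up rows are summed. This is a hedged sketch of that generic technique, not the actual BERT Hash Nano design.

```python
# Hedged sketch of a hash-based embedding layer: token ids are mapped through
# several cheap hash functions into a small shared table, shrinking the
# embedding matrix well below vocab size.
import torch
import torch.nn as nn

class HashEmbedding(nn.Module):
    def __init__(self, num_buckets: int, dim: int, num_hashes: int = 2):
        super().__init__()
        self.table = nn.Embedding(num_buckets, dim)   # shared, much smaller than vocab
        self.num_buckets = num_buckets
        # Fixed random multipliers acting as cheap hash functions.
        self.register_buffer("mults", torch.randint(1, 2**31 - 1, (num_hashes,)))

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # (batch, seq) -> (batch, seq, num_hashes) bucket indices
        buckets = (token_ids.unsqueeze(-1) * self.mults) % self.num_buckets
        return self.table(buckets).sum(dim=-2)        # sum over hash functions

emb = HashEmbedding(num_buckets=4096, dim=128)
ids = torch.randint(0, 30522, (2, 16))               # e.g. BERT-vocab-sized ids
vecs = emb(ids)                                      # (2, 16, 128)
```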
TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments Paper • 2510.01179 • Published Oct 1 • 25
💧 LFM2 Collection LFM2 is a new generation of hybrid models, designed for on-device deployment. • 26 items • Updated about 5 hours ago • 128
Tfree-HAT-7b-pretrained Collection Tokenizer-free models based on the Hierarchical Autoregressive Transformer (https://arxiv.org/abs/2501.10322), trained from scratch. • 2 items • Updated Aug 1 • 10
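The linked paper defines the actual architecture; as a rough intuition for "tokenizer-free", the sketch below embeds raw UTF-8 bytes and pools fixed-size byte groups into coarser vectors that a backbone would consume. The fixed grouping window is an assumption for illustration; HAT's byte-to-word hierarchy is more involved.

```python
# Rough sketch of the tokenizer-free idea: operate on raw UTF-8 bytes and pool
# them into coarser units before the main transformer. Purely illustrative;
# see arXiv:2501.10322 for the actual HAT architecture.
import torch
import torch.nn as nn

class BytePooler(nn.Module):
    """Embed raw bytes and mean-pool fixed-size groups into coarser vectors."""
    def __init__(self, dim: int = 256, group: int = 4):
        super().__init__()
        self.embed = nn.Embedding(256, dim)   # one entry per possible byte value
        self.group = group

    def forward(self, text: str) -> torch.Tensor:
        data = text.encode("utf-8")
        pad = (-len(data)) % self.group                       # pad to a full group
        ids = torch.tensor(list(data) + [0] * pad)
        vecs = self.embed(ids).view(-1, self.group, self.embed.embedding_dim)
        return vecs.mean(dim=1)                               # (num_groups, dim)

pooled = BytePooler()("tokenizer-free input")   # coarse latents fed to the backbone
```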