MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16, 2025 • 277
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published 25 days ago • 108
To read... eventually Collection A collection of papers that I have read or plan to read, all in one place. Includes a wide range of topics. • 169 items • Updated Jun 30, 2025 • 6
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 103
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024 • 261
Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models Paper • 2605.11887 • Published 12 days ago • 9
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published Mar 26, 2025 • 60
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate Paper • 2504.19874 • Published Apr 28, 2025 • 34
WithIn US AI (((GGUF MODELS))) Collection LLM MODELS TRAINED, FINE-TUNED, MERGED BY (WITHIN US AI) • 21 items • Updated about 1 hour ago • 5
MiniCPM-o & MiniCPM-V Collection Multimodal models with leading performance. • 32 items • Updated 11 days ago • 83
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models Paper • 2403.18814 • Published Mar 27, 2024 • 48