We collaborated with Hugging Face to enable you to train MoE models 12× faster with 35% less VRAM via our new Triton kernels (no accuracy loss). 🤗 Train gpt-oss locally on 12.8GB VRAM with our free notebooks: https://unsloth.ai/docs/new/faster-moe
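For orientation, here is a minimal sketch of how such a low-VRAM fine-tuning run might start with Unsloth's Python API. The checkpoint name `unsloth/gpt-oss-20b` and the LoRA settings below are illustrative assumptions, not the exact recipe; the linked notebooks carry the supported configuration.

```python
# Minimal sketch, assuming the unsloth/gpt-oss-20b checkpoint name and
# typical Unsloth defaults; see the linked notebooks for the exact recipe.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gpt-oss-20b",  # assumed checkpoint name
    max_seq_length=2048,
    load_in_4bit=True,  # 4-bit weights keep VRAM within a small budget
)

# Train only small LoRA adapters instead of the full model.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)
```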
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model Paper • 2510.14528 • Published Oct 16, 2025 • 118
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models Paper • 2601.07372 • Published Jan 12, 2026 • 43
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders Paper • 2601.16208 • Published Jan 22, 2026 • 53
Efficient Memory Management for Large Language Model Serving with PagedAttention Paper • 2309.06180 • Published Sep 12, 2023 • 38