Demystifying When Pruning Works via Representation Hierarchies Paper • 2603.24652 • Published 6 days ago • 15
ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model Paper • 2603.22281 • Published 20 days ago • 17
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated Mar 2 • 711
Making Large Language Models Efficient Dense Retrievers Paper • 2512.20612 • Published Dec 23, 2025 • 2
Understanding and Harnessing Sparsity in Unified Multimodal Models Paper • 2512.02351 • Published Dec 2, 2025 • 2
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation Paper • 2511.09611 • Published Nov 12, 2025 • 71
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought Paper • 2511.02779 • Published Nov 4, 2025 • 60
Dense Video Understanding with Gated Residual Tokenization Paper • 2509.14199 • Published Sep 17, 2025 • 3