DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset Paper • 2601.10305 • Published 6 days ago • 35
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning Paper • 2601.09088 • Published 7 days ago • 54
4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation Paper • 2512.17012 • Published Dec 18, 2025 • 44
AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts Paper • 2402.07625 • Published Feb 12, 2024 • 16
Towards Scalable Pre-training of Visual Tokenizers for Generation Paper • 2512.13687 • Published Dec 15, 2025 • 101