Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents • Paper • 2502.04223 • Published Feb 6, 2025 • 10
docling-project/SmolDocling-256M-preview • Image-Text-to-Text • 0.3B • Updated Sep 17, 2025 • 45.7k • 1.6k
olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models • Paper • 2502.18443 • Published Feb 25, 2025 • 9
A Token-level Text Image Foundation Model for Document Understanding • Paper • 2503.02304 • Published Mar 4, 2025 • 4