AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding Paper • 2502.01341 • Published Feb 3, 2025 • 39
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks Paper • 2412.04626 • Published Dec 5, 2024 • 13
StarVector: Generating Scalable Vector Graphics Code from Images Paper • 2312.11556 • Published Dec 17, 2023 • 37
Let's Make Block Coordinate Descent Converge Faster: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence Paper • 1712.08859 • Published Dec 23, 2017 • 1