DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation Paper • 2601.09688 • Published 6 days ago • 116
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion Paper • 2509.01215 • Published Sep 1, 2025 • 50
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation Paper • 2412.07589 • Published Dec 10, 2024 • 48
POINTS1.5: Building a Vision-Language Model towards Real World Applications Paper • 2412.08443 • Published Dec 11, 2024 • 38
DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing Paper • 2312.07409 • Published Dec 12, 2023 • 23
FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition Paper • 2312.07536 • Published Dec 12, 2023 • 18
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs Paper • 2307.08581 • Published Jul 17, 2023 • 28
JourneyDB: A Benchmark for Generative Image Understanding Paper • 2307.00716 • Published Jul 3, 2023 • 19