ProactiveBench: Benchmarking Proactiveness in Multimodal Large Language Models Paper • 2603.19466 • Published 8 days ago • 39
TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation Paper • 2603.19039 • Published 8 days ago • 47
Specificity-aware reinforcement learning for fine-grained open-world classification Paper • 2603.03197 • Published 24 days ago • 16
How to Take a Memorable Picture? Empowering Users with Actionable Feedback Paper • 2602.21877 • Published about 1 month ago • 16
Large Multimodal Models as General In-Context Classifiers Paper • 2602.23229 • Published 29 days ago • 26
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published Oct 13, 2025 • 170
Inverse Virtual Try-On: Generating Multi-Category Product-Style Images from Clothed Individuals Paper • 2505.21062 • Published May 27, 2025 • 4