DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation Paper • 2511.19365 • Published Nov 24, 2025 • 63
RT-DETRv4: Painlessly Furthering Real-Time Object Detection with Vision Foundation Models Paper • 2510.25257 • Published Oct 29, 2025 • 4
Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation Paper • 2303.13399 • Published Mar 23, 2023
RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer Paper • 2407.17140 • Published Jul 24, 2024 • 2
iSegMan: Interactive Segment-and-Manipulate 3D Gaussians Paper • 2505.11934 • Published May 17, 2025 • 1
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22, 2025 • 89
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper • 2410.12787 • Published Oct 16, 2024 • 30