- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding
  Paper • 2503.12797 • Published • 32
- MaxyLee/DeepPerception
  Image-Text-to-Text • 8B • Updated • 9 • 2
- MaxyLee/KVG-Bench
  Viewer • Updated • 1.34k • 15
- MaxyLee/DeepPerception-FGVR
  Image-Text-to-Text • 8B • Updated • 4
Xinyu Ma
MaxyLee
AI & ML interests
None yet
Recent Activity
upvoted a paper about 7 hours ago
Imagination Helps Visual Reasoning, But Not Yet in Latent Space
upvoted a paper about 21 hours ago
GLM-5: from Vibe Coding to Agentic Engineering
upvoted a paper about 1 month ago
MM-UAVBench: How Well Do Multimodal Large Language Models See, Think, and Plan in Low-Altitude UAV Scenarios?
Organizations
None yet