---
pipeline_tag: image-text-to-text
library_name: transformers
---

# VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning

This repository contains the code for the model described in the paper [VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning](https://huggingface.co/papers/2505.12081).

Code: https://github.com/dvlab-research/VisionReasoner