---
license: mit
tags:
  - image-to-image
  - reflection-removal
  - highlight-removal
  - computer-vision
  - dinov3
  - surgical-imaging
---

# UnReflectAnything

[![Project](https://img.shields.io/badge/Project-Webpage-ff611b?logo=googlehome&logoColor=ff611b)](https://alberto-rota.github.io/UnReflectAnything/)
[![PyPI](https://img.shields.io/pypi/v/unreflectanything?color=76b1f3&label=pip%20install&logo=python&logoColor=76b1f3)](https://pypi.org/project/unreflectanything/)
[![Paper](https://img.shields.io/badge/Paper-arXiv-B31B1B?logo=arxiv&logoColor=B31B1B)](https://arxiv.org/abs/2512.09583)
[![Demo](https://img.shields.io/badge/Demo-HF%20-FFD21E?logo=huggingface&logoColor=FFD21E)](https://huggingface.co/spaces/AlbeRota/UnReflectAnything)
[![Modelcard](https://img.shields.io/badge/Model%20Card-HF%20-FFD21E?logo=huggingface&logoColor=FFD21E)](https://huggingface.co/AlbeRota/UnReflectAnything)
[![Wiki](https://img.shields.io/badge/API-Wiki-9187FF?logo=wikipedia&logoColor=9187FF)](https://github.com/alberto-rota/UnReflectAnything/wiki)
[![Licence](https://img.shields.io/badge/MIT-License-1E811F)](https://mit-license.org/)

UnReflectAnything takes any RGB image as input and removes specular highlights, returning a clean, diffuse-only output. We trained UnReflectAnything by synthesizing specularities and supervising in DINOv3 feature space.

UnReflectAnything works on both natural indoor and surgical/endoscopic domain data. 

---

## Architecture
![Architecture](https://raw.githubusercontent.com/alberto-rota/UnReflectAnything/refs/heads/main/assets/architecture.png)



* **<font color="#a001e0">Encoder</font> ($\mathit{\textcolor{a001e0}{E}}$ )**: Processes the input image $\mathbf{I}$ to extract a rich latent representation, $\mathbf{F}_\ell$. This is the off-the-shelf pretrained [DINOv3-large](https://huggingface.co/facebook/dinov3-vitl16-pretrain-lvd1689m).

* **<font color="#0167ff">Reflection Predictor</font> ($\mathit{\textcolor{0167ff}{H}}$ )**: Predicts a soft highlight mask (**H**), identifying areas of specular highlights.

* **Masking Operation ($\mathit{P}$ )**: A binary mask **P** is derived from the prediction and applied to the feature map: $(1-\mathbf{P}) \odot \mathbf{F}_\ell$. This removes features contaminated by reflections, leaving "holes" in the data.

* **<font color="#23ac2c">Token Inpainter</font> ($\mathit{\textcolor{23ac2c}{T}}$ )**: Acts as a neural inpainter. It processes the masked features and uses the surrounding clean context as a prior, together with a learned mask token, to synthesize the missing information in embedding space, producing the completed feature map $\mathbf{F}_{\text{comp}}$.

* **<font color="#ff7700">Decoder</font> ($\mathit{\textcolor{ff7700}{D}}$ )**: Projects the completed features back into pixel space to generate the final, reflection-free image $\mathbf{I}_{\text{diff}}$.
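
The masking step above can be illustrated with a minimal sketch. Plain Python lists stand in for feature tensors so the example has no dependencies. The threshold that turns the soft mask **H** into the binary mask **P**, and the choice to place the mask token directly into the holes before inpainting, are assumptions made for illustration and may differ from the actual implementation.

```python
def mask_highlight_tokens(features, soft_mask, mask_token, threshold=0.5):
    """Apply (1 - P) * F token by token and fill the holes with a mask token.

    features:   list of N token vectors, each a list of C floats (F_l)
    soft_mask:  list of N highlight scores in [0, 1] (H)
    mask_token: list of C floats standing in for the learned mask token
    threshold:  hypothetical cut-off that turns H into the binary mask P
    """
    # Binarize the soft prediction: 1.0 marks a token contaminated by a highlight.
    binary_mask = [1.0 if score > threshold else 0.0 for score in soft_mask]

    # Keep clean tokens as they are; replace contaminated tokens with the mask token.
    masked = [
        [(1.0 - p) * value + p * fill for value, fill in zip(token, mask_token)]
        for token, p in zip(features, binary_mask)
    ]
    return masked, binary_mask


if __name__ == "__main__":
    tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    masked, binary = mask_highlight_tokens(tokens, [0.1, 0.9, 0.4], [0.0, -1.0])
    print(binary)
    print(masked)
```

In this toy input only the second token exceeds the threshold, so it is the only one the inpainter would have to resynthesize from its neighbours.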

---

## Training Strategy
We train UnReflectAnything with **Synthetic Specular Supervision**: we infer 3D geometry with [MoGe-2](https://wangrc.site/MoGe2Page/) and render highlights with a Blinn-Phong reflection model. We randomly sample the light source position in 3D space at every training iteration to enhance heterogeneity.

![SupervisionExamples](https://raw.githubusercontent.com/alberto-rota/UnReflectAnything/refs/heads/main/assets/highlights.png)
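
As a rough illustration of the rendering step, the sketch below evaluates the Blinn-Phong specular term $(\mathbf{n} \cdot \mathbf{h})^{s}$ at a single surface point, where $\mathbf{h}$ is the normalized half vector between the light and view directions. The shininess value, the uniform sampling box for the light position, and all function names are assumptions for illustration, not the values or code used in training.

```python
import math
import random


def _normalize(v):
    """Scale a 3-vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v) if norm > 0.0 else tuple(v)


def _dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def blinn_phong_specular(point, normal, light_pos, camera_pos, shininess=32.0):
    """Specular intensity in [0, 1] at `point` for a point light and a camera."""
    n = _normalize(normal)
    l = _normalize(tuple(a - b for a, b in zip(light_pos, point)))
    v = _normalize(tuple(a - b for a, b in zip(camera_pos, point)))
    if _dot(n, l) <= 0.0:
        return 0.0  # the light is behind the surface, so there is no highlight
    h = _normalize(tuple(a + b for a, b in zip(l, v)))  # half vector
    return max(0.0, _dot(n, h)) ** shininess


def sample_light_position(rng, extent=2.0):
    """Draw a light position uniformly from a cube (hypothetical sampling scheme)."""
    return tuple(rng.uniform(-extent, extent) for _ in range(3))


if __name__ == "__main__":
    rng = random.Random(0)
    light = sample_light_position(rng)
    print(blinn_phong_specular((0, 0, 0), (0, 0, 1), light, (0, 0, 2)))
```

Resampling the light position on every call changes where the highlight lands on the same geometry, which is the source of variety the paragraph above describes.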

We train the model in two stages:
1.  **DPT Decoder Pre-Training**: The **<font color="#ff7700">Decoder</font>** is first pre-trained in an autoencoder configuration ($\min_{\theta} \mathcal{L}(M_{\theta}(\mathbf{I}), \mathbf{I})$) to ensure it can reconstruct realistic RGB textures from the DINOv3 latent space.
2.  **End-to-End Refinement**: The full pipeline is then trained to predict reflection masks with $\mathit{\textcolor{0167ff}{H}}$ and fill them using the **<font color="#23ac2c">Token Inpainter</font>**, ensuring the final output is both visually consistent and physically accurate. The decoder is also fine-tuned at this stage.



## Weights
Install the API and CLI in a **Python>=3.11** environment with
```bash
pip install unreflectanything
```
then run 
```bash
unreflectanything download --weights
```
to download the `.pth` weights to the package cache directory, which is usually `.cache/unreflectanything`.
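
To check that the download succeeded, a small helper can list the `.pth` files in the cache. This is a sketch that assumes the cache sits under the user's home directory; the helper name `find_cached_weights` is hypothetical and not part of the package.

```python
from pathlib import Path


def find_cached_weights(cache_dir=None):
    """Return the .pth files found in the cache directory, sorted by path."""
    if cache_dir is None:
        # Assumption: the cache lives under the user's home directory.
        cache_dir = Path.home() / ".cache" / "unreflectanything"
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []  # nothing has been downloaded to this location yet
    return sorted(cache_dir.rglob("*.pth"))


if __name__ == "__main__":
    for weight_file in find_cached_weights():
        print(weight_file)
```

An empty result suggests either that the weights were not downloaded or that the cache lives somewhere else on your system.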

---

### Basic Python Usage

```python
import unreflectanything
import torch

# Load the pretrained model (uses cached weights)
unreflect_model = unreflectanything.model() 

# Run inference on a tensor [B, 3, H, W] in range [0, 1]
images = torch.rand(2, 3, 448, 448).cuda()
diffuse_output = unreflect_model(images) 

# Simple file-based inference
unreflectanything.inference("input_with_highlights.png", output="diffuse_result.png")
```
Refer to the [Wiki](https://github.com/alberto-rota/UnReflectAnything/wiki) for all details on the API endpoints.
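
For processing several files from Python, the file-based call shown above can be wrapped in a loop. The sketch below takes the inference function as an argument so it only relies on the `inference(path, output=...)` call form shown earlier. The helper name `run_on_folder`, the `_diffuse.png` suffix, and the extension list are illustrative choices, not part of the package.

```python
from pathlib import Path


def run_on_folder(input_dir, output_dir, inference_fn,
                  extensions=(".png", ".jpg", ".jpeg")):
    """Call `inference_fn(src, output=dst)` for every image in `input_dir`.

    Returns the list of output paths that were requested.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for src in sorted(input_dir.iterdir()):
        if src.suffix.lower() not in extensions:
            continue  # skip anything that is not an image file
        dst = output_dir / f"{src.stem}_diffuse.png"
        inference_fn(str(src), output=str(dst))
        written.append(dst)
    return written
```

With the package installed, this would be called as `run_on_folder("images", "results", unreflectanything.inference)`.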

---

### CLI Overview

The package provides a comprehensive command-line interface via `ura`, `unreflect`, or `unreflectanything`.

* **Inference**: `ura inference --input /path/to/images --output /path/to/output`
* **Evaluation**: `ura evaluate --output /path/to/results --gt /path/to/groundtruth`
* **Verification**: `ura verify --dataset /path/to/dataset`

Refer to the [Wiki](https://github.com/alberto-rota/UnReflectAnything/wiki) for all details on the CLI endpoints.

---

## Citation

If you use UnReflectAnything in your research or pipeline, please cite our paper:

```bibtex
@misc{rota2025unreflectanythingrgbonlyhighlightremoval,
      title={UnReflectAnything: RGB-Only Highlight Removal by Rendering Synthetic Specular Supervision}, 
      author={Alberto Rota and Mert Kiray and Mert Asim Karaoglu and Patrick Ruhkamp and Elena De Momi and Nassir Navab and Benjamin Busam},
      year={2025},
      eprint={2512.09583},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2512.09583},
}

```

---