VAAS: Vision-Attention Anomaly Scoring

Model Summary

VAAS (Vision-Attention Anomaly Scoring) is a dual-module vision framework for image anomaly detection and localization. It combines global attention-based reasoning with patch-level self-consistency analysis to produce a continuous, interpretable anomaly score alongside dense spatial anomaly maps.

Rather than making binary decisions, VAAS estimates where anomalies occur and how strongly they deviate from learned visual regularities, enabling explainable image analysis and integrity assessment.

Paper: arXiv preprint arXiv:2512.15512 (full reference under Citation).

Architecture Overview

VAAS consists of two complementary components:

Global Attention Module (Fx)
A Vision Transformer backbone that captures global semantic and structural irregularities using attention distributions.
Patch-Level Module (Px)
A SegFormer-based segmentation model that identifies local inconsistencies in texture, boundaries, and regions.

These components are combined via a hybrid scoring mechanism:

S_F: Global attention fidelity score
S_P: Patch-level plausibility score
S_H: Final hybrid anomaly score

S_H provides a continuous measure of anomaly intensity rather than a binary decision.
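
This card does not restate the exact rule that fuses S_F and S_P; the paper is the authoritative source. As a rough illustration of how two component scores could be blended into one continuous score, the sketch below assumes a convex combination weighted by a parameter named alpha, a name borrowed from the alpha=0.5 argument in the usage example. The function hybrid_score and this formula are hypothetical and are not the VAAS implementation.

```python
def hybrid_score(s_f: float, s_p: float, alpha: float = 0.5) -> float:
    """Blend two component scores into a single continuous score.

    ASSUMPTION: a convex combination is used purely for illustration.
    VAAS may combine S_F and S_P differently; consult the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # alpha weights the global score, (1 - alpha) weights the patch score.
    return alpha * s_f + (1.0 - alpha) * s_p


# Made-up component scores, not model outputs.
print(hybrid_score(0.8, 0.4, alpha=0.5))
```

Under this assumed form, alpha=1.0 would reduce the result to the global score and alpha=0.0 to the patch-level score.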

Intended Use

This model can be used for:

Image anomaly detection
Visual integrity assessment
Explainable inspection of irregular regions
Research on attention-based anomaly scoring
Prototyping anomaly-aware vision systems

It supports both CPU-only and GPU-accelerated inference. A GPU is recommended for faster processing but is not required.
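
A minimal sketch of choosing a device at runtime. It relies only on torch.cuda.is_available(); the helper name pick_device is hypothetical, and passing "cuda" as the device argument of VAASPipeline.from_pretrained is an assumption, since this card only shows device="cpu".

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
# ASSUMPTION: the resulting string could then be passed as device=device
# when constructing the pipeline.
```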

Installation

VAAS is distributed as a lightweight inference library.

PyTorch is required only when running inference or loading pretrained models, so users can install, inspect, and integrate VAAS without heavy dependencies.

This model was produced using vaas==0.1.7, but newer versions of VAAS may also be compatible for inference.

1. Install PyTorch

To run inference or load pretrained VAAS models, install PyTorch and torchvision for your system (CPU or GPU). Follow the official PyTorch installation guide for your platform:

https://pytorch.org/get-started/locally/

Quick installation

pip install torch torchvision

2. Install VAAS

pip install vaas

VAAS will automatically detect PyTorch at runtime and raise a clear error if it is missing.
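
To check for PyTorch before calling VAAS, a small standard-library probe is enough. This is an independent sketch and does not reproduce VAAS's own runtime check; the helper name has_module is hypothetical.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


missing = [m for m in ("torch", "torchvision") if not has_module(m)]
if missing:
    print("Install before running inference:", ", ".join(missing))
else:
    print("PyTorch and torchvision are available.")
```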

Usage

1. Basic inference on local and online images

from vaas.inference.pipeline import VAASPipeline
from PIL import Image
import requests
from io import BytesIO

pipeline = VAASPipeline.from_pretrained(
    "OBA-Research/vaas-v1-df2023",
    device="cpu",
    alpha=0.5
)

# Option A: Using a local image (uncomment to use)
# image = Image.open("example.jpg").convert("RGB")
# result = pipeline(image)

# Option B: Using an online image
url = "https://raw.githubusercontent.com/OBA-Research/VAAS/main/examples/images/COCO_DF_C110B00000_00539519.jpg"
response = requests.get(url, timeout=30)
response.raise_for_status()  # fail early on HTTP errors instead of handing an error page to PIL
image = Image.open(BytesIO(response.content)).convert("RGB")
result = pipeline(image)

print(result)
anomaly_map = result["anomaly_map"]

Output Format

{
  "S_F": float,
  "S_P": float,
  "S_H": float,
  "anomaly_map": numpy.ndarray  # shape (224, 224)
}
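
The dense map can be post-processed with plain NumPy. The sketch below thresholds a map into a binary mask, reports the fraction of flagged pixels, and locates the peak. It runs on a synthetic array rather than real model output; the helper name summarize_anomaly_map is hypothetical, and the assumption that map values lie in [0, 1] (so that a 0.5 threshold is meaningful) is not confirmed by this card.

```python
import numpy as np


def summarize_anomaly_map(anomaly_map: np.ndarray, threshold: float = 0.5) -> dict:
    """Threshold a 2-D anomaly map and report simple summary statistics."""
    mask = anomaly_map >= threshold  # boolean mask of flagged pixels
    peak_row, peak_col = np.unravel_index(np.argmax(anomaly_map), anomaly_map.shape)
    return {
        "mask": mask,
        "anomalous_fraction": float(mask.mean()),
        "peak": (int(peak_row), int(peak_col)),
    }


# Synthetic stand-in for result["anomaly_map"]: one bright square on a dark field.
demo_map = np.zeros((224, 224), dtype=np.float32)
demo_map[50:100, 60:110] = 0.9

summary = summarize_anomaly_map(demo_map, threshold=0.5)
print(summary["anomalous_fraction"], summary["peak"])
```

With real output, the same function could be called on result["anomaly_map"] in place of demo_map.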

2. Inference with visual explanation

VAAS can also generate a qualitative visualization combining:

Patch-level anomaly heatmaps (Px)
Global attention maps (Fx)
Final hybrid anomaly score (S_H)


pipeline.visualize(
    image=image,
    save_path="vaas_visualization.png",
    mode="all",        # options: "all", "px", "binary", "fx"
    threshold=0.5,
)

This will save a figure containing:

Original image
Patch-level anomaly overlays
Global attention overlays
A gauge-style visualization of the hybrid anomaly score


Model Variant

This release corresponds to:

VAAS v1
Trained on 10% of the DF2023 dataset
Input resolution: 224 × 224
Outputs:
- Global anomaly score (S_H)
- Component scores (S_F, S_P)
- Dense anomaly map (224 × 224)

Future releases will scale training data size, include cross-dataset evaluation, and explore model compression.

Model Files

This repository contains:

px_model.pth – Patch-level SegFormer model weights
ref_stats.pth – Reference statistics for anomaly normalization
config.json – Model configuration metadata

The Vision Transformer backbone is loaded programmatically during inference.

Training Data

The model was trained on a reproducible 10% subset of the DF2023 dataset. The exact filenames used for training are released to support experimental reproducibility.
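
This card does not describe how the subset was drawn, and the released filename list remains the authoritative record. For readers building a comparable split on their own data, one common approach is a seeded sample over sorted filenames, sketched below. The function name sample_subset, the seed value, and the filenames are all hypothetical.

```python
import random


def sample_subset(filenames, fraction=0.10, seed=0):
    """Draw a deterministic subset: same inputs and seed give the same files."""
    ordered = sorted(filenames)  # sorting removes dependence on listing order
    k = round(len(ordered) * fraction)
    return sorted(random.Random(seed).sample(ordered, k))


# Placeholder filenames, not DF2023 files.
all_files = [f"img_{i:05d}.jpg" for i in range(1000)]
subset = sample_subset(all_files, fraction=0.10, seed=0)
print(len(subset))
```

Writing the selected names to a text file, as the authors do with their list, lets others reuse the split without rerunning the sampler.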

Limitations

Trained on a subset of a single dataset
Does not classify anomaly types
Performance may degrade on out-of-distribution imagery

Users are encouraged to fine-tune or retrain the model for domain-specific applications.

Ethical Considerations

VAAS is intended for research and inspection purposes. It should not be used as a standalone decision-making system in high-stakes or sensitive applications without human oversight.

Citation

If you use VAAS in your research, please cite both the software and the associated paper as appropriate.

@software{vaas,
  title        = {VAAS: Vision-Attention Anomaly Scoring},
  author       = {Bamigbade, Opeyemi and Scanlon, Mark and Sheppard, John},
  year         = {2025},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.18064355},
  url          = {https://doi.org/10.5281/zenodo.18064355}
}

@article{bamigbade2025vaas,
  title={VAAS: Vision-Attention Anomaly Scoring for Image Manipulation Detection in Digital Forensics},
  author={Bamigbade, Opeyemi and Scanlon, Mark and Sheppard, John},
  journal={arXiv preprint arXiv:2512.15512},
  year={2025}
}

Contributing

We welcome contributions that improve the usability, robustness, and extensibility of VAAS.

Please see the full guidelines in CONTRIBUTING.md on GitHub.

License

MIT License

Maintainers

OBA-Research
