license: apache-2.0
tags:
  - object-detection
  - person-detection
  - rtmdet
  - real-time
  - computer-vision
pipeline_tag: object-detection

rtmdet-tiny

This is a Hugging Face-compatible port of rtmdet-tiny from OpenMMLab MMDetection.

RTMDet is a family of real-time object detectors based on the CSPNeXt architecture. This checkpoint is pretrained on COCO and is well suited to person detection as the first stage before whole-body pose estimation with RTMW.

Model description

Architecture: CSPNeXt backbone + CSPNeXtPAFPN neck + RTMDetHead
Backbone scale: deepen=0.167, widen=0.375 (~5M parameters)
Input size: 640×640
Classes: 80 (COCO)
Uses custom code — load with trust_remote_code=True
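
To make the backbone scale factors above concrete, here is a minimal sketch of how depth and width multipliers of this kind are typically applied. The base channel widths and block counts are assumptions (the usual CSPNeXt P5 settings in MMDetection) and are not stated in this card; the function name scale_backbone is hypothetical.

```python
# Assumed CSPNeXt P5 base settings (not given in this model card):
# stem width, per-stage output channels, and per-stage block counts
# before any scaling is applied.
BASE_STEM_CHANNELS = 64
BASE_STAGE_CHANNELS = [128, 256, 512, 1024]
BASE_STAGE_BLOCKS = [3, 6, 6, 3]


def scale_backbone(deepen, widen):
    """Return (stem_channels, stage_channels, stage_blocks) after scaling.

    Widths are multiplied by `widen` and truncated to an integer; block
    counts are multiplied by `deepen`, rounded, and kept at 1 or more.
    """
    stem = int(BASE_STEM_CHANNELS * widen)
    channels = [int(c * widen) for c in BASE_STAGE_CHANNELS]
    blocks = [max(round(n * deepen), 1) for n in BASE_STAGE_BLOCKS]
    return stem, channels, blocks


stem, channels, blocks = scale_backbone(deepen=0.167, widen=0.375)
print(stem, channels, blocks)  # 24 [48, 96, 192, 384] [1, 1, 1, 1]
```

Under these assumptions, the tiny variant keeps one block per stage and uses roughly three eighths of the base channel width, which is where most of the parameter savings come from.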

Usage

from transformers import AutoConfig, AutoModel, AutoImageProcessor
from PIL import Image
import torch

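# trust_remote_code=True is needed because the repository ships custom model code.
# Loading the config separately is optional; it is only useful for inspecting settings.
config = AutoConfig.from_pretrained("akore/rtmdet-tiny", trust_remote_code=True)
model = AutoModel.from_pretrained("akore/rtmdet-tiny", trust_remote_code=True)
model.eval()  # disable training-time behaviour such as batch-norm updates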

processor = AutoImageProcessor.from_pretrained("akore/rtmdet-tiny")
image = Image.open("your_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(pixel_values=inputs["pixel_values"])

# outputs["boxes"]:  (N, 4) in [x1, y1, x2, y2]
# outputs["scores"]: (N,)
# outputs["labels"]: (N,)  — 0 = person in COCO
print(outputs)
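
Since this checkpoint is intended as a person detector ahead of a pose model, a small post-processing step is usually needed. The following is a sketch only: it assumes the output layout described in the comments above (boxes as [x1, y1, x2, y2], label 0 = person), works on plain Python lists, and uses made-up detections. The helper names select_person_boxes and pad_box, the 0.5 score threshold, and the 10% padding are all illustrative choices, not part of this repository.

```python
def select_person_boxes(boxes, scores, labels, score_thr=0.5, person_label=0):
    """Keep person detections above score_thr, sorted by descending score."""
    people = [
        (list(box), float(score))
        for box, score, label in zip(boxes, scores, labels)
        if int(label) == person_label and float(score) >= score_thr
    ]
    people.sort(key=lambda item: item[1], reverse=True)
    return people


def pad_box(box, image_w, image_h, pad_ratio=0.1):
    """Grow a box by pad_ratio of its size on each side, clamped to the image."""
    x1, y1, x2, y2 = box
    dx = (x2 - x1) * pad_ratio
    dy = (y2 - y1) * pad_ratio
    return [
        max(0.0, x1 - dx),
        max(0.0, y1 - dy),
        min(float(image_w), x2 + dx),
        min(float(image_h), y2 + dy),
    ]


# Made-up detections for illustration: two persons (label 0) and one
# detection of another class (label 16).
boxes = [[10.0, 20.0, 110.0, 220.0], [300.0, 40.0, 360.0, 200.0], [5.0, 5.0, 50.0, 50.0]]
scores = [0.92, 0.31, 0.88]
labels = [0, 0, 16]

people = select_person_boxes(boxes, scores, labels)
crops = [pad_box(box, image_w=640, image_h=480) for box, _ in people]
print(len(people), crops)
```

If the model returns tensors, as the shapes in the comments suggest, converting with outputs["boxes"].tolist() (and likewise for scores and labels) should produce inputs in this form. The image size passed to pad_box must be in the same coordinate frame as the boxes, which this card does not specify.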

Citation

@misc{lyu2022rtmdet,
  title={RTMDet: An Empirical Study of Designing Real-Time Object Detectors},
  author={Chengqi Lyu and Wenwei Zhang and Haian Huang and Yue Zhou and Yudong Wang and Yanyi Liu and Shilong Zhang and Kai Chen},
  year={2022},
  eprint={2212.07784},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}