birdclass-na

A bird species classifier optimized for North American backyard and camera-trap conditions: partial occlusion, motion blur, fence/leaf clutter, low light, and other artifacts that are rare in handheld photo datasets but routine when a feeder camera is the photographer.

Backbone: facebook/dinov2-base (Apache-2.0), with a Linear(768, 407) classification head trained on a unified taxonomy spanning gpiosenka 525, NABirds, iNat21-Birds, and the BirdWatcher yard dataset.

What this model is good at

  • North American backyard bird identification, especially under feeder-camera conditions (partial occlusion, motion blur, leaf clutter, low light). The training mix includes ~2,000 user-labeled crops from a real feeder-camera deployment alongside the public datasets.
  • Fine-grained discrimination of common NA confusables (Mourning Dove vs Rock Pigeon, Cooper's Hawk vs Sharp-shinned Hawk).

What this model is not good at

  • Non-NA species: most non-NA bird images in iNat21 were collapsed into a single OTHER bucket during training. The model can flag a bird as "not one of these 406 NA species" but can't tell you which non-NA species it is.
  • Rare-species long tail: NA species with very few training samples (< 30 each) have low individual accuracy. We're not better than general-purpose bird classifiers there, just smaller.
  • Comparison to Cornell Merlin / iNat CV: those are trained on 10-100× more data and remain stronger in absolute terms on most common-species photos. This model's value is in being open-source, fine-tunable, and stronger on camera-trap conditions.
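
The long-tail caveat above can be checked against your own label set before you rely on per-species accuracy. This is a minimal sketch, assuming you have a flat list of training labels; the `rare_classes` helper and the sample labels are hypothetical, and only the 30-sample threshold comes from the text.

```python
from collections import Counter

def rare_classes(labels, min_samples=30):
    """Return {species: count} for classes with fewer than min_samples examples."""
    counts = Counter(labels)
    return {name: n for name, n in counts.items() if n < min_samples}

# Hypothetical label list, for illustration only (not real training counts).
labels = (
    ["Northern Cardinal"] * 120
    + ["Mourning Dove"] * 45
    + ["Varied Thrush"] * 12
)
print(rare_classes(labels))
```

Species flagged this way are the ones where predictions deserve extra scrutiny.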

Benchmarks

BENCHMARK.md

Test set: 27,470 rows held out from gpiosenka 525, NABirds, and the BirdWatcher yard dataset. (No iNat21 test split: iNat21 contributed only to train/val.)

All three models are scored apples-to-apples in our 407-way canonical taxonomy (406 NA species + OTHER). Comparator outputs are mapped through the same alias table that our taxonomy builder uses (Rock Dove → Rock Pigeon, Cardinalis cardinalis → Northern Cardinal, etc.). denisjooo's 525-way logits and birder's 10,000-way logits are max-pooled per canonical bucket; ours predicts natively.
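
The alias mapping and per-bucket max-pooling described above can be sketched roughly as follows. This is an illustration, not the benchmark code: the `pool_to_canonical` helper is hypothetical, the alias table holds only the two examples named in the text, and routing unmapped labels into OTHER is an assumption.

```python
import math

# Hypothetical alias table: comparator label -> canonical label.
# Only these two example aliases come from the text above.
ALIASES = {
    "Rock Dove": "Rock Pigeon",
    "Cardinalis cardinalis": "Northern Cardinal",
}

def pool_to_canonical(logits, labels, canonical, aliases=ALIASES, other="OTHER"):
    """Max-pool comparator logits into canonical buckets.

    logits[i] is the comparator's score for labels[i]. Labels that do not
    resolve to a canonical species are pooled into `other` (an assumption).
    """
    pooled = {name: -math.inf for name in canonical}
    pooled[other] = -math.inf
    for score, label in zip(logits, labels):
        name = aliases.get(label, label)
        bucket = name if name in pooled else other
        pooled[bucket] = max(pooled[bucket], score)
    return pooled

# Toy 4-way comparator output mapped into a 2-species canonical taxonomy.
canonical = ["Rock Pigeon", "Northern Cardinal"]
labels = ["Rock Dove", "Rock Pigeon", "Cardinalis cardinalis", "Eurasian Hoopoe"]
logits = [2.5, 1.0, 0.3, 4.0]

pooled = pool_to_canonical(logits, labels, canonical)
prediction = max(pooled, key=pooled.get)
print(pooled, prediction)
```

Max-pooling keeps the strongest piece of evidence for each bucket, so a comparator is not penalized for splitting one canonical species across several of its own labels.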

Three-way: ours vs denisjooo vs birder-project

| Split | n | Ours | denisjooo | birder-project |
|---|---|---|---|---|
| overall | 27,470 | **92.9% (92.6–93.2)** | 23.8% (23.3–24.4) | 89.6% (89.3–90.0) |
| gpiosenka | 2,625 | 89.0% (87.9–90.1) | **99.0% (98.7–99.4)** | 85.3% (84.0–86.8) |
| nabirds | 24,633 | **93.3% (93.0–93.6)** | 15.9% (15.5–16.4) | 90.4% (90.0–90.7) |
| yard | 212 | **96.2% (93.9–98.6)** | 10.4% (6.6–14.6) | 57.1% (50.9–63.7) |

Top-1 accuracy with 95% bootstrap CIs over 1,000 resamples. Bold marks the best in each row.
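
For readers who want to reproduce intervals of this kind, here is a minimal sketch of a bootstrap CI for top-1 accuracy. The text only states "bootstrap CIs over 1,000 resamples"; the percentile method, the seed, and the `bootstrap_accuracy_ci` helper are assumptions, and the per-row flags are reconstructed from the yard row (96.2% of 212 is about 204 correct), not taken from the real predictions.

```python
import random

def bootstrap_accuracy_ci(correct, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for top-1 accuracy.

    correct: list of 0/1 flags, one per test row.
    Returns (point_estimate, lower, upper).
    """
    rng = random.Random(seed)
    n = len(correct)
    point = sum(correct) / n
    stats = []
    for _ in range(n_resamples):
        # Resample the test rows with replacement and re-score.
        sample = rng.choices(correct, k=n)
        stats.append(sum(sample) / n)
    stats.sort()
    lower = stats[int(round((alpha / 2) * n_resamples))]
    upper = stats[int(round((1 - alpha / 2) * n_resamples)) - 1]
    return point, lower, upper

# Reconstructed flags for the yard split: 204 correct out of 212 (assumption).
yard_flags = [1] * 204 + [0] * 8
point, lower, upper = bootstrap_accuracy_ci(yard_flags)
print(f"{point:.1%} ({lower:.1%} to {upper:.1%})")
```

With only 212 rows the interval is several points wide, which is why the yard row carries the widest CIs in the table.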

How to read this

  • Overall: we beat both alternatives. The +3.3 pp lead over birder is larger than the uncertainty: the two 95% intervals (92.6–93.2 vs 89.3–90.0) do not overlap.
  • NABirds (the cleanest NA-species test split, n=24,633): we beat birder by +2.9 pp on the source they'd be most expected to win.
  • Yard (real feeder-camera crops with motion blur / partial occlusion / fence clutter, n=212): we beat birder by +39.1 pp. This is the validation of our "domain fine-tune on production yard data" thesis. Birder's iNat21-only training has no exposure to camera-trap conditions.
  • gpiosenka: denisjooo wins (+10 pp over us) because it was trained on gpiosenka, so this test split is the holdout of its own training data. We beat birder by +3.7 pp on this split despite the disadvantage.
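
The percentage-point leads quoted in this list follow directly from the table. As a small worked check, with the numbers copied from the table and the helper names (`lead_pp`, `cis_overlap`) being illustrative:

```python
# Top-1 accuracy (%) with 95% CI bounds, copied from the benchmark table:
# (accuracy, ci_low, ci_high)
RESULTS = {
    "overall":   {"ours": (92.9, 92.6, 93.2), "denisjooo": (23.8, 23.3, 24.4), "birder": (89.6, 89.3, 90.0)},
    "gpiosenka": {"ours": (89.0, 87.9, 90.1), "denisjooo": (99.0, 98.7, 99.4), "birder": (85.3, 84.0, 86.8)},
    "nabirds":   {"ours": (93.3, 93.0, 93.6), "denisjooo": (15.9, 15.5, 16.4), "birder": (90.4, 90.0, 90.7)},
    "yard":      {"ours": (96.2, 93.9, 98.6), "denisjooo": (10.4, 6.6, 14.6),  "birder": (57.1, 50.9, 63.7)},
}

def lead_pp(split, a, b):
    """Accuracy lead of model a over model b, in percentage points."""
    return round(RESULTS[split][a][0] - RESULTS[split][b][0], 1)

def cis_overlap(split, a, b):
    """True if the two models' 95% intervals overlap on this split."""
    _, a_lo, a_hi = RESULTS[split][a]
    _, b_lo, b_hi = RESULTS[split][b]
    return a_lo <= b_hi and b_lo <= a_hi

for split in RESULTS:
    print(split, lead_pp(split, "ours", "birder"), cis_overlap(split, "ours", "birder"))
```

Non-overlapping intervals are a conservative sanity check rather than a formal paired significance test.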

What this means for use cases

  • Best for backyard / feeder-camera / camera-trap conditions: ours, by ~40 pp over the nearest competitor.
  • Best for clean handheld iNat-style photos: birder is solid, especially if you also need plants / fungi / insects from the same model.
  • Best for the gpiosenka 525-species test specifically: denisjooo (it was trained on those labels).
  • Best for "is this a bird I should care about?" with built-in NAB suppression: ours (the OTHER class threshold gives a clean reject signal).
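
One way to use the OTHER class as a reject signal is to threshold its softmax probability before reporting a species. This is a sketch under assumptions: the 0.5 threshold, the `classify_or_reject` helper, and the three-class toy label map are hypothetical, and it assumes the reject class is literally labeled "OTHER" in the model's `id2label` mapping.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_or_reject(logits, id2label, other_label="OTHER", other_threshold=0.5):
    """Return (species, prob), or (None, p_other) if OTHER is too likely."""
    probs = softmax(logits)
    other_id = next(i for i, name in id2label.items() if name == other_label)
    p_other = probs[other_id]
    if p_other >= other_threshold:
        return None, p_other
    # Otherwise report the most probable class, excluding OTHER itself.
    best = max((i for i in id2label if i != other_id), key=lambda i: probs[i])
    return id2label[best], probs[best]

# Toy 3-class label map standing in for the 407-way taxonomy.
id2label = {0: "Northern Cardinal", 1: "Mourning Dove", 2: "OTHER"}
accepted = classify_or_reject([3.0, 1.0, 0.5], id2label)
rejected = classify_or_reject([0.2, 0.1, 3.0], id2label)
print(accepted, rejected)
```

The threshold trades missed detections against false alarms, so it is worth tuning on your own camera's footage.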

Training data

  • gpiosenka 525: ~89,885 images across 525 species. Pulled from yashikota's HF mirror (the original gpiosenka Kaggle upload was removed in 2025).
  • NABirds v1: ~48,000 expert-labeled NA bird images from Cornell. Used under academic license; see https://dl.allaboutbirds.org/nabirds.
  • iNat21-Birds: bird subset (~414k images) of the iNat 2021 challenge, filtered to the Aves supercategory. License: CC-BY-NC. This is why the trained model weights inherit a non-commercial restriction.
  • Yard data: ~5,000 labeled crops from the BirdWatcher project. Domain-adaptation stage only.

Quick start

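import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

processor = AutoImageProcessor.from_pretrained("houlette/birdclass-na")
model = AutoModelForImageClassification.from_pretrained("houlette/birdclass-na")

# Convert to RGB so grayscale or RGBA inputs don't break preprocessing.
img = Image.open("your_bird.jpg").convert("RGB")
inputs = processor(images=img, return_tensors="pt")

# Gradients are not needed for inference.
with torch.no_grad():
    outputs = model(**inputs)

# Index of the most probable class, mapped back to its species name.
top1 = outputs.logits.softmax(dim=-1)[0].argmax().item()
print(model.config.id2label[top1])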
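
Top-1 alone hides close calls between confusable species such as Cooper's Hawk and Sharp-shinned Hawk, so it can help to look at the top few candidates. A minimal sketch, assuming you pass in a plain list of logits (for example `outputs.logits[0].tolist()` from the quick start); the `top_k` helper and the toy values are illustrative, not part of the model's API.

```python
import math

def top_k(logits, id2label, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return [(id2label[i], exps[i] / total) for i in ranked[:k]]

# Toy logits and label map, for illustration only.
toy_id2label = {0: "Cooper's Hawk", 1: "Sharp-shinned Hawk", 2: "Mourning Dove"}
toy_logits = [2.0, 1.8, -1.0]
for label, prob in top_k(toy_logits, toy_id2label, k=2):
    print(f"{label}: {prob:.2f}")
```

With the quick-start objects, the equivalent call would be `top_k(outputs.logits[0].tolist(), model.config.id2label)`.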

Limitations and honest claims

Among the open-source bird classifiers we tested, this model is the strongest for NA backyard / camera-trap use. See the benchmark table above for the head-to-head numbers vs the strongest alternatives we found (denisjooo's EfficientNet and birder-project's Hiera-DINOv2-iNat21). We win overall (+3.3 pp over birder, +69 pp over denisjooo) and by ~40 pp on real yard / camera-trap conditions.

It is not absolute SOTA on bird classification benchmarks. Cornell's Merlin Bird ID app and iNaturalist's internal classifier are both trained on orders of magnitude more data and remain stronger on most common-species, clean-photo scenarios.

Use this model when:

  • You need a local-running, fine-tunable bird classifier.
  • Your inference distribution looks like camera-trap or feeder-camera imagery.
  • You want Apache-2.0 code (the training pipeline) and CC-BY-NC weights with provenance you can audit.

Don't use this model when:

  • You need commercial use (the iNat21 CC-BY-NC license carries over to the weights). Re-train without iNat21 if commercial deployment matters.
  • You need a global bird classifier. This model is NA-focused by design.

Citation

If you use this model in research, please cite it as:

@misc{birdclass_na_2026,
  author    = {Houlette, Ryan},
  title     = {birdclass-na: an open-source bird species classifier for North American backyards},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/houlette/birdclass-na}
}

License

Apache-2.0 for the training pipeline at https://github.com/houlette/birdclass-na. Model weights themselves are released under CC-BY-NC-4.0 due to inheritance from iNat21's non-commercial clause.
