DINOHash — extra checkpoints

Additional DINOHash perceptual-hashing models not present in backslashh/DINOHash. Each model is provided both as the raw training/traced artifact (raw/) and as an exported ONNX graph (repo root, dynamic batch axis, opset 17).

| Model | ONNX | Raw | Notes |
| --- | --- | --- | --- |
| ViT-Small → ViT-Tiny (DINO distill) | `ViT-Small-ViT-Tiny.onnx` | `raw/ViT-Small-ViT-Tiny.pth` | student backbone (`vit_tiny_patch16_224`), 192-d embedding |
| XCiT-Small → XCiT-Tiny (DINO distill) | `XCiT-Small-XCiT-Tiny.onnx` | `raw/XCiT-Small-XCiT-Tiny.pth` | student backbone (`xcit_tiny_12_p16_224`), 192-d embedding |
| MAE-Lite mae_tiny_400e | `mae_tiny_400e_traced.onnx` | `raw/mae_tiny_400e_traced.pt` | 192-d |
| MAE-Lite mae_tiny_distill_400e | `mae_tiny_distill_400e_traced.onnx` | `raw/mae_tiny_distill_400e_traced.pt` | 192-d |
| MAE-Lite mae_tiny_distill_d2_400e | `mae_tiny_distill_d2_400e_traced.onnx` | `raw/mae_tiny_distill_d2_400e_traced.pt` | 192-d |
| MAE-Lite mocov3_tiny_400e | `mocov3_tiny_400e_traced.onnx` | `raw/mocov3_tiny_400e_traced.pt` | 192-d |

Notes on the raw files

MAE-Lite raw files are TorchScript archives (`*_traced.pt`); they are self-contained and can be loaded directly.
ViT / XCiT raw files are full DINO training checkpoints (student, teacher, optimizer, ...). The ONNX graphs were built by extracting the `student.backbone.*` weights, loading them into the matching timm architecture (a strict-clean load, with no missing or unexpected keys), and exporting. XCiT additionally required renaming `pos_embeder` to `pos_embed` and a qkv split/fuse between the class-attention and XCA blocks.
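The extraction step for the ViT / XCiT checkpoints can be sketched roughly as follows. This is not the script used to build the ONNX files. It assumes the checkpoint stores the student weights under a `"student"` key, that backbone parameters are prefixed with `backbone.` (optionally behind a `module.` wrapper prefix), and the helper names `extract_student_backbone` and `load_into_timm` are hypothetical. The qkv split/fuse needed for XCiT is omitted, so the strict load in this sketch is only expected to work for the ViT checkpoint.

```python
from collections import OrderedDict


def extract_student_backbone(student_state, prefix="backbone.", renames=None):
    """Return only the backbone weights, with wrapper prefixes stripped.

    student_state: mapping of parameter name -> tensor (any value type works).
    prefix: ASSUMED key prefix for backbone weights inside the student dict.
    renames: optional {old_substring: new_substring} applied to each key.
    """
    renames = renames or {}
    out = OrderedDict()
    for name, value in student_state.items():
        # Checkpoints saved from a DistributedDataParallel wrapper often
        # carry a leading "module." on every key.
        if name.startswith("module."):
            name = name[len("module."):]
        if not name.startswith(prefix):
            continue  # skip projection-head and other non-backbone weights
        name = name[len(prefix):]
        for old, new in renames.items():
            name = name.replace(old, new)
        out[name] = value
    return out


def load_into_timm(ckpt_path, arch, renames=None):
    """Load extracted weights into a timm model (imports are deferred)."""
    import timm
    import torch

    # Full training checkpoints may hold non-tensor objects, hence
    # weights_only=False; only do this for files you trust.
    ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
    # ASSUMPTION: the student state dict lives under the "student" key.
    weights = extract_student_backbone(ckpt["student"], renames=renames)
    model = timm.create_model(arch, pretrained=False, num_classes=0)
    model.load_state_dict(weights, strict=True)  # fails loudly on any mismatch
    return model.eval()
```

Under those assumptions, `load_into_timm("raw/ViT-Small-ViT-Tiny.pth", "vit_tiny_patch16_224")` would return the student backbone. For XCiT, `renames={"pos_embeder.": "pos_embed."}` covers the rename only; the qkv handling would still have to be added before a strict load can succeed.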

All models take inputs of shape (batch, 3, 224, 224).
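A minimal sketch of feeding one of the ONNX graphs with onnxruntime, assuming float32 input in channels-first layout. Resizing and pixel normalization are not specified here and are left to the caller; the helper names `to_batch` and `embed` are hypothetical.

```python
import numpy as np

INPUT_SHAPE = (3, 224, 224)  # channels, height, width


def to_batch(images):
    """Coerce one image or a stack of images to float32 (batch, 3, 224, 224)."""
    x = np.asarray(images, dtype=np.float32)
    if x.ndim == 3:
        x = x[None, ...]  # add the batch axis for a single image
    if x.ndim != 4 or x.shape[1:] != INPUT_SHAPE:
        raise ValueError(f"expected (batch, 3, 224, 224), got {x.shape}")
    return np.ascontiguousarray(x)


def embed(onnx_path, images):
    """Run an ONNX graph on CPU and return its first output."""
    import onnxruntime as ort  # deferred so to_batch works without it

    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    # Read the input name from the graph instead of hard-coding it.
    input_name = session.get_inputs()[0].name
    return session.run(None, {input_name: to_batch(images)})[0]
```

For example, `embed("ViT-Small-ViT-Tiny.onnx", np.zeros((2, 3, 224, 224)))` should return one row per image; going by the table, each row is expected to be 192-dimensional. For repeated calls it would be better to create the session once and reuse it.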
