---
license: agpl-3.0
library_name: collectorvision
tags:
- onnx
- image-retrieval
- metric-learning
- arcface
- mobilevit
base_model: apple/mobilevit-xx-small
---

# Milo: CCG Card Embedder

Milo is a MobileViT-XXS backbone trained with a multitask ArcFace loss (over `illustration_id` and `set_code` labels). It produces 128-dimensional, L2-normalised embeddings of collectible card game (CCG) card images for nearest-neighbour retrieval.

## Model details

| Property | Value |
|---|---|
| Architecture | MobileViT-XXS + linear projection |
| Input | 448×448 RGB, ImageNet-normalised |
| Output | 128-d L2-normalised embedding vector |
| Parameters | ~1.0M |
| File size | 5.2 MB (fp32 ONNX) |
| Codename | milo |
| Version | 1.0.0 (epoch 15) |
| Training labels | illustration_id + set_code (multitask ArcFace) |

## Usage

The easiest way to use Milo is through the [CollectorVision](https://github.com/HanClinto/CollectorVision) library, which handles corner detection, dewarping, gallery loading, and nearest-neighbour search:

```python
import collector_vision as cvg

cvid = cvg.Identifier(cvg.HFD("HanClinto/milo", "scryfall-mtg"))
result = cvid.identify("photo.jpg")
print(result.ids)         # {"scryfall_id": "..."}
print(result.confidence)  # 0.94
```

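### Direct ONNX usage

```python
import numpy as np
import onnxruntime as ort
from PIL import Image

session = ort.InferenceSession("model.onnx")

# ImageNet channel statistics, kept as float32 so the input is not
# silently promoted to float64 (the model expects float32)
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

# Preprocess: resize to 448×448, ImageNet normalise, NCHW float32
img = Image.open("card_crop.jpg").convert("RGB").resize((448, 448))
x = np.asarray(img, dtype=np.float32) / 255.0          # (448, 448, 3), values in [0, 1]
x = (x - mean) / std                                   # per-channel normalisation
x = np.ascontiguousarray(x.transpose(2, 0, 1)[None])   # (1, 3, 448, 448)

emb = session.run(None, {"pixel_values": x})[0]  # (1, 128) float32, L2-normalised
```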

Cosine similarity between two embeddings is just their dot product (both are unit vectors).
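
As a minimal sketch of how the dot product can drive retrieval, the snippet below ranks a gallery of embeddings against a query. The `top_k` helper and the random stand-in gallery are assumptions made for illustration and are not part of CollectorVision; in practice the gallery rows would come from running the model on reference card images.

```python
import numpy as np

def top_k(query, gallery, k=5):
    """Return (indices, scores) of the k gallery rows most similar to query.

    query:   (128,) unit vector
    gallery: (N, 128) array whose rows are unit vectors
    """
    scores = gallery @ query          # cosine similarity, since all vectors are unit length
    k = min(k, len(scores))
    idx = np.argsort(-scores)[:k]     # highest similarity first
    return idx, scores[idx]

# Stand-in data: random unit vectors in place of real card embeddings
rng = np.random.default_rng(0)
gallery = rng.standard_normal((1000, 128)).astype(np.float32)
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)
query = gallery[42]                   # query with a known gallery entry

idx, scores = top_k(query, gallery, k=3)
```

A brute-force matrix product like this is usually adequate for galleries of modest size; an approximate nearest-neighbour index is a possible alternative for much larger ones.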

## Gallery compatibility

Gallery files built with Milo v1.0.0 use `milo1` in their filename. Embeddings from different Milo versions are not compatible, so rebuild the gallery when upgrading.
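
One way to guard against mixing versions is to check the gallery file name before loading it. The sketch below is a hypothetical helper, not a CollectorVision API, and the example file names are invented for illustration; it relies only on the `milo1` tag described above.

```python
from pathlib import Path

MODEL_TAG = "milo1"  # tag carried by galleries built with Milo v1.0.0

def gallery_matches_model(gallery_path, model_tag=MODEL_TAG):
    """Return True if the gallery file name (not its directory) contains the tag."""
    return model_tag in Path(gallery_path).name

# Hypothetical file names, for illustration only
ok = gallery_matches_model("galleries/example-milo1.npz")
stale = gallery_matches_model("galleries/example-otherversion.npz")
```

This is a plain substring test, so it would need tightening if later tags share the `milo1` prefix.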

## Part of CollectorVision

Milo is used together with [HanClinto/cornelius](https://huggingface.co/HanClinto/cornelius) in the [CollectorVision](https://github.com/HanClinto/CollectorVision) inference library.