Mozilla/distilvit
Tags: Image-to-Text, Transformers.js, PyTorch, ONNX, Safetensors, vision-encoder-decoder, image-captioning
Dataset: Mozilla/flickr30k-transformed-captions-gpt4o
License: apache-2.0
Branch: main
Size: 5.32 GB
1 contributor
History: 58 commits
Latest commit: tarekziade, "Update README.md" (e78ec2f, verified), about 1 year ago
Name                      Size            Last commit                          Updated
onnx                      -               Added q4                             over 1 year ago
.gitattributes            1.52 kB         initial commit                       almost 2 years ago
README.md                 2.09 kB         Update README.md                     about 1 year ago
config.json               5.05 kB         New training round                   over 1 year ago
generation_config.json    7.02 kB         Upload generation_config.json        over 1 year ago
merges.txt                456 kB          fined tuned on alt-text-validation   over 1 year ago
metrics.txt               657 Bytes       Model save                           over 1 year ago
model.safetensors         730 MB (xet)    fined tuned on alt-text-validation   over 1 year ago
preprocessor_config.json  349 Bytes       New training round                   over 1 year ago
pytorch_model.bin         730 MB (xet)    New training round                   over 1 year ago
quantize_config.json      3.24 kB         New training round                   over 1 year ago
special_tokens_map.json   137 Bytes       New training round from scratch      over 1 year ago
tokenizer.json            2.11 MB         fined tuned on alt-text-validation   over 1 year ago
tokenizer_config.json     243 Bytes       New training round from scratch      over 1 year ago
training_args.bin         4.73 kB (xet)   New training round                   over 1 year ago
vocab.json                798 kB          fined tuned on alt-text-validation   over 1 year ago