Compact_VLM_filter / README.md

Dauka-transformers

Update README.md

a875136 verified 6 months ago

preview code

raw

history blame contribute delete

1.13 kB

metadata

license: apache-2.0
datasets:
  - Dauka-transformers/Compact_VLM_filter_data
language:
  - en
base_model:
  - Qwen/Qwen2-VL-2B-Instruct

Compact VLM Filter: Image-caption filtration-oriented Qwen2VL model

This model is a fine-tuned version of Qwen/Qwen2-VL-2B-Instruct trained to perform filtration-oriented image-text evaluation, based on our custom dataset.

🔍 Intended Use

The model is designed to:

Evaluate alignment of image and caption
Provide image/caption alignment scores and textual justification for noisy web-scale data
Supports local deployment for cost-efficient training data filtration

🏋️ Training Details

Base model: Qwen/Qwen2-VL-2B-Instruct
Fine-tuning objective: in-context evaluation of aligment, quality and safety
Dataset: ~4.8K samples with score, justification, caption, and image

🤝 Acknowledgements

Thanks to the Qwen team for open-sourcing their VLM models, which serve as the foundation for our filtration-oriented model.

📜 License

Licensed under the Apache License 2.0.