
Model Card: Qwen2-VL-ImgChat-2B

Model Details

  • Model Name: Qwen2-VL-ImgChat-2B
  • Model Type: Vision-Language Model fine-tuned for multimodal dialog auto-completion
  • Language(s): English
  • Base Model: Qwen2-VL-2B
  • Fine-tuning Dataset: ImageChat
  • License: Same as base model (Qwen2-VL license)
  • Repository: https://github.com/devichand579/MAC

Intended Use

Direct Use

This model generates conversational responses conditioned on both textual and visual context. It is suitable for:

  • Multimodal dialog systems
  • Image-grounded conversational agents
  • Research on multimodal auto-completion

Out-of-Scope Use

The model is not intended for:

  • Medical, legal, or financial advice
  • Safety-critical decision-making
  • Autonomous systems requiring guaranteed correctness

Limitations and Risks

  • Model outputs may contain inaccuracies or biases inherited from training data.
  • Performance depends on image relevance and dialogue context quality.
  • The model is not explicitly safety-filtered.

How to Use

Example usage with Hugging Face Transformers:

from transformers import AutoProcessor, AutoModelForVision2Seq
from PIL import Image

# Load the processor and model from the Hub
processor = AutoProcessor.from_pretrained("devichand/Qwen2-VL-ImgChat-2B")
model = AutoModelForVision2Seq.from_pretrained("devichand/Qwen2-VL-ImgChat-2B")

# Load an image and build the multimodal input
image = Image.open("example.jpg")
inputs = processor(images=image,
                   text="Describe the image.",
                   return_tensors="pt")

# Generate a response and decode it to text
outputs = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(outputs[0], skip_special_tokens=True))
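Since the model is fine-tuned for multi-turn, image-grounded dialog, inputs can also be expressed as a structured conversation. The sketch below shows the chat-message format that Qwen2-VL-family processors generally accept via `apply_chat_template`; the dialogue turns and image placement are illustrative placeholders, not part of this model card.

```python
# A minimal sketch of a multi-turn, image-grounded conversation in the
# chat-message format used by Qwen2-VL processors. The turns below are
# invented examples; the actual image tensor is passed separately via
# the `images=` argument, as in the snippet above.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What is happening in this photo?"},
        ],
    },
    {
        "role": "assistant",
        "content": [{"type": "text", "text": "A dog is leaping to catch a frisbee."}],
    },
    {
        "role": "user",
        "content": [{"type": "text", "text": "What breed does it look like?"}],
    },
]

# The processor's apply_chat_template would flatten this structure into
# a single prompt string with the model's special tokens inserted.
roles = [turn["role"] for turn in messages]
print(roles)  # ['user', 'assistant', 'user']
```

Passing the full history this way lets the model condition its completion on both the image and the preceding turns, which is the auto-completion setting the ImageChat fine-tuning targets.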