MedM-VL-CT-Chest-3B-en

Introduction

MedM-VL-CT-Chest-3B-en is a 3D medical large vision-language model (LVLM) trained on 3D chest CT volumes and English medical texts from CT-RATE. It supports tasks such as report generation and medical visual question answering (VQA).

	Config
Image encoder	google/siglip-base-patch16-256-multilingual
Connector	Cross-Attention + MLP (2-layer)
LLM	Qwen/Qwen2.5-3B-Instruct
Image resolution	32×256×256
Sequence length	2048
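
As a rough illustration of what this configuration implies for the visual input, the sketch below counts the patch tokens the image encoder would produce for one volume. It reads the image resolution as 32×256×256, i.e. 32 slices of 256×256 pixels, and assumes that each slice is encoded independently by the 2D SigLIP encoder; neither point is stated in this card. The patch size of 16 is taken from the encoder name, and the function names are illustrative only.

```python
# Values read off the Config table; interpreting them this way is an assumption.
NUM_SLICES = 32         # depth of the 32x256x256 input
SLICE_SIZE = 256        # height and width of each slice, in pixels
PATCH_SIZE = 16         # from the encoder name "siglip-base-patch16-256"
SEQUENCE_LENGTH = 2048  # sequence length listed in the table


def patches_per_slice(slice_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in one square slice."""
    if slice_size % patch_size != 0:
        raise ValueError("slice size must be divisible by patch size")
    per_side = slice_size // patch_size
    return per_side * per_side


def patches_per_volume(num_slices: int, slice_size: int, patch_size: int) -> int:
    """Patch tokens for a whole volume if every slice is encoded separately."""
    return num_slices * patches_per_slice(slice_size, patch_size)


per_slice = patches_per_slice(SLICE_SIZE, PATCH_SIZE)
per_volume = patches_per_volume(NUM_SLICES, SLICE_SIZE, PATCH_SIZE)
print(f"patch tokens per slice:  {per_slice}")
print(f"patch tokens per volume: {per_volume}")
print(f"exceeds sequence length: {per_volume > SEQUENCE_LENGTH}")
```

Under these assumptions the raw patch count is larger than the listed sequence length, so the connector would have to compress the visual tokens before they reach the LLM. How much it compresses them is not given in this card.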

Evaluation

Task	CT-CHAT	MedM-VL-CT-Chest (3D)	MedM-VL-CT-Chest (2D+Avg)	MedM-VL-CT-Chest (2D+Attn)
Long answer	0.482	0.619	0.622	0.623
Short answer	0.274	0.658	0.664	0.667
Multiple choice	0.838	0.924	0.920	0.925
Report generation	0.395	0.419	0.441	0.439
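
To make the comparison easier to read, the following sketch recomputes the gain of each MedM-VL-CT-Chest variant over the CT-CHAT baseline. The scores are copied from the table above; the dictionary layout and the helper name `gain_over_baseline` are illustrative only. The card does not name the metric behind each row, so the code treats the numbers as plain scores and assumes that higher is better.

```python
# Scores copied from the evaluation table. The metric for each task is not
# stated in the card; "higher is better" is an assumption made here.
SCORES = {
    "Long answer":       {"CT-CHAT": 0.482, "3D": 0.619, "2D+Avg": 0.622, "2D+Attn": 0.623},
    "Short answer":      {"CT-CHAT": 0.274, "3D": 0.658, "2D+Avg": 0.664, "2D+Attn": 0.667},
    "Multiple choice":   {"CT-CHAT": 0.838, "3D": 0.924, "2D+Avg": 0.920, "2D+Attn": 0.925},
    "Report generation": {"CT-CHAT": 0.395, "3D": 0.419, "2D+Avg": 0.441, "2D+Attn": 0.439},
}
BASELINE = "CT-CHAT"


def gain_over_baseline(row: dict, baseline: str = BASELINE) -> dict:
    """Absolute score difference of every non-baseline column to the baseline."""
    base = row[baseline]
    return {
        name: round(score - base, 3)
        for name, score in row.items()
        if name != baseline
    }


for task, row in SCORES.items():
    gains = gain_over_baseline(row)
    best = max(gains, key=gains.get)  # variant with the largest gain
    print(f"{task}: best variant {best} (+{gains[best]:.3f} over {BASELINE})")
```

Absolute differences are used rather than ratios because the rows may rely on different metrics, which makes relative gains hard to compare across tasks.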

Quickstart

Please refer to the MedM-VL project for usage instructions.

Citation

@inproceedings{shi2025medm,
  title={{MedM-VL}: What makes a good medical {LVLM}?},
  author={Shi, Yiming and Yang, Shaoshuai and Zhu, Xun and Wang, Haoyu and Fu, Xiangling and Li, Miao and Wu, Ji},
  booktitle={International Workshop on Agentic AI for Medicine},
  pages={290--299},
  year={2025},
  organization={Springer}
}

Safetensors

Model size	3B params
Tensor type	BF16
