---
language:
- en
license: apache-2.0
library_name: peft
tags:
- medical
- healthcare
- question-answering
- conversational-ai
- medical-qa
- clinical-nlp
- lora
- medphi
- patient-education
base_model: microsoft/MediPhi-Instruct
datasets:
- private
pipeline_tag: text-generation
model-index:
- name: medphi-medical-qa-adapter
  results:
  - task:
      type: question-answering
      name: Medical Question Answering
    dataset:
      name: Medical Screening Dataset
      type: custom
    metrics:
    - name: Training Loss
      type: loss
      value: 0.6441
    - name: Validation Loss
      type: loss
      value: 0.6446
---
# MediPhi Medical QA Adapter
This is a LoRA adapter fine-tuned on Microsoft's [MediPhi-Instruct](https://huggingface.co/microsoft/MediPhi-Instruct) for medical question answering. It is designed to give comprehensive, accurate answers to questions about diseases, medical conditions, and other health-related topics.
## Model Description
- **Model Type:** LoRA Adapter for Causal Language Model
- **Base Model:** microsoft/MediPhi-Instruct (3.8B parameters)
- **Trainable Parameters:** 0.328% (12.5M parameters via LoRA)
- **Language:** English
- **Domain:** Medical/Healthcare
- **Task:** Question Answering, Conversational AI
- **License:** Apache 2.0
### Model Purpose
This model serves as a medical assistant chatbot capable of answering user queries about medical conditions, diseases, symptoms, treatments, and genetic disorders. It has been fine-tuned on 16,406 medical Q&A pairs covering a wide range of health topics including rare genetic disorders and common medical conditions.
## Key Features
- **Medical Domain Expertise:** Trained on diverse medical Q&A covering diseases and conditions
- **Comprehensive Responses:** Generates detailed explanations including definitions, causes, symptoms, and treatments
- **Step-by-Step Reasoning:** Employs structured thinking for medical information delivery
- **Efficient Fine-tuning:** Uses 4-bit quantization with LoRA for memory efficiency
- **Patient Education Focus:** Optimized for explaining complex medical concepts clearly
## Training Data
### Dataset Statistics
- **Total Q&A Pairs:** 16,406 medical question-answer pairs
- **Dataset Size:** 21 MB
- **Data Splits:**
  - Train: 12,304 samples (75%)
  - Validation: 2,051 samples (12.5%)
  - Test: 2,051 samples (12.5%)
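The split sizes above follow from a 75/12.5/12.5 partition of the 16,406 pairs. The sketch below shows one way such a split could be produced; the shuffling, the seed, and the `split_indices` helper are assumptions for illustration, not the procedure actually used for this dataset.

```python
import random


def split_indices(n_samples, val_frac=0.125, test_frac=0.125, seed=42):
    """Shuffle indices and cut them into train/validation/test lists.

    Hypothetical helper: the seed and shuffling strategy are illustrative.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    n_val = round(n_samples * val_frac)
    n_test = round(n_samples * test_frac)
    n_train = n_samples - n_val - n_test

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(16_406)
print(len(train_idx), len(val_idx), len(test_idx))  # 12304 2051 2051
```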
### Data Coverage
The dataset covers a wide range of medical topics including:
- **Rare Genetic Disorders:** Tourette syndrome, Denys-Drash syndrome, etc.
- **Common Conditions:** Dry eye syndrome, immunodeficiency disorders
- **Medical Concepts:** Genetic inheritance patterns, diagnostic methods
- **Treatment Information:** Management strategies, preventive care
### Data Format
```python
{
    "messages": [
        {
            "role": "system",
            "content": "You are a knowledgeable medical assistant. Provide accurate information about medical conditions and diseases. Always think step by step."
        },
        {
            "role": "user",
            "content": "What is [medical condition]?"
        },
        {
            "role": "assistant",
            "content": "[Comprehensive medical explanation]"
        }
    ]
}
```
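A raw question-answer pair can be wrapped into this message layout with a few lines of Python. This is a minimal sketch: the `to_chat_example` helper and the whitespace stripping are assumptions for illustration, not the author's preprocessing code.

```python
SYSTEM_PROMPT = (
    "You are a knowledgeable medical assistant. Provide accurate information "
    "about medical conditions and diseases. Always think step by step."
)


def to_chat_example(question, answer, system_prompt=SYSTEM_PROMPT):
    """Wrap one Q&A pair in the system/user/assistant message layout."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question.strip()},
            {"role": "assistant", "content": answer.strip()},
        ]
    }


example = to_chat_example(
    "What is dry eye syndrome?",
    "[Comprehensive medical explanation]",
)
print([m["role"] for m in example["messages"]])
```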
## Training Details
### Training Configuration
- **Framework:** PyTorch with Hugging Face Transformers
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation) with SFT (Supervised Fine-Tuning)
- **Quantization:** 4-bit NF4 with double quantization
- **Compute:** Single RTX 5090 GPU (16 vCPU, 141 GB RAM)
- **Training Length:** ~140 optimizer steps to convergence
### LoRA Hyperparameters
```python
{
    "r": 8,
    "lora_alpha": 32,
    "target_modules": ["o_proj", "qkv_proj", "gate_up_proj", "down_proj"],
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM"
}
```
### Training Hyperparameters
```python
{
    "num_train_epochs": 3,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 8,
    "learning_rate": 2e-4,
    "lr_scheduler_type": "cosine",
    "max_seq_length": 1024,
    "optim": "adamw_torch",
    "gradient_checkpointing": True,
    "packing": True
}
```
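The batch-related settings combine into an effective batch size per optimizer step. The arithmetic below is a small worked example; `num_devices = 1` reflects the single-GPU setup listed under Training Configuration, and the token count is only an upper bound because packed sequences are not guaranteed to be completely full.

```python
per_device_train_batch_size = 4
gradient_accumulation_steps = 8
max_seq_length = 1024
num_devices = 1  # single GPU, per the training configuration

# Sequences that contribute to one optimizer update
sequences_per_step = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# With packing, each sequence holds at most max_seq_length tokens,
# so this is an upper bound on tokens consumed per optimizer step.
max_tokens_per_step = sequences_per_step * max_seq_length

print(sequences_per_step)   # 32
print(max_tokens_per_step)  # 32768
```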
### Quantization Configuration
```python
{
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16"
}
```
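The dictionaries in this section map directly onto configuration objects from Transformers and PEFT. The sketch below only builds those objects from the listed values; it is not the author's training script, and the variable names `bnb_config` and `lora_config` are illustrative.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization and bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA settings copied from the LoRA Hyperparameters section
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["o_proj", "qkv_proj", "gate_up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```

In a typical setup, `bnb_config` is passed as `quantization_config=` to `AutoModelForCausalLM.from_pretrained`, and `lora_config` is handed to `peft.get_peft_model` or to a TRL `SFTTrainer` through `peft_config=`.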
## Performance
### Training Convergence
| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 20 | 1.1799 | 0.8300 |
| 40 | 0.7834 | 0.7168 |
| 60 | 0.7185 | 0.6892 |
| 80 | 0.6838 | 0.6710 |
| 100 | 0.6641 | 0.6592 |
| 120 | 0.6638 | 0.6515 |
| 140 | 0.6441 | 0.6446 |
**Key Observations:**
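- Rapid convergence: validation loss fell from 0.8300 at step 20 to 0.6446 at step 140
- Training and validation loss ended within 0.0005 of each other (0.6441 vs. 0.6446), consistent with good generalization to the validation split
- No significant overfitting observed: validation loss decreased at every logged step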
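The gap between the two loss curves can be checked directly from the table. The sketch below also converts validation loss to perplexity; that conversion assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated explicitly here.

```python
import math

# (step, training loss, validation loss) as reported in the table above
history = [
    (20, 1.1799, 0.8300),
    (40, 0.7834, 0.7168),
    (60, 0.7185, 0.6892),
    (80, 0.6838, 0.6710),
    (100, 0.6641, 0.6592),
    (120, 0.6638, 0.6515),
    (140, 0.6441, 0.6446),
]

for step, train_loss, val_loss in history:
    gap = val_loss - train_loss          # positive gap: validation is worse
    val_perplexity = math.exp(val_loss)  # assumes loss is in nats per token
    print(f"step {step:>3}: gap={gap:+.4f}  val_perplexity={val_perplexity:.3f}")

final_step, final_train, final_val = history[-1]
final_gap = final_val - final_train
val_losses = [val for _, _, val in history]
monotonic = all(a > b for a, b in zip(val_losses, val_losses[1:]))
```

A validation loss that keeps falling while the gap stays near zero is the pattern one would expect when the adapter is not yet overfitting.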
### Qualitative Improvements
**Example 1 - Dry Eye Condition**
*Original Dataset Response:* Citations and contact information only
*Fine-tuned Model Response:* Comprehensive explanation covering:
- Definition and mechanism
- Environmental, aging, and medication-related causes
- Symptoms (gritty sensation, redness, blurred vision, light sensitivity)
- Treatment options (artificial tears, lifestyle modifications, medical interventions)
**Example 2 - Genetic Disorders**
*Original Dataset Response:* Basic definition of 3 types
*Fine-tuned Model Response:* Expanded information including:
- Inheritance patterns (autosomal dominant/recessive, X-linked, mitochondrial)
- Specific examples (cystic fibrosis, sickle cell disease, Huntington's disease, Down syndrome)
- Diagnostic methods and genetic testing
- Management strategies and treatment approaches
- Prevention through genetic counseling
## Usage
### Installation
```bash
pip install torch transformers peft bitsandbytes accelerate
```
### Basic Usage
```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the adapter (the base model is resolved from the adapter config) and tokenizer
model = AutoPeftModelForCausalLM.from_pretrained(
    "sabber/medphi-medical-qa-adapter",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("sabber/medphi-medical-qa-adapter")

# Prepare conversation
messages = [
    {
        "role": "system",
        "content": "You are a knowledgeable medical assistant. Provide accurate information about medical conditions and diseases. Always think step by step."
    },
    {
        "role": "user",
        "content": "What is Type 2 Diabetes and what are its main symptoms?"
    }
]

# Tokenize; return_dict=True also yields the attention mask for generate()
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

# Decode only the newly generated tokens, not the echoed prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
```
### Pipeline Usage
```python
from transformers import pipeline

# Create conversational pipeline
# (reuses `model` and `tokenizer` loaded in the Basic Usage example)
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

# Ask medical question
messages = [
    {"role": "system", "content": "You are a knowledgeable medical assistant. Provide accurate information about medical conditions and diseases. Always think step by step."},
    {"role": "user", "content": "What causes high blood pressure?"}
]
result = pipe(messages)

# For chat input, generated_text is the full message list; the last entry is the reply
print(result[0]['generated_text'][-1]['content'])
```
### Multi-Turn Conversation
```python
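def generate_response(messages, max_new_tokens=512):
    """Return the assistant reply for the conversation so far.

    Reuses `model` and `tokenizer` loaded in the Basic Usage example.
    """
    inputs = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True,
        return_dict=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(
        **inputs, max_new_tokens=max_new_tokens, temperature=0.7, top_p=0.9,
        do_sample=True, pad_token_id=tokenizer.eos_token_id
    )
    # Keep only the newly generated tokens
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

conversation_history = [
    {
        "role": "system",
        "content": "You are a knowledgeable medical assistant. Provide accurate information about medical conditions and diseases. Always think step by step."
    }
]

# First question
conversation_history.append({"role": "user", "content": "What is asthma?"})
response = generate_response(conversation_history)
conversation_history.append({"role": "assistant", "content": response})

# Follow-up question (the model sees the earlier turns as context)
conversation_history.append({"role": "user", "content": "What triggers asthma attacks?"})
response = generate_response(conversation_history)
print(response)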
```
### Merging Adapter with Base Model
```python
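from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the adapter and its tokenizer
model = AutoPeftModelForCausalLM.from_pretrained(
    "sabber/medphi-medical-qa-adapter",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("sabber/medphi-medical-qa-adapter")

# Fold the LoRA weights into the base model and drop the adapter wrappers
merged_model = model.merge_and_unload()

# Save merged model together with the tokenizer
merged_model.save_pretrained("medphi-medical-qa-merged")
tokenizer.save_pretrained("medphi-medical-qa-merged")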
```
## System Prompt
The model uses the following system prompt for optimal performance:
```
You are a knowledgeable medical assistant. Provide accurate information about
medical conditions and diseases. Always think step by step.
```
This prompt encourages:
- **Structured reasoning:** Step-by-step explanations
- **Accuracy focus:** Emphasis on providing correct medical information
- **Comprehensive coverage:** Detailed responses covering multiple aspects
## Limitations and Bias
### Limitations
1. **Training Data Scope:** Model trained on 16,406 Q&A pairs; may not cover all medical conditions
2. **Not a Medical Professional:** Cannot replace professional medical advice or diagnosis
3. **Language:** English only
4. **Clinical Validation:** Outputs should be reviewed by healthcare professionals before clinical application
5. **Rare Conditions:** Performance may vary for extremely rare or newly discovered conditions
6. **Quantization Effects:** 4-bit quantization may affect precision in certain edge cases
### Bias Considerations
- **Dataset Bias:** Training data may reflect biases present in medical literature
- **Language Bias:** Trained exclusively on English medical content
- **Regional Bias:** May reflect medical practices and terminology from specific regions
- **Completeness:** May provide more detailed responses for well-documented conditions
### Ethical Considerations
- **Not for Diagnosis:** This model should NOT be used for self-diagnosis or medical decision-making
- **Professional Review Required:** All outputs must be reviewed by qualified healthcare professionals
- **Patient Safety:** Users should always consult with licensed medical professionals for health concerns
- **Transparency:** Users should be informed when AI-generated medical content is provided
- **Privacy:** Do not share personally identifiable health information when using this model
## Intended Use
### Primary Use Cases
✅ **Medical Education:** Teaching medical concepts and terminology
✅ **Patient Information:** Providing general information about conditions and diseases
✅ **Research Assistant:** Helping researchers understand medical concepts
✅ **Content Generation:** Creating draft content for medical education materials
✅ **Conversational AI:** Building medical information chatbots and assistants
### Out-of-Scope Use
❌ **Clinical Diagnosis:** Not validated for diagnostic purposes
❌ **Treatment Planning:** Not suitable for creating treatment plans
❌ **Emergency Response:** Not appropriate for emergency medical situations
❌ **Prescription Decisions:** Cannot be used for medication recommendations
❌ **Mental Health Crisis:** Not designed for crisis intervention or counseling
❌ **Legal/Medical Records:** Not validated for official medical documentation
## Evaluation Benchmarks
The model is set up for evaluation on standard medical benchmarks:
- **MedQA:** Medical question answering benchmark
- **MedMCQA:** Multiple-choice medical questions
- **PubMedQA:** Biomedical literature question answering
- **MMLU Medical Subsets:**
  - Anatomy
  - Clinical Knowledge
  - College Medicine
  - Medical Genetics
  - Professional Medicine
*Note: Comprehensive benchmark results will be added as evaluation completes.*
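Several of these benchmarks are multiple-choice, so scoring reduces to comparing a predicted option letter with the gold answer. The sketch below shows that idea only; `extract_choice` and `accuracy` are hypothetical helpers, not the evaluation scripts for this model, and the letter extraction is deliberately naive (an output that begins with the article "A" would be misread as option A).

```python
import re


def extract_choice(output_text, options="ABCD"):
    """Return the first standalone option letter in the output, or None."""
    match = re.search(rf"\b([{options}])\b", output_text)
    return match.group(1) if match else None


def accuracy(outputs, gold_answers):
    """Fraction of outputs whose extracted letter matches the gold answer."""
    if not gold_answers:
        return 0.0
    correct = sum(
        extract_choice(out) == gold for out, gold in zip(outputs, gold_answers)
    )
    return correct / len(gold_answers)


# Toy data for illustration only; these are not model outputs
outputs = ["The answer is B.", "C", "I am not sure."]
gold = ["B", "C", "A"]
print(accuracy(outputs, gold))
```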
## Future Improvements
Suggested enhancements based on current limitations:
1. **Increase LoRA Rank:** Higher rank for greater model capacity
2. **Higher-Precision Training:** Train with FP16/BF16 or FP32 base weights instead of 4-bit quantization
3. **Data Augmentation:** Expand training data with more diverse medical sources
4. **Error Analysis:** Systematic analysis of model failure cases
5. **Benchmark Evaluation:** Complete evaluation on medical QA benchmarks
6. **Multi-lingual Support:** Extend to support multiple languages
7. **Clinical Validation:** Formal evaluation by medical professionals
## Model Architecture
### Base Model: MediPhi-Instruct (Phi-3.5-mini-instruct)
**Key Components:**
- **Parameters:** 3.8 billion
- **Vocabulary Size:** 32,064 tokens
- **Hidden Dimension:** 3,072
- **Layers:** 32 Phi3DecoderLayers
- **Attention:** Multi-head self-attention with rotary positional embeddings
- **Activation:** SiLU (Swish) activation function
- **Normalization:** RMSNorm (root-mean-square normalization)
**LoRA Target Modules:**
- `o_proj` - Output projection in attention
- `qkv_proj` - Query-Key-Value projection in attention
- `gate_up_proj` - Gate and up projection in MLP
- `down_proj` - Down projection in MLP
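The reported ~12.5M trainable parameters can be reproduced from these module shapes. In the sketch below, the hidden size and layer count come from the list above, while the MLP intermediate size of 8,192 and the assumption that `qkv_proj` outputs three hidden-size blocks (no grouped-query attention) are taken from the public Phi-3-mini configuration rather than from this card.

```python
hidden_size = 3072        # from the architecture list above
num_layers = 32           # from the architecture list above
intermediate_size = 8192  # ASSUMPTION: public Phi-3-mini config value
rank = 8                  # LoRA r

# (in_features, out_features) of each targeted linear layer
target_shapes = {
    "qkv_proj": (hidden_size, 3 * hidden_size),            # fused Q, K, V (assumed)
    "o_proj": (hidden_size, hidden_size),
    "gate_up_proj": (hidden_size, 2 * intermediate_size),  # fused gate and up
    "down_proj": (intermediate_size, hidden_size),
}

# LoRA adds A (rank x in) and B (out x rank): rank * (in + out) weights per module
per_layer = sum(
    rank * (fan_in + fan_out) for fan_in, fan_out in target_shapes.values()
)
total_lora_params = per_layer * num_layers

print(f"{total_lora_params:,}")  # 12,582,912
```

Under these assumptions the count comes to about 12.6M, in line with the figure listed under Model Description, and roughly 0.33% of a 3.8B-parameter base model.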
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{medphi-medical-qa-adapter,
author = {Sabber Ahamed},
title = {MediPhi Medical QA Adapter: LoRA Fine-tuning for Medical Question Answering},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/sabber/medphi-medical-qa-adapter}},
note = {Fine-tuned on 16,406 medical Q&A pairs for patient education and medical information retrieval}
}
```
Please also cite the base MediPhi model:
```bibtex
@article{medphi2024,
title={MediPhi: A Medical Language Model},
author={Microsoft Research},
journal={arXiv preprint},
year={2024}
}
```
## Model Card Authors
Sabber Ahamed
## Model Card Contact
For questions, issues, or feedback, please:
- Open a discussion on the [model repository](https://huggingface.co/sabber/medphi-medical-qa-adapter/discussions)
- Contact via Hugging Face profile
## Acknowledgments
- **Base Model:** Microsoft MediPhi-Instruct team
- **Framework:** Hugging Face Transformers, PEFT, and TRL libraries
- **Compute:** GPU infrastructure for model training
- **Community:** Open-source ML and medical NLP communities
## Additional Resources
- **Training Code:** Available in project repository
- **Evaluation Scripts:** Provided for reproducibility
- **Documentation:** Comprehensive README with implementation details
---
**Medical Disclaimer:** This model is provided for educational and research purposes only. It is NOT approved for clinical use, medical diagnosis, or treatment planning. All medical information should be verified by qualified healthcare professionals. In case of medical emergencies, contact emergency services immediately. Always consult with licensed medical professionals for health concerns and treatment decisions.
**Technical Disclaimer:** This model may generate incorrect or incomplete information. Users should verify all outputs and use appropriate safeguards when deploying in production environments. The model's responses should be reviewed and validated before any public-facing use.