metadata
library_name: peft
base_model: sentence-transformers/all-mpnet-base-v2
tags:
  - medical
  - cardiology
  - embeddings
  - domain-adaptation
  - lora
  - sentence-transformers
  - sentence-similarity
language:
  - en
license: apache-2.0

CardioEmbed-MPNet-base

Domain-specialized cardiology text embeddings using LoRA-adapted MPNet-base

Part of a comparative study of 10 embedding architectures for clinical cardiology.

Performance

Metric            | Score
Separation Score  | 0.386

Usage

from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

# Load the base encoder and its tokenizer
base_model = AutoModel.from_pretrained("sentence-transformers/all-mpnet-base-v2")
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-mpnet-base-v2")

# Attach the CardioEmbed LoRA adapter weights to the base encoder
model = PeftModel.from_pretrained(base_model, "richardyoung/CardioEmbed-MPNet-base")
model.eval()  # inference mode (disables dropout)
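
The loading code above returns token-level hidden states, not sentence vectors. The following is a minimal sketch of one way to turn them into embeddings. It assumes the adapter keeps the pooling recipe documented for the base all-mpnet-base-v2 model (attention-masked mean pooling followed by L2 normalization), which this card does not confirm. The helper names mean_pool and embed are hypothetical.

```python
import torch
import torch.nn.functional as F


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sentence, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)  # guard against division by zero
    return summed / counts


def embed(texts, model, tokenizer):
    """Return one L2-normalized vector per input text.

    Assumption: mean pooling plus normalization, as in the base model's
    documented usage; the adapter's own pooling is not stated in this card.
    """
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    model.eval()
    with torch.no_grad():
        outputs = model(**batch)
    pooled = mean_pool(outputs.last_hidden_state, batch["attention_mask"])
    return F.normalize(pooled, p=2, dim=1)
```

With the model and tokenizer loaded as shown above, embed(["atrial fibrillation", "irregular heart rhythm"], model, tokenizer) would return a tensor with one row per text. Because the rows are unit length, the matrix product of that tensor with its transpose gives pairwise cosine similarities.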

Training

  • Training Data: 106,535 cardiology text pairs from medical textbooks
  • Method: LoRA fine-tuning (r=16, alpha=32)
  • Loss: Multiple Negatives Ranking Loss (InfoNCE)
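
To illustrate the loss named above, here is a small dependency-free sketch of Multiple Negatives Ranking Loss. For each anchor, its paired text is the positive and every other positive in the batch serves as a negative; the loss is the cross-entropy over scaled cosine similarities. The function name mnr_loss is hypothetical, and scale=20.0 mirrors the sentence-transformers default rather than a value reported for this model.

```python
import math


def mnr_loss(anchors, positives, scale=20.0):
    """InfoNCE with in-batch negatives: anchors[i] should match positives[i].

    scale is an assumed value (the sentence-transformers default), not a
    setting taken from this model's training run.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    losses = []
    for i, anchor in enumerate(anchors):
        # Similarity of this anchor to every positive in the batch
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the batch
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Cross-entropy with the matching pair (index i) as the target
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch: correctly paired vectors versus the same vectors mispaired
aligned = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mispaired = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In the toy batch, the correctly paired vectors yield a loss near zero while the mispaired ones yield a large loss, which is the signal that pulls matching text pairs together during fine-tuning.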

Citation

@article{young2024comparative,
  title={Comparative Analysis of LoRA-Adapted Embedding Models for Clinical Cardiology Text Representation},
  author={Young, Richard J and Matthews, Alice M},
  journal={arXiv preprint},
  year={2024}
}