# 🧬 OncoAgent v1.0 – 9B (Tier 1)

QLoRA Fine-tuned LoRA Adapter for Clinical Oncology Triage

AMD Developer Hackathon 2026 · Trained on AMD Instinct™ MI300X · ROCm 7.2
## Model Description
OncoAgent v1.0 9B is a QLoRA fine-tuned LoRA adapter built on top of Qwen/Qwen3.5-9B, specialized for clinical oncology triage and treatment recommendation.
This is the Tier 1 (fast triage) model in the OncoAgent multi-agent system, optimized for:
- Rapid cancer type classification and routing
- Clinical entity extraction (symptoms, staging, biomarkers)
- First-pass treatment recommendations based on NCCN/ESMO guidelines
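As an illustration of the entity-extraction output Tier 1 targets, here is a minimal sketch that parses a structured response. The JSON schema (field names and values) is an assumption for illustration, not the model's documented output format:

```python
import json

# Hypothetical structured output for clinical entity extraction;
# the field names and values below are illustrative, not a documented schema.
raw = """{
  "cancer_type": "endometrioid adenocarcinoma",
  "grade": 1,
  "stage": "IA",
  "biomarkers": ["ER+", "PR+"],
  "symptoms": ["postmenopausal bleeding"]
}"""

entities = json.loads(raw)
print(entities["cancer_type"], entities["grade"])  # endometrioid adenocarcinoma 1
```

In the multi-agent pipeline, a structured payload like this is what lets the Router and downstream Specialist agents consume Tier 1 output programmatically.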
## Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3.5-9B |
| Method | QLoRA (4-bit NormalFloat4) |
| Framework | Unsloth + PEFT + TRL |
| Hardware | AMD Instinct™ MI300X (192GB HBM3) |
| Software | ROCm 7.2 · PyTorch 2.3+ |
| LoRA Rank | 32 |
| LoRA Alpha | 32 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training Samples | 240,168 (+ 26,686 eval) |
| Max Sequence Length | 2,048 tokens |
| Batch Size | 8 (gradient accumulation: 2 → effective: 16) |
| Learning Rate | 2e-4 (cosine schedule) |
| Epochs | 1 |
| Precision | BF16 (native MI300X) |
| Seed | 42 (reproducible) |
## Dataset
Trained on MaximoLopezChenlo/OncoAgent-Clinical-266K, a curated oncology dataset combining:
- PMC-Patients β Real clinical case presentations
- PubMedQA β Evidence-based medical Q&A
- OncoCoT β Chain-of-thought oncology reasoning (synthetic)
- NCCN/ESMO Guidelines β Structured guideline extracts
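For SFT, each record is ultimately rendered as a system/user/assistant exchange. A minimal sketch of that conversion, assuming a hypothetical record schema (the `case` and `answer` field names are illustrative, not the dataset's actual columns):

```python
# Hypothetical record schema for illustration; the real dataset columns may differ.
SYSTEM_PROMPT = "You are a clinical oncology specialist."

def to_chat_sample(record: dict) -> list[dict]:
    """Render one record as the system/user/assistant messages used for SFT."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": record["case"]},
        {"role": "assistant", "content": record["answer"]},
    ]

sample = to_chat_sample({
    "case": "55yo female, Grade 1 endometrioid adenocarcinoma...",
    "answer": "Stage I workup; refer to gynecologic oncology per guidelines.",
})
print(sample[1]["role"])  # user
```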
## Usage

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-9B",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-9B")

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "MaximoLopezChenlo/OncoAgent-v1.0-9B",
)

# Inference
messages = [
    {"role": "system", "content": "You are a clinical oncology specialist."},
    {"role": "user", "content": "55yo female, Grade 1 endometrioid adenocarcinoma..."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## vLLM Deployment (AMD MI300X)

```bash
# Serve with vLLM on ROCm
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen3.5-9B \
    --enable-lora \
    --lora-modules oncoagent=MaximoLopezChenlo/OncoAgent-v1.0-9B \
    --dtype bfloat16 \
    --tensor-parallel-size 1 \
    --gpu-memory-utilization 0.45
```
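Once the server is up, the adapter is addressed by the name registered via `--lora-modules` (`oncoagent`) through the OpenAI-compatible chat endpoint. A minimal client sketch; the request itself is left commented out since it needs a running server, and the sampling parameters are illustrative:

```python
import json

# Chat-completions payload for vLLM's OpenAI-compatible endpoint.
# "oncoagent" matches the adapter name registered via --lora-modules above.
payload = {
    "model": "oncoagent",
    "messages": [
        {"role": "system", "content": "You are a clinical oncology specialist."},
        {"role": "user", "content": "55yo female, Grade 1 endometrioid adenocarcinoma..."},
    ],
    "max_tokens": 1024,
    "temperature": 0.2,  # illustrative value; tune for your use case
}
body = json.dumps(payload).encode()

# With the server running locally:
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/chat/completions",
#     data=body, headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
print(payload["model"])  # oncoagent
```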
## Architecture

OncoAgent v1.0 9B serves as the Tier 1 model in a dual-tier architecture:

```
Clinical Case → Router → [Tier 1: 9B] → Specialist → Critic → Output
                  ↓
           (Complex cases)
                  ↓
          [Tier 2: 27B] → Specialist → Critic → Output
```
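The diagram does not specify how the Router decides to escalate. A purely illustrative sketch, assuming a simple keyword-and-length heuristic (the marker set, threshold, and function names are all hypothetical, not the system's actual logic):

```python
# Hypothetical escalation heuristic; the actual Router logic may differ entirely.
COMPLEX_MARKERS = {"metastatic", "recurrence", "refractory", "second-line"}

def route(case_text: str, max_tier1_words: int = 300) -> str:
    """Return which tier should handle the case: 'tier1-9b' or 'tier2-27b'."""
    words = case_text.lower().split()
    if len(words) > max_tier1_words:
        return "tier2-27b"   # long, detailed cases escalate to the 27B model
    if COMPLEX_MARKERS.intersection(words):
        return "tier2-27b"   # complexity markers escalate
    return "tier1-9b"        # default: fast triage on the 9B model

print(route("Grade 1 endometrioid adenocarcinoma, stage IA"))  # tier1-9b
print(route("metastatic disease refractory to first-line"))    # tier2-27b
```

The point of the two-tier split is cost: most cases resolve on the fast 9B path, and only cases flagged as complex pay for the 27B Specialist/Critic pass.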
## Links

- Demo: HF Space
- GitHub: maximolopezchenlo-lab/OncoAgent
- Tier 2 Model: OncoAgent-v1.0-27B
- Dataset: OncoAgent-Clinical-266K
## Citation

```bibtex
@misc{oncoagent2026,
  title={OncoAgent: Multi-Agent Oncology Triage System},
  author={Lopez Chenlo, Maximo},
  year={2026},
  howpublished={AMD Developer Hackathon 2026},
  url={https://github.com/maximolopezchenlo-lab/OncoAgent}
}
```
## License

Apache 2.0 – This adapter is for research and educational purposes only. Not intended for direct clinical use without professional medical oversight.