# 🧬 OncoAgent v1.0 – 9B (Tier 1)

QLoRA Fine-tuned LoRA Adapter for Clinical Oncology Triage

AMD Developer Hackathon 2026 · Trained on AMD Instinct™ MI300X · ROCm 7.2
## Model Description
OncoAgent v1.0 9B is a QLoRA fine-tuned LoRA adapter built on top of Qwen/Qwen3.5-9B, specialized for clinical oncology triage and treatment recommendation.
This is the Tier 1 (fast triage) model in the OncoAgent multi-agent system, optimized for:
- Rapid cancer type classification and routing
- Clinical entity extraction (symptoms, staging, biomarkers)
- First-pass treatment recommendations based on NCCN/ESMO guidelines
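As an illustration of the entity-extraction output Tier 1 targets, here is a minimal sketch that parses a structured response. The JSON schema (field names and values) is an assumption for illustration, not the model's documented output format:

```python
import json

# Hypothetical structured output for clinical entity extraction;
# the field names and values below are illustrative, not a documented schema.
raw = """{
  "cancer_type": "endometrioid adenocarcinoma",
  "grade": 1,
  "stage": "IA",
  "biomarkers": ["ER+", "PR+"],
  "symptoms": ["postmenopausal bleeding"]
}"""

entities = json.loads(raw)
print(entities["cancer_type"], entities["grade"])  # endometrioid adenocarcinoma 1
```

In the multi-agent pipeline, a structured payload like this is what lets the Router and downstream Specialist agents consume Tier 1 output programmatically.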
## Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3.5-9B |
| Method | QLoRA (4-bit NormalFloat4) |
| Framework | Unsloth + PEFT + TRL |
| Hardware | AMD Instinct™ MI300X (192GB HBM3) |
| Software | ROCm 7.2 · PyTorch 2.3+ |
| LoRA Rank | 32 |
| LoRA Alpha | 32 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training Samples | 240,168 (+ 26,686 eval) |
| Max Sequence Length | 2,048 tokens |
| Batch Size | 8 (gradient accumulation: 2 → effective: 16) |
| Learning Rate | 2e-4 (cosine schedule) |
| Epochs | 1 |
| Precision | BF16 (native MI300X) |
| Seed | 42 (reproducible) |
## Dataset
Trained on MaximoLopezChenlo/OncoAgent-Clinical-266K, a curated oncology dataset combining:
- PMC-Patients β Real clinical case presentations
- PubMedQA β Evidence-based medical Q&A
- OncoCoT β Chain-of-thought oncology reasoning (synthetic)
- NCCN/ESMO Guidelines β Structured guideline extracts
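For SFT, each record is ultimately rendered as a system/user/assistant exchange. A minimal sketch of that conversion, assuming a hypothetical record schema (the `case` and `answer` field names are illustrative, not the dataset's actual columns):

```python
# Hypothetical record schema for illustration; the real dataset columns may differ.
SYSTEM_PROMPT = "You are a clinical oncology specialist."

def to_chat_sample(record: dict) -> list[dict]:
    """Render one record as the system/user/assistant messages used for SFT."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": record["case"]},
        {"role": "assistant", "content": record["answer"]},
    ]

sample = to_chat_sample({
    "case": "55yo female, Grade 1 endometrioid adenocarcinoma...",
    "answer": "Stage I workup; refer to gynecologic oncology per guidelines.",
})
print(sample[1]["role"])  # user
```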
## Usage

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-9B",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-9B")

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "MaximoLopezChenlo/OncoAgent-v1.0-9B",
)

# Inference
messages = [
    {"role": "system", "content": "You are a clinical oncology specialist."},
    {"role": "user", "content": "55yo female, Grade 1 endometrioid adenocarcinoma..."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## vLLM Deployment (AMD MI300X)

```bash
# Serve with vLLM on ROCm
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen3.5-9B \
    --enable-lora \
    --lora-modules oncoagent=MaximoLopezChenlo/OncoAgent-v1.0-9B \
    --dtype bfloat16 \
    --tensor-parallel-size 1 \
    --gpu-memory-utilization 0.45
```
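Once the server is up, the adapter is addressed by the name registered via `--lora-modules` (`oncoagent`) through the OpenAI-compatible chat endpoint. A minimal client sketch; the request itself is left commented out since it needs a running server, and the sampling parameters are illustrative:

```python
import json

# Chat-completions payload for vLLM's OpenAI-compatible endpoint.
# "oncoagent" matches the adapter name registered via --lora-modules above.
payload = {
    "model": "oncoagent",
    "messages": [
        {"role": "system", "content": "You are a clinical oncology specialist."},
        {"role": "user", "content": "55yo female, Grade 1 endometrioid adenocarcinoma..."},
    ],
    "max_tokens": 1024,
    "temperature": 0.2,  # illustrative value; tune for your use case
}
body = json.dumps(payload).encode()

# With the server running locally:
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/chat/completions",
#     data=body, headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
print(payload["model"])  # oncoagent
```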
## Architecture

OncoAgent v1.0 9B serves as the Tier 1 model in a dual-tier architecture:

```
Clinical Case → Router → [Tier 1: 9B] → Specialist → Critic → Output
                  ↓
           (Complex cases)
                  ↓
          [Tier 2: 27B] → Specialist → Critic → Output
```
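The diagram does not specify how the Router decides to escalate. A purely illustrative sketch, assuming a simple keyword-and-length heuristic (the marker set, threshold, and function names are all hypothetical, not the system's actual logic):

```python
# Hypothetical escalation heuristic; the actual Router logic may differ entirely.
COMPLEX_MARKERS = {"metastatic", "recurrence", "refractory", "second-line"}

def route(case_text: str, max_tier1_words: int = 300) -> str:
    """Return which tier should handle the case: 'tier1-9b' or 'tier2-27b'."""
    words = case_text.lower().split()
    if len(words) > max_tier1_words:
        return "tier2-27b"   # long, detailed cases escalate to the 27B model
    if COMPLEX_MARKERS.intersection(words):
        return "tier2-27b"   # complexity markers escalate
    return "tier1-9b"        # default: fast triage on the 9B model

print(route("Grade 1 endometrioid adenocarcinoma, stage IA"))  # tier1-9b
print(route("metastatic disease refractory to first-line"))    # tier2-27b
```

The point of the two-tier split is cost: most cases resolve on the fast 9B path, and only cases flagged as complex pay for the 27B Specialist/Critic pass.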
## Links

- Demo: HF Space
- GitHub: maximolopezchenlo-lab/OncoAgent
- Tier 2 Model: OncoAgent-v1.0-27B
- Dataset: OncoAgent-Clinical-266K
## Citation

```bibtex
@misc{oncoagent2026,
  title={OncoAgent: Multi-Agent Oncology Triage System},
  author={Lopez Chenlo, Maximo},
  year={2026},
  howpublished={AMD Developer Hackathon 2026},
  url={https://github.com/maximolopezchenlo-lab/OncoAgent}
}
```
## License

Apache 2.0 – This adapter is for research and educational purposes only. Not intended for direct clinical use without professional medical oversight.