Buckets:

MaximoLopezChenlo's picture
|
download
raw
7.71 kB
---
title: OncoAgent
emoji: 🧬
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Multi-Agent Oncology Triage powered by AMD MI300X
---
# 🧬 OncoAgent β€” Multi-Agent Oncology Triage System
![ROCm](https://img.shields.io/badge/AMD-ROCm_7.2-ed1c24?logo=amd&logoColor=white)
![Python](https://img.shields.io/badge/Python-3.10+-3776AB?logo=python&logoColor=white)
![vLLM](https://img.shields.io/badge/vLLM-PagedAttention-000000?logo=vllm&logoColor=white)
![LangGraph](https://img.shields.io/badge/Orchestration-LangGraph-FF4F00?logo=langchain&logoColor=white)
![Gradio](https://img.shields.io/badge/UI-Gradio_6-FF7C00?logo=gradio&logoColor=white)
> **AMD Developer Hackathon 2026** Β· Powered by AMD Instinctβ„’ MI300X Β· ROCm 7.2
## 🌍 100% Open-Source: Democratizing Oncology
OncoAgent is proudly 100% open-source. We believe that life-saving clinical intelligence should not be locked behind proprietary APIs. Our solution is designed to:
- **Guarantee Patient Privacy:** Run locally on AMD MI300X hardware or private clouds, ensuring zero patient data leaves the hospital.
- **Foster Global Contribution:** Allow medical communities worldwide to easily audit, modify, and contribute to the RAG knowledge base.
OncoAgent is a state-of-the-art multi-agent clinical triage system designed to combat **unstructured data blindness** in primary care oncology. It leverages a tier-adaptive architecture featuring **Qwen 3.5-9B** (Speed Triage) and **Qwen 3.6-27B** (Deep Reasoning) models. Orchestrated via a sophisticated LangGraph state machine, it provides evidence-based oncological reasoning strictly grounded in NCCN/ESMO clinical guidelines, with built-in human-in-the-loop (HITL) safety gates and a Reflexion-based critic loop.
---
## πŸ—οΈ Architecture
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Router │──▢│Ingestion│──▢│Corrective│──▢│ Specialist │◀────│ Critic β”‚ β”‚ Formatterβ”‚
β”‚(Triage)β”‚ β”‚ (PHI) β”‚ β”‚ RAG β”‚ β”‚ (Qwen 9B/ β”‚ β”‚(Reflexion β”‚ β”‚(Output) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ 27B) │────▢│ Validation)β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β–²
β”‚ β”‚ β”‚ β”‚ β”‚ β”‚
β–Ό β–Ό β–Ό β–Ό β–Ό β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Fallback Node β”‚ β”‚ HITL Gate β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚(Acuity Chk)β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
**Key Components:**
| Module | Description |
|--------|-------------|
| `data_prep/` | Dataset builder: PMC-Patients/OncoCoT β†’ Strict JSONL (Llama 3 chat template) |
| `rag_engine/` | The "Brain": PyMuPDF extraction, Adaptive Semantic Chunking of NCCN/ESMO PDFs, & ChromaDB + PubMedBERT vectorization. |
| `agents/` | The "Reasoning": LangGraph multi-agent orchestration (Router β†’ Corrective RAG β†’ Specialist ↔ Critic β†’ HITL Gate). |
| `ui/` | The "Face": Gradio 6 UI with Glassmorphism for clinical note input, real-time source citations, and reasoning output. |
---
## 🧠 Dual-Tier Model Strategy (Qwen)
To maximize the compute capabilities of the **AMD MI300X**, OncoAgent implements a dynamic **Dual-Tier** routing strategy using the Qwen model family. **Both tiers have been fine-tuned on +200,000 real-world oncological cases covering all major cancer types** (derived from PMC-Patients and OncoCoT datasets) to ensure hyper-specialized medical reasoning:
- **Tier 1: Qwen 3.5-9B (Speed Triage):** A lightweight, extremely fast model used by the `Router` to assess initial complexity, perform simple triage, and handle low-risk queries.
- **Tier 2: Qwen 3.6-27B (Deep Reasoning):** The heavy-lifter. Activated for high-complexity clinical cases (e.g., metastasis, multi-mutations). It performs deep reasoning and entailment checks, avoiding confirmation bias through rigorous Reflexion loops.
---
## ⚑ Hardware Target
- **GPU:** AMD Instinctβ„’ MI300X (192GB HBM3)
- **Software Stack:** ROCm 7.2.x, PyTorch (HIP), vLLM with PagedAttention
- **Models:** `Qwen/Qwen3.5-9B` (Speed Triage) & `Qwen/Qwen3.6-27B-Instruct` (Deep Reasoning)
- **Precision:** QLoRA 4-bit NormalFloat4 via `bitsandbytes` (ROCm compatible)
---
## πŸš€ Quick Start
```bash
# 1. Clone and setup
git clone <repo-url>
cd OncoAgent
# 2. Install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# 3. Start Inference Server (vLLM on Docker)
# This spins up the Qwen models optimized for AMD MI300X via ROCm PagedAttention
docker run --device /dev/kfd --device /dev/dri -p 8000:8000 rocm/vllm:latest \
--model Qwen/Qwen3.6-27B-Instruct --tensor-parallel-size 1
# 4. Configure environment & Run UI
cp .env.example .env
# Set VLLM_API_BASE=http://localhost:8000/v1 in .env
python -m ui.app
```
---
## πŸ“ Project Structure
```
β”œβ”€β”€ docs/ # Documentation & research
β”‚ β”œβ”€β”€ research/ # Deep Research analysis documents
β”‚ β”œβ”€β”€ ADR/ # Architectural Decision Records
β”‚ β”œβ”€β”€ oncoagent_master_directive.md
β”‚ └── antigravity_rules.md
β”œβ”€β”€ data_prep/ # Dataset preparation (Fase 0)
β”œβ”€β”€ rag_engine/ # RAG ingestion & retrieval (Fase 0-3)
β”œβ”€β”€ agents/ # LangGraph orchestration (Fase 3)
β”œβ”€β”€ ui/ # Gradio frontend (Fase 4)
β”œβ”€β”€ tests/ # Unit & integration tests
β”œβ”€β”€ scripts/ # Utility scripts
β”œβ”€β”€ logs/ # Paper log & social media log
β”œβ”€β”€ requirements.txt # Pinned dependencies
└── Dockerfile # HF Spaces deployment
```
---
## 🩺 Safety Guarantees
- **Reflexion-based Critic Loop:** A dedicated safety node audits the Specialist's output against the RAG context (entailment verification). It forces the Specialist to regenerate its output if it detects ungrounded claims or invented dosages.
- **Human-In-The-Loop (HITL) Gate:** An acuity-based checkpoint that stops the pipeline for human clinician approval on high-risk cases (e.g., Stage IV + complex mutations).
- **Corrective RAG:** The system grades retrieved context relevance. If insufficient evidence is found, it safely falls back instead of guessing.
- **Zero-PHI:** Regex-based PII redaction before any processing
- **Reproducibility:** Fixed seeds (`torch.manual_seed(42)`) across all ML scripts
---
## πŸ“„ License
This project was built for the AMD Developer Hackathon 2026.
---
## πŸ‘₯ Team
Built with ❀️ and AMD Instinct MI300X.

Xet Storage Details

Size:
7.71 kB
Β·
Xet hash:
b57b0bcfe4a4d266ce9558a0ddc9977ccb4f801ec13955d79d5ea424645462a4

Xet efficiently stores files, intelligently splitting them into unique chunks and accelerating uploads and downloads. More info.