Qwen3 EAGLE3 — Weighted Loss Variants
Collection: Qwen3-8B draft models for the CMU 11-711 Course Project (7 items).
Our method: the per-step training loss is weighted by a Process Reward Model (PRM) score, w_s = 1 + γ · PRM_s, with γ = 1.0.
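The weighting above can be sketched in a few lines. This is a minimal illustration, not the project's training code: it assumes each reasoning step s has a PRM score in [0, 1] and that per-step losses are aggregated by a weighted mean (the actual aggregation lives in the project repo; `step_weights` and `weighted_loss` are hypothetical helper names).

```python
def step_weights(prm_scores, gamma=1.0):
    """w_s = 1 + gamma * PRM_s for each step (gamma = 1.0 in this variant)."""
    return [1.0 + gamma * p for p in prm_scores]

def weighted_loss(step_losses, prm_scores, gamma=1.0):
    """Weighted mean of per-step losses; high-PRM steps count more."""
    w = step_weights(prm_scores, gamma)
    return sum(wi * li for wi, li in zip(w, step_losses)) / sum(w)

# e.g. PRM scores [0.9, 0.1] give weights of roughly [1.9, 1.1],
# so the loss on the well-scored step dominates the average.
```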
Part of a course project evaluating per-step weighted loss functions for training EAGLE3 draft models. Full pipeline and source: https://github.com/XLOverflow/anlp_course_project
- Target model: Qwen/Qwen3-8B
- Draft architecture: AngelSlim/Qwen3-8B_eagle3
- Training data: see scripts/data/ in project repo
- PRM scorer: Qwen/Qwen2.5-Math-PRM-7B (see scripts/data/prm_score.py)
- Baseline checkpoint: baseline-uniform/epoch_4_step_82000

Results:

| Dataset | τ (accept. length) | Speedup | Accuracy |
|---|---|---|---|
| GSM8K | 7.354 | 4.473× | 95.38% |
| MATH500 | 7.319 | 4.612× | 94.40% |
Baselines for reference: Vanilla ≈ 1× speedup, EAGLE-orig ≈ 2× speedup.
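As a rough sanity check on how acceptance length relates to speedup, one can use a simplified speculative-decoding cost model (an illustration only, not the project's measurement methodology): if each target forward pass accepts τ tokens on average and drafting adds a relative cost c per accepted token, then speedup ≈ τ / (1 + τ·c). The helper below solves for the c implied by a measured (τ, speedup) pair; `implied_draft_overhead` is a hypothetical name.

```python
def implied_draft_overhead(tau, speedup):
    """Solve speedup = tau / (1 + tau * c) for c, the relative drafting
    cost per accepted token (simplified cost model, illustration only)."""
    return (tau / speedup - 1.0) / tau

# GSM8K row: tau = 7.354, speedup = 4.473x implies a drafting overhead
# of a few percent per accepted token under this model.
c_gsm8k = implied_draft_overhead(7.354, 4.473)
```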
Files:
- model.safetensors — draft model weights (~763 MB)
- config.json — model config

This checkpoint corresponds to outputs/eagle3-prm-g1.0/epoch_0_step_17026 in the original training output. Optimizer state (~3 GB) is not uploaded; to continue training, use the project repo's training scripts and retrain from scratch.
```python
from huggingface_hub import snapshot_download

draft_path = snapshot_download(repo_id="XLOverflow/qwen3-eagle3-prm-g1.0")
# Then load with EAGLE's EaModel — see scripts/eval/eval_combined.py in the project repo.
```
Base model: AngelSlim/Qwen3-8B_eagle3