Mega-ASR-6bit

6-bit quantized robust-merged variant of Mega-ASR, in MLX format, for mlx-audio.

No router: always-on robust. The Mega-ASR robustness LoRA is merged into the Qwen3-ASR-1.7B base before quantization, so the per-utterance clean/degraded router is not present (fp32 LoRA deltas cannot be added to already-quantized weights). This model always runs the robust path.
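To make the merge-then-quantize order concrete, here is a minimal pure-Python sketch. It assumes the standard LoRA merge, W + scale * (B @ A), and uses a simplified per-row affine quantizer as a stand-in for whatever scheme the actual conversion used. The matrices, `scale`, and all function names are hypothetical and chosen only for illustration; this is not the Mega-ASR or MLX conversion code.

```python
# Illustrative only: tiny matrices, a generic LoRA merge, and a simplified
# per-row affine quantizer. Not the actual Mega-ASR or MLX conversion code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(w, lora_b, lora_a, scale):
    """Return W + scale * (B @ A), the standard LoRA merge."""
    delta = matmul(lora_b, lora_a)
    return [[wv + scale * dv for wv, dv in zip(wr, dr)] for wr, dr in zip(w, delta)]

def quantize_row(row, bits):
    """Affine-quantize one row to integer codes plus (step, offset)."""
    levels = (1 << bits) - 1
    lo, hi = min(row), max(row)
    step = (hi - lo) / levels or 1.0  # fall back to 1.0 for a constant row
    codes = [round((v - lo) / step) for v in row]
    return codes, step, lo

def dequantize_row(codes, step, lo):
    """Map integer codes back to approximate float values."""
    return [c * step + lo for c in codes]

# Hypothetical 2x4 base weight with a rank-1 adapter.
base_w = [[0.5, -0.25, 0.125, 1.0], [-0.75, 0.3, 0.0, 0.6]]
lora_b = [[1.0], [2.0]]
lora_a = [[0.1, -0.2, 0.3, 0.05]]

# Step 1: merge the adapter while everything is still floating point.
merged = merge_lora(base_w, lora_b, lora_a, scale=0.5)

# Step 2: quantize the merged weights to 6-bit codes.
quantized = [quantize_row(row, bits=6) for row in merged]
restored = [dequantize_row(*q) for q in quantized]

max_err = max(abs(m - r) for mr, rr in zip(merged, restored) for m, r in zip(mr, rr))
print(f"max round-trip error after 6-bit quantization: {max_err:.5f}")
```

After step 2 each row is stored as integer codes on its own grid, so a floating-point adapter delta has nothing to be added to without dequantizing first. That is why, in this sketch as in the model, the adapter is folded in once and cannot be switched per utterance.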

For the full dynamic Mega-ASR — clean speech on the base path, noisy speech on the LoRA path — use mlx-community/Mega-ASR-bf16.

Use this 6-bit variant for noisy-only / memory-constrained deployments: ~2 GB and ~4× faster than the dynamic model (no per-clip LoRA toggling).

Use with mlx-audio

pip install mlx-audio

from mlx_audio.stt import load

model = load("mlx-community/Mega-ASR-6bit")
result = model.generate("audio.wav", language="en")
print(result.text)
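For more than one file, a small helper can loop over paths. The sketch below assumes only what the snippet above shows: that the model object has `generate(path, language=...)` and that the result exposes `.text`. The helper name `transcribe_files` is hypothetical and not part of mlx-audio.

```python
def transcribe_files(model, paths, language="en"):
    """Transcribe each audio path and return a {path: text} mapping.

    `model` is expected to behave like the object returned by
    mlx_audio.stt.load above: generate(path, language=...) -> result with .text.
    """
    transcripts = {}
    for path in paths:
        result = model.generate(path, language=language)
        transcripts[path] = result.text.strip()
    return transcripts
```

With the model loaded as above, `transcribe_files(model, ["a.wav", "b.wav"])` would return one transcript per file. Files are processed one at a time; whether mlx-audio offers a batched interface is not covered here.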

Quality

6-bit is effectively lossless versus bf16 on noisy speech. WER on a NOIZEUS subset (merged-robust path):

| Precision | Overall WER | Size |
|---|---|---|
| bf16 | 7.95 | 4.08 GB |
| 6-bit (this model) | 7.89 | 2.04 GB |
| 8-bit | 8.06 | 2.47 GB |

(4-bit degrades to 10.78 WER and is not published.)
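For readers who want to score their own transcripts, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below implements that standard definition. It assumes whitespace tokenization and applies no text normalization; the scoring setup behind the numbers above is not described here, so results may not be directly comparable.

```python
def word_error_rate(reference, hypothesis):
    """Return word-level edit distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("the cat sat on the mat", "the cat sit on mat")` counts one substitution and one deletion over six reference words, giving 2/6. Multiply by 100 to express it as a percentage.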

License & attribution

Apache-2.0. Built on zhifeixie/Mega-ASR (adapter + router) and Qwen/Qwen3-ASR-1.7B (base).
