Mega-ASR-6bit

6-bit quantized robust-merged variant of Mega-ASR, in MLX format, for mlx-audio.

No router: the robust path is always on. The Mega-ASR robustness LoRA is merged into the Qwen3-ASR-1.7B base and the merged weights are then quantized, so the per-utterance clean/degraded router is not included (fp32 LoRA deltas cannot be added to already-quantized weights). This model always runs the robust path.
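The merge step described above can be illustrated with a small pure-Python sketch. It is not the conversion script used for this model: the matrix shapes, the values, and the conventional `alpha / rank` LoRA scaling are assumptions chosen for illustration. It only shows that folding the adapter into the base weight gives the same output as running the base and adapter paths separately, which is why the merged model no longer has anything to toggle.

```python
# Toy LoRA merge in plain Python (no MLX). All shapes and values are illustrative.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]

def merge_lora(w, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * B @ A (assumed standard LoRA scaling)."""
    return add(w, scale(matmul(lora_b, lora_a), alpha / rank))

# Hypothetical 2x3 layer with a rank-1 adapter.
W = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.75]]  # base weight, shape (out=2, in=3)
A = [[1.0, 2.0, -1.0]]                      # LoRA down-projection, shape (rank=1, in=3)
B = [[0.5], [-0.25]]                        # LoRA up-projection, shape (out=2, rank=1)
ALPHA, RANK = 2.0, 1

x = [[1.0], [2.0], [3.0]]                   # input column vector, shape (in=3, 1)

# Merged path: one matmul with the folded weight.
W_merged = merge_lora(W, A, B, ALPHA, RANK)
y_merged = matmul(W_merged, x)

# Two-path form: base output plus scaled adapter output.
y_two_path = add(matmul(W, x), scale(matmul(B, matmul(A, x)), ALPHA / RANK))

print(y_merged, y_two_path)
```

Once `W_merged` is quantized, only the quantized merged tensor is stored, so the base-only path is no longer available.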

For the full dynamic Mega-ASR — clean speech on the base path, noisy speech on the LoRA path — use mlx-community/Mega-ASR-bf16.

Use this 6-bit variant for noisy-only / memory-constrained deployments: ~2 GB and ~4× faster than the dynamic model (no per-clip LoRA toggling).

Use with mlx-audio

pip install mlx-audio

from mlx_audio.stt import load

model = load("mlx-community/Mega-ASR-6bit")
result = model.generate("audio.wav", language="en")
print(result.text)
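
For more than one file, the same call can be wrapped in a small helper. This is a sketch that relies only on the `generate(path, language=...)` call and the `.text` attribute shown above; the helper name `transcribe_all` is hypothetical and not part of mlx-audio.

```python
def transcribe_all(model, paths, language="en"):
    """Transcribe each audio file and return a {path: text} dict.

    `model` is any object exposing generate(path, language=...) whose
    result has a .text attribute, as in the snippet above.
    """
    transcripts = {}
    for path in paths:
        result = model.generate(str(path), language=language)
        transcripts[str(path)] = result.text.strip()
    return transcripts
```

With `model` loaded as above, `transcribe_all(model, ["a.wav", "b.wav"])` returns the stripped transcript for each path.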

Quality

6-bit is effectively lossless versus bf16 on noisy speech. WER on a NOIZEUS subset (merged-robust path):

Precision	Overall WER	Size
bf16	7.95	4.08 GB
6-bit (this model)	7.89	2.04 GB
8-bit	8.06	2.47 GB

(4-bit degrades to 10.78 WER and is not published.)
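
To put these figures on a common scale, the short script below computes each variant's WER change relative to bf16 and, where a size is reported, its size as a fraction of bf16. The numbers are copied from the table and the 4-bit note; the script is only a convenience for reading them, not part of the evaluation itself.

```python
# Figures copied from the table above; 4-bit WER comes from the note below it.
RESULTS = {
    "bf16": {"wer": 7.95, "size_gb": 4.08},
    "6-bit": {"wer": 7.89, "size_gb": 2.04},
    "8-bit": {"wer": 8.06, "size_gb": 2.47},
    "4-bit": {"wer": 10.78, "size_gb": None},  # size not reported
}

def relative_wer_change(wer, baseline_wer):
    """Percent change in WER relative to the baseline (negative means better)."""
    return (wer - baseline_wer) / baseline_wer * 100.0

baseline = RESULTS["bf16"]
summary = {}
for name, row in RESULTS.items():
    if name == "bf16":
        continue
    size_ratio = None
    if row["size_gb"] is not None:
        size_ratio = row["size_gb"] / baseline["size_gb"]
    summary[name] = {
        "wer_change_pct": relative_wer_change(row["wer"], baseline["wer"]),
        "size_ratio": size_ratio,
    }

for name, stats in summary.items():
    ratio = "n/a" if stats["size_ratio"] is None else f"{stats['size_ratio']:.2f}x"
    print(f"{name}: WER {stats['wer_change_pct']:+.2f}% vs bf16, size {ratio}")
```

On these numbers the 6-bit and 8-bit variants stay within about 1.5% of the bf16 WER in relative terms, while 4-bit is roughly 36% worse.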

License & attribution

Apache-2.0. Built on zhifeixie/Mega-ASR (adapter + router) and Qwen/Qwen3-ASR-1.7B (base).
