Breeze-ASR-25 — MLX bfloat16
This is an MLX-format conversion of MediaTek-Research/Breeze-ASR-25 in
bfloat16, suitable for Apple Silicon inference via mlx-audio and
downstream tools like oMLX.
Breeze ASR 25 is fine-tuned from Whisper-large-v2 with a focus on Taiwanese Mandarin and Mandarin–English code-switching, including both intra-sentential and inter-sentential switches. It also has improved time alignment for captioning workflows.
Conversion
Produced from the upstream PyTorch checkpoint with mlx-audio 0.4.3:

```shell
python -m mlx_audio.convert \
  --hf-path MediaTek-Research/Breeze-ASR-25 \
  --mlx-path ./Breeze-ASR-25-mlx-bf16 \
  --dtype bfloat16 \
  --model-domain stt
```
The output directory contains the full HF processor surface (`tokenizer.json`, `preprocessor_config.json`, `generation_config.json`, etc.), unlike older legacy MLX Whisper conversions that shipped only `config.json` + `weights.safetensors` and could not be loaded by modern `WhisperProcessor`-based stacks. If you have a legacy directory to upgrade in place, see mlx-whisper-legacy-fixup.
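A quick way to tell whether a local directory is a legacy conversion or carries the full processor surface is to check for the expected sidecar files. This is an illustrative sketch; the exact file list (`missing_processor_files`, `REQUIRED`) is an assumption based on the description above, not an exhaustive spec:

```python
from pathlib import Path

# Files a modern WhisperProcessor-based stack expects alongside the weights.
# Illustrative list; adjust to your stack's actual requirements.
REQUIRED = [
    "config.json",
    "weights.safetensors",
    "tokenizer.json",
    "preprocessor_config.json",
    "generation_config.json",
]

def missing_processor_files(model_dir: str) -> list[str]:
    """Return the expected files absent from model_dir (empty list = full surface)."""
    root = Path(model_dir)
    return [name for name in REQUIRED if not (root / name).exists()]
```

A legacy directory containing only `config.json` and `weights.safetensors` will report the tokenizer and processor configs as missing.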
Usage
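For either path below, first fetch the converted weights from the Hub with the standard Hugging Face CLI (the `hf_xet` extra enables faster Xet-backed downloads):

```shell
# Install the Hub client with Xet download support
pip install "huggingface_hub[hf_xet]"

# Download the model into ./Breeze-ASR-25-mlx-bf16
huggingface-cli download --local-dir Breeze-ASR-25-mlx-bf16 BRlin/Breeze-ASR-25-mlx-bf16
```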
Via mlx-audio
```python
from mlx_audio.stt.generate import generate_transcription

result = generate_transcription(
    model="BRlin/Breeze-ASR-25-mlx-bf16",
    audio="path/to/audio.wav",
    language="zh",
)
print(result.text)
```
Via oMLX HTTP endpoint
```shell
curl -X POST http://127.0.0.1:8868/v1/audio/transcriptions \
  -H "Authorization: Bearer $OMLX_KEY" \
  -F "file=@path/to/audio.m4a" \
  -F "model=Breeze-ASR-25-mlx-bf16" \
  -F "language=zh"
```
Specs
| Field | Value |
|---|---|
| Architecture | Whisper large-v2 (fine-tuned) |
| Parameters | ~1.55 B |
| Precision | bfloat16 |
| n_mels | 80 |
| Vocab size | 51865 |
| File size | ~2.9 GB |
| Sample rate | 16 kHz mono |
| Best for | Traditional Chinese (Taiwanese Mandarin) + zh/en code-switching |
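Since the model expects 16 kHz mono input, a minimal stdlib pre-flight check can flag files that need conversion first. This is a sketch using Python's `wave` module (the helper name `needs_resample` is hypothetical, and `wave` only reads WAV files, not m4a/mp3):

```python
import wave

def needs_resample(path: str, target_rate: int = 16000) -> bool:
    """Return True if the WAV file is not already mono at target_rate Hz."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate() != target_rate or wf.getnchannels() != 1
```

Files that fail the check (or non-WAV sources) can be converted with e.g. `ffmpeg -i in.m4a -ar 16000 -ac 1 out.wav`.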
License & Attribution
Inherits Apache 2.0 from the upstream Breeze-ASR-25 model. All credit for the underlying weights belongs to MediaTek Research — this repository only provides the MLX format conversion for convenience on Apple Silicon.
If you use this model, please cite the upstream:
- Breeze-ASR-25 paper: https://arxiv.org/pdf/2506.11130
- GitHub: https://github.com/mtkresearch/Breeze-ASR-25
- OpenAI Whisper: https://github.com/openai/whisper
Maintenance
This is a one-shot conversion (2026-04) and is not actively maintained beyond bug fixes. If the upstream Breeze-ASR-25 model is updated, re-run the conversion command above or open an issue.