Breeze-ASR-25 — MLX bfloat16
This is an MLX-format conversion of MediaTek-Research/Breeze-ASR-25 in
bfloat16, suitable for Apple Silicon inference via mlx-audio and
downstream tools like oMLX.
Breeze ASR 25 is fine-tuned from Whisper-large-v2 with a focus on Taiwanese Mandarin and Mandarin–English code-switching, including both intra-sentential and inter-sentential switches. It also has improved time alignment for captioning workflows.
Conversion
Produced from the upstream PyTorch checkpoint with mlx-audio 0.4.3:

```shell
python -m mlx_audio.convert \
  --hf-path MediaTek-Research/Breeze-ASR-25 \
  --mlx-path ./Breeze-ASR-25-mlx-bf16 \
  --dtype bfloat16 \
  --model-domain stt
```
The output directory contains the full HF processor surface (`tokenizer.json`, `preprocessor_config.json`, `generation_config.json`, etc.), unlike older legacy MLX Whisper conversions that shipped only `config.json` + `weights.safetensors` and could not be loaded by modern `WhisperProcessor`-based stacks. If you have a legacy directory to upgrade in place, see mlx-whisper-legacy-fixup.
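A quick way to tell whether a local directory is a legacy conversion or carries the full processor surface is to check for the expected sidecar files. This is an illustrative sketch; the exact file list (`missing_processor_files`, `REQUIRED`) is an assumption based on the description above, not an exhaustive spec:

```python
from pathlib import Path

# Files a modern WhisperProcessor-based stack expects alongside the weights.
# Illustrative list; adjust to your stack's actual requirements.
REQUIRED = [
    "config.json",
    "weights.safetensors",
    "tokenizer.json",
    "preprocessor_config.json",
    "generation_config.json",
]

def missing_processor_files(model_dir: str) -> list[str]:
    """Return the expected files absent from model_dir (empty list = full surface)."""
    root = Path(model_dir)
    return [name for name in REQUIRED if not (root / name).exists()]
```

A legacy directory containing only `config.json` and `weights.safetensors` will report the tokenizer and processor configs as missing.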
Usage
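For either path below, first fetch the converted weights from the Hub with the standard Hugging Face CLI (the `hf_xet` extra enables faster Xet-backed downloads):

```shell
# Install the Hub client with Xet download support
pip install "huggingface_hub[hf_xet]"

# Download the model into ./Breeze-ASR-25-mlx-bf16
huggingface-cli download --local-dir Breeze-ASR-25-mlx-bf16 BRlin/Breeze-ASR-25-mlx-bf16
```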
Via mlx-audio
```python
from mlx_audio.stt.generate import generate_transcription

result = generate_transcription(
    model="BRlin/Breeze-ASR-25-mlx-bf16",
    audio="path/to/audio.wav",
    language="zh",
)
print(result.text)
```
Via oMLX HTTP endpoint
```shell
curl -X POST http://127.0.0.1:8868/v1/audio/transcriptions \
  -H "Authorization: Bearer $OMLX_KEY" \
  -F "file=@path/to/audio.m4a" \
  -F "model=Breeze-ASR-25-mlx-bf16" \
  -F "language=zh"
```
Specs
| Field | Value |
|---|---|
| Architecture | Whisper large-v2 (fine-tuned) |
| Parameters | ~1.55 B |
| Precision | bfloat16 |
| n_mels | 80 |
| Vocab size | 51865 |
| File size | ~2.9 GB |
| Sample rate | 16 kHz mono |
| Best for | Traditional Chinese (Taiwanese Mandarin) + zh/en code-switching |
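Since the model expects 16 kHz mono input, a minimal stdlib pre-flight check can flag files that need conversion first. This is a sketch using Python's `wave` module (the helper name `needs_resample` is hypothetical, and `wave` only reads WAV files, not m4a/mp3):

```python
import wave

def needs_resample(path: str, target_rate: int = 16000) -> bool:
    """Return True if the WAV file is not already mono at target_rate Hz."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate() != target_rate or wf.getnchannels() != 1
```

Files that fail the check (or non-WAV sources) can be converted with e.g. `ffmpeg -i in.m4a -ar 16000 -ac 1 out.wav`.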
License & Attribution
Inherits Apache 2.0 from the upstream Breeze-ASR-25 model. All credit for the underlying weights belongs to MediaTek Research — this repository only provides the MLX format conversion for convenience on Apple Silicon.
If you use this model, please cite the upstream:
- Breeze-ASR-25 paper: https://arxiv.org/pdf/2506.11130
- GitHub: https://github.com/mtkresearch/Breeze-ASR-25
- OpenAI Whisper: https://github.com/openai/whisper
Maintenance
This is a one-shot conversion (2026-04) and is not actively maintained beyond bug fixes. If the upstream Breeze-ASR-25 model is updated, re-run the conversion command above or open an issue.