metadata
language:
- ak
license: cc-by-nc-4.0
base_model: katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment
tags:
- whisper
- lora
- asante-twi
- akan
- speech-recognition
- peft
datasets:
- mozilla-foundation/common_voice_11_0
- michsethowusu/twi_multispeaker_audio_transcribed
Whisper Large v3 Turbo — Asante Twi LoRA Adapter (R14)
Fine-tuned LoRA adapter for Asante Twi automatic speech recognition, built on top of
katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment.
WER: 17.5% on LVP held-out eval set (Pilot-ready threshold: <22%)
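The card does not state how the WER figure was computed (text normalization, per-utterance versus corpus-level averaging). As a point of reference only, here is a minimal sketch of corpus-level WER, assuming plain whitespace tokenization and no normalization. The function names `word_edit_distance` and `corpus_wer` are illustrative and are not part of this adapter's evaluation code.

```python
from typing import Iterable, List, Tuple


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance to an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def corpus_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Total word edits divided by total reference words over all pairs."""
    total_edits = 0
    total_words = 0
    for reference, hypothesis in pairs:
        ref_words = reference.split()
        total_edits += word_edit_distance(ref_words, hypothesis.split())
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("reference transcripts contain no words")
    return total_edits / total_words


# Placeholder transcripts: one substitution across eight reference words.
pairs = [("w1 w2 w3 w4", "w1 xx w3 w4"), ("w5 w6 w7 w8", "w5 w6 w7 w8")]
print(f"WER: {corpus_wer(pairs):.1%}")
```

Under this definition, a score of 17.5% would mean roughly 17 to 18 word-level edits per 100 reference words. Different normalization choices (casing, punctuation, diacritics) can shift the number noticeably, so comparisons against other models should use the same scoring setup.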
Training Data
| Dataset | Role | Notes |
|---|---|---|
| LVP real recordings (private) | Training + eval | Collected via Rootal Audio Annotation Platform @rootal.ai; available on request |
| LVP synthetic QA (private) | Training | TTS-generated Twi Q&A pairs |
| Common Voice Akan | Training | Mozilla CC0 |
| Financial Inclusion Speech Dataset (Ashesi) | Training (200 samples) | See citation below |
| michsethowusu/twi_multispeaker_audio_transcribed | Eval-only diagnostic | Excluded from training — transcription style mismatch |
Training Configuration
- Base model: `katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment`
- LoRA: rank=32, alpha=64, targets: q/k/v/out_proj + fc1/fc2
- Language: `None` (Twi not in Whisper vocab; no language prefix token)
- Anti-hallucination: `condition_on_prev_tokens=False`, `repetition_penalty=1.2`
- Quantization: 8-bit (BitsAndBytes)
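The settings above could map onto `peft` and `transformers` roughly as follows. This is a sketch under stated assumptions, not the author's training or inference script: the expansion of "q/k/v/out_proj" into `q_proj`, `k_proj`, `v_proj`, `out_proj` is one reading of the card, the adapter repo ID is taken from the citation URL, the processor is assumed to be loadable from the base model repo, and any hyperparameter not listed in the card (dropout, bias, learning rate) is left at library defaults.

```python
# Values stated in the card; everything else here is an assumption.
BASE_MODEL = "katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment"
ADAPTER = "rootabytes/whisper-large-v3-turbo-asante-twi-lvp"  # assumed from the citation URL

LORA_KWARGS = {
    "r": 32,
    "lora_alpha": 64,
    # Assumed expansion of "q/k/v/out_proj + fc1/fc2" to Whisper module names.
    "target_modules": ["q_proj", "k_proj", "v_proj", "out_proj", "fc1", "fc2"],
}

# Decoding options listed under "Anti-hallucination".
GENERATE_KWARGS = {
    "condition_on_prev_tokens": False,
    "repetition_penalty": 1.2,
}


def build_lora_config():
    """Build a LoraConfig with the rank, alpha, and targets from the card."""
    from peft import LoraConfig  # imported lazily: optional heavy dependency

    return LoraConfig(**LORA_KWARGS)


def load_for_inference():
    """Load the base model in 8-bit and attach the adapter.

    Requires a CUDA GPU with bitsandbytes installed.
    """
    from peft import PeftModel
    from transformers import (
        BitsAndBytesConfig,
        WhisperForConditionalGeneration,
        WhisperProcessor,
    )

    processor = WhisperProcessor.from_pretrained(BASE_MODEL)
    base = WhisperForConditionalGeneration.from_pretrained(
        BASE_MODEL,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",
    )
    model = PeftModel.from_pretrained(base, ADAPTER)
    model.eval()
    return processor, model
```

With a loaded model, the decoding options would be passed as `model.generate(input_features=..., **GENERATE_KWARGS)`. The card specifies no language prefix token; how that was configured is not stated, so it is left out of the sketch.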
Citation
If you use this adapter, please cite:
@misc{aguyatimothy2025asantetwi,
author = {Timothy Aguya, Akasiya},
title = {Whisper Large v3 Turbo — Asante Twi LoRA Adapter},
year = {2026},
publisher = {HuggingFace},
url = {https://huggingface.co/rootabytes/whisper-large-v3-turbo-asante-twi-lvp}
}
@misc{financialinclusion2022,
author = {Asamoah Owusu, D. and Korsah, A. and Quartey, B. and Nwolley Jnr., S.
and Sampah, D. and Adjepon-Yamoah, D. and Omane Boateng, L.},
title = {Financial Inclusion Speech Dataset},
year = {2022},
publisher = {Ashesi University and Nokwary Technologies},
url = {https://github.com/Ashesi-Org/Financial-Inclusion-Speech-Dataset}
}
@inproceedings{ardila2020common,
title = {Common Voice: A Massively-Multilingual Speech Corpus},
author = {Ardila, Rosana and others},
booktitle = {LREC},
year = {2020}
}
@article{radford2022robust,
title = {Robust Speech Recognition via Large-Scale Weak Supervision},
author = {Radford, Alec and others},
journal = {arXiv:2212.04356},
year = {2022}
}
@article{hu2021lora,
title = {LoRA: Low-Rank Adaptation of Large Language Models},
author = {Hu, Edward J and others},
journal = {arXiv:2106.09685},
year = {2021}
}