---
language:
  - ak
license: cc-by-nc-4.0
base_model: katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment
tags:
  - whisper
  - lora
  - asante-twi
  - akan
  - speech-recognition
  - peft
datasets:
  - mozilla-foundation/common_voice_11_0
  - michsethowusu/twi_multispeaker_audio_transcribed
---

Whisper Large v3 Turbo — Asante Twi LoRA Adapter (R14)

Fine-tuned LoRA adapter for Asante Twi automatic speech recognition, built on top of katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment.

Word error rate (WER): 17.5% on the LVP held-out evaluation set (pilot-ready threshold: below 22%).
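
For readers who want to run the same kind of check on their own transcripts, here is a minimal sketch of a corpus-level WER computation and the pilot-readiness comparison. It assumes plain whitespace tokenization and no text normalization. The scoring script and normalization behind the 17.5% figure are not described in this card, so the function names (`word_edits`, `corpus_wer`, `is_pilot_ready`) and the placeholder transcripts are illustrative only.

```python
from typing import Iterable, List, Tuple


def word_edits(ref_words: List[str], hyp_words: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions + deletions + insertions)."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1]


def corpus_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """Total word edits divided by total reference words over (reference, hypothesis) pairs."""
    total_edits = 0
    total_ref_words = 0
    for reference, hypothesis in pairs:
        ref_words = reference.split()
        total_edits += word_edits(ref_words, hypothesis.split())
        total_ref_words += len(ref_words)
    if total_ref_words == 0:
        raise ValueError("references contain no words")
    return total_edits / total_ref_words


def is_pilot_ready(wer: float, threshold: float = 0.22) -> bool:
    """Strict comparison against the pilot-ready threshold stated in this card."""
    return wer < threshold


if __name__ == "__main__":
    # Placeholder tokens, not real Twi transcripts.
    pairs = [("w1 w2 w3 w4", "w1 wX w3 w4"), ("w5 w6", "w5 w6")]
    wer = corpus_wer(pairs)
    print(f"WER = {wer:.3f}, pilot-ready: {is_pilot_ready(wer)}")
```

Corpus-level aggregation (summing edits before dividing) weights long utterances more than averaging per-utterance WER would; which of the two the reported number uses is an open assumption here.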

Training Data

| Dataset | Role | Notes |
|---|---|---|
| LVP real recordings (private) | Training + eval | Collected via Rootal Audio Annotation Platform @rootal.ai; available on request |
| LVP synthetic QA (private) | Training | TTS-generated Twi Q&A pairs |
| Common Voice Akan | Training | Mozilla CC0 |
| Financial Inclusion Speech Dataset (Ashesi) | Training (200 samples) | See citation below |
| michsethowusu/twi_multispeaker_audio_transcribed | Eval-only diagnostic | Excluded from training (transcription style mismatch) |

Training Configuration

Base model: katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment
LoRA: rank=32, alpha=64, target modules: q_proj, k_proj, v_proj, out_proj, fc1, fc2
Language: None (Twi is not among Whisper's language tokens, so no language prefix token is used)
Anti-hallucination: condition_on_prev_tokens=False, repetition_penalty=1.2
Quantization: 8-bit (BitsAndBytes)
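
The settings above could be expressed with PEFT and transformers roughly as follows. This is a sketch under assumptions, not the training script used for this adapter: it reads "q/k/v/out_proj" as the Hugging Face Whisper module names `q_proj`, `k_proj`, `v_proj`, `out_proj`, leaves every unlisted hyperparameter (dropout, bias, learning rate) at library defaults, and simply passes no `language` argument. The helper names `lora_scaling` and `build_peft_model` are hypothetical.

```python
BASE_MODEL_ID = "katrintomanek/whisper-large-v3-turbo_Akan_standardspeech_specaugment"

# LoRA hyperparameters as listed in this card; module names are an assumed
# expansion of "q/k/v/out_proj + fc1/fc2".
LORA_KWARGS = {
    "r": 32,
    "lora_alpha": 64,
    "target_modules": ["q_proj", "k_proj", "v_proj", "out_proj", "fc1", "fc2"],
}

# Decoding settings listed under "Anti-hallucination".
GENERATION_KWARGS = {
    "condition_on_prev_tokens": False,
    "repetition_penalty": 1.2,
}


def lora_scaling(kwargs=LORA_KWARGS):
    """Standard LoRA scales the low-rank update by alpha / r."""
    return kwargs["lora_alpha"] / kwargs["r"]


def build_peft_model():
    """Load the base model in 8-bit and wrap it with LoRA adapters.

    Heavy imports stay inside the function so the constants above can be
    inspected without transformers, peft, or bitsandbytes installed.
    """
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import BitsAndBytesConfig, WhisperForConditionalGeneration

    base = WhisperForConditionalGeneration.from_pretrained(
        BASE_MODEL_ID,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",  # requires accelerate
    )
    base = prepare_model_for_kbit_training(base)
    return get_peft_model(base, LoraConfig(**LORA_KWARGS))


if __name__ == "__main__":
    print(f"LoRA scaling factor: {lora_scaling()}")
```

Calling `build_peft_model()` downloads the base checkpoint and needs a CUDA GPU with bitsandbytes installed. At inference time the decoding settings would typically be passed as `model.generate(input_features=features, **GENERATION_KWARGS)`, where `features` is a hypothetical name for the processor output.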

Citation

If you use this adapter, please cite:

@misc{aguyatimothy2025asantetwi,
  author    = {Timothy Aguya, Akasiya},
  title     = {Whisper Large v3 Turbo — Asante Twi LoRA Adapter},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/rootabytes/whisper-large-v3-turbo-asante-twi-lvp}
}

@misc{financialinclusion2022,
  author    = {Asamoah Owusu, D. and Korsah, A. and Quartey, B. and Nwolley Jnr., S.
               and Sampah, D. and Adjepon-Yamoah, D. and Omane Boateng, L.},
  title     = {Financial Inclusion Speech Dataset},
  year      = {2022},
  publisher = {Ashesi University and Nokwary Technologies},
  url       = {https://github.com/Ashesi-Org/Financial-Inclusion-Speech-Dataset}
}

@inproceedings{ardila2020common,
  title     = {Common Voice: A Massively-Multilingual Speech Corpus},
  author    = {Ardila, Rosana and others},
  booktitle = {LREC},
  year      = {2020}
}

@article{radford2022robust,
  title   = {Robust Speech Recognition via Large-Scale Weak Supervision},
  author  = {Radford, Alec and others},
  journal = {arXiv:2212.04356},
  year    = {2022}
}

@article{hu2021lora,
  title   = {LoRA: Low-Rank Adaptation of Large Language Models},
  author  = {Hu, Edward J and others},
  journal = {arXiv:2106.09685},
  year    = {2021}
}