Whisper Medium Fine-tuned on Nepali (OpenSLR 54)

This model is a fine-tuned version of openai/whisper-medium on the OpenSLR 54 (Nepali Speech Corpus) dataset.

Model Details

  • Model: Whisper Medium (769M Parameters)
  • Dataset: ~154 Hours of Nepali Audio
  • Language: Nepali
  • Hardware: NVIDIA A100 80GB

Results

  • WER (word error rate): 14.04%
  • Loss: 0.0710
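
For readers unfamiliar with the metric, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` and the sample sentences are illustrative assumptions. This card does not state which evaluation split or text normalization produced the figure above, so this sketch is not guaranteed to reproduce it.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# Illustrative sentences only (not taken from the dataset):
# one substituted word out of three reference words.
reference = "म घर जान्छु"
hypothesis = "म घर जान्छ"
print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

In practice, a maintained package is usually preferable to a hand-rolled implementation; this version is only meant to show what the percentage measures.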

Usage

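from transformers import pipeline

# Load the fine-tuned checkpoint as a speech-recognition pipeline.
transcriber = pipeline(
    "automatic-speech-recognition",
    model="Dragneel/whisper-medium-nepali-openslr",
)

# The pipeline returns a dict; the transcription is stored under "text".
result = transcriber("path_to_audio.mp3")
print(result["text"])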
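
Whisper processes audio in 30-second windows, so longer recordings are usually transcribed in chunks. The sketch below shows one way to do that with the same pipeline, and also pins the decoding language and task. It assumes this checkpoint keeps the multilingual Whisper generation settings of its base model, which this card does not confirm. The helper name `transcribe_long`, the constant names, and the file name are hypothetical.

```python
MODEL_ID = "Dragneel/whisper-medium-nepali-openslr"

# Assumption: the checkpoint accepts Whisper's language/task generation options.
GENERATE_KWARGS = {"language": "nepali", "task": "transcribe"}

# Whisper's native window length in seconds.
CHUNK_LENGTH_S = 30


def transcribe_long(audio_path: str) -> str:
    """Transcribe a recording of arbitrary length by chunking it."""
    # Imported here so the module can be loaded without the heavy dependency.
    from transformers import pipeline

    transcriber = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = transcriber(
        audio_path,
        chunk_length_s=CHUNK_LENGTH_S,    # split long audio into 30 s chunks
        generate_kwargs=GENERATE_KWARGS,  # decode as Nepali, no translation
    )
    return result["text"]


if __name__ == "__main__":
    # Hypothetical file name; replace with a real recording.
    print(transcribe_long("path_to_long_audio.mp3"))
```

Decoding an mp3 from a file path relies on ffmpeg being installed on the system. If the language option is rejected for this checkpoint, dropping `generate_kwargs` falls back to the behaviour of the basic example above.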