Whisper Malayalam (Fine-tuned on Common Voice 11.0)

This is a Whisper model fine-tuned for Malayalam Automatic Speech Recognition (ASR) on the train and validation splits of the Common Voice 11.0 Malayalam dataset. It transcribes Malayalam speech into text.

Model Details

Model Description

  • Model Type: Whisper (fine-tuned)
  • Language: Malayalam (ml)
  • Base Model: OpenAI Whisper (small checkpoint, per the repository name)
  • Dataset Used: mozilla-foundation/common_voice_11_0
  • Training Splits: train + validation
  • Task: Automatic Speech Recognition (ASR)
  • License: Apache 2.0 (inherited from Whisper)

Model Sources

  • Hugging Face Repository: Jithjacob123/whisper-small-Malayalam
  • Paper [Optional]: [More Information Needed]
  • Demo [Optional]: [More Information Needed]

Usage

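Load the model with the Hugging Face transformers library to transcribe Malayalam speech into text. Whisper expects 16 kHz mono audio, so resample and downmix other recordings before preprocessing.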

Example Usage

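from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch
import torchaudio

model_name = "Jithjacob123/whisper-small-Malayalam"
device = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to CPU without a GPU
processor = WhisperProcessor.from_pretrained(model_name)
model = WhisperForConditionalGeneration.from_pretrained(model_name).to(device)

# Load an audio file; waveform has shape (channels, samples)
waveform, sample_rate = torchaudio.load("sample_audio.wav")

# Whisper expects 16 kHz mono audio: downmix, then resample if needed
waveform = waveform.mean(dim=0)
if sample_rate != 16000:
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16000)

# Preprocess audio into input features on the same device as the model
input_features = processor(
    waveform.numpy(), sampling_rate=16000, return_tensors="pt"
).input_features.to(device)

# Generate transcription
with torch.no_grad():
    predicted_ids = model.generate(input_features)

# Decode output (batch_decode returns one string per input)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print(transcription[0])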
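
By default, the Whisper processor pads or truncates each input to a 30-second window, so the example above transcribes only the first 30 seconds of a longer recording. The following is a minimal sketch of one way to compute split points for longer audio, assuming 16 kHz mono samples. The helper name `chunk_bounds` and its parameters are hypothetical and are not part of transformers.

```python
def chunk_bounds(num_samples, sample_rate=16000, chunk_seconds=30.0):
    """Return (start, end) sample indices that cover the audio in fixed windows.

    Hypothetical helper for illustration; the last window may be shorter.
    """
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len < 1:
        raise ValueError("sample_rate * chunk_seconds must be at least 1 sample")
    return [
        (start, min(start + chunk_len, num_samples))
        for start in range(0, num_samples, chunk_len)
    ]


# A 70-second recording at 16 kHz has 1,120,000 samples
print(chunk_bounds(1_120_000))
# → [(0, 480000), (480000, 960000), (960000, 1120000)]
```

Each slice waveform[start:end] can then be passed through the processor and model.generate as in the example above, and the decoded strings joined in order. Hard cuts can split a word across two windows, so overlapping windows may give cleaner results.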