Instructions to use mitchelldehaven/whisper-large-v2-ru with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use mitchelldehaven/whisper-large-v2-ru with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="mitchelldehaven/whisper-large-v2-ru")
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

processor = AutoProcessor.from_pretrained("mitchelldehaven/whisper-large-v2-ru")
model = AutoModelForSpeechSeq2Seq.from_pretrained("mitchelldehaven/whisper-large-v2-ru")
```

- Notebooks
- Google Colab
- Kaggle
No punctuation
Yes, this is expected. This model was trained on a Russian dataset I had access to, which had been preprocessed with a particular focus in mind. As a result, if I recall correctly, all punctuation was removed and all words were lowercased, so the model's output has neither. I'm not sure about the artifacts within words, however.
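For anyone wondering what that kind of preprocessing looks like, here is a minimal sketch. It is an assumption on my part and not the actual script used on the training data: the function name `normalize_transcript` is hypothetical, and the real pipeline may have treated hyphens, digits, or "ё" differently.

```python
import re


def normalize_transcript(text: str) -> str:
    """Lowercase the text and strip punctuation (hypothetical helper)."""
    text = text.lower()
    # \w is Unicode-aware for str patterns in Python 3, so Cyrillic letters
    # are kept; everything that is not a word character or whitespace
    # becomes a space.
    text = re.sub(r"[^\w\s]", " ", text)
    # The underscore counts as a word character, so drop it explicitly.
    text = text.replace("_", " ")
    # Collapse runs of whitespace left behind by the removed punctuation.
    return " ".join(text.split())


print(normalize_transcript("Привет, как дела? Всё хорошо!"))
# prints: привет как дела всё хорошо
```

Note that this sketch also splits hyphenated words ("кто-то" becomes "кто то"), which may or may not match what the dataset did.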
effort - 🏆
result - 💩
So original whisper is just better lol..
If you need case and punctuation, then yes, you should use the original v2 model or the new v3 model.
In uncased, unpunctuated contexts, this model will likely have a lower WER than the original v2 model, particularly in noisy environments. I'm unsure about the v3 model, as I haven't tested it on Russian, but I assume v3 would be better, since it improved substantially on non-English languages.
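If you want to check the WER claim on your own audio, the comparison is only fair when both the reference and each model's output are stripped of case and punctuation first. Below is a rough sketch of such a comparison using word-level edit distance. The helper names `_words` and `word_error_rate` are hypothetical, and the normalization is an assumption, not necessarily what was used to evaluate this model.

```python
import re


def _words(text: str) -> list[str]:
    # Lowercase, replace punctuation with spaces, split on whitespace.
    return re.sub(r"[^\w\s]", " ", text.lower()).split()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = _words(reference), _words(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only one previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


# Punctuation and case differences alone do not count as errors here.
print(word_error_rate("Привет, как дела?", "привет как дела"))
```

Run this over the transcripts from each model against the same references and average the results (or, more commonly, sum the edit distances and divide by the total number of reference words).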
Can you fine-tune version 3 for Russian?
Unfortunately, I cannot. I no longer have access to the compute resources I used for this.
