A Piper voice model of the "Wheatley" character from the Portal 2 video game.
The character's voice actor is Stephen Merchant.
Trained for ~3 hours by fine-tuning an existing model, using Piper's piper_multilingual_training_notebook.ipynb. The training data was MP3 files exported from the original game, with transcripts generated by Whisper and then manually checked and edited. I removed audio samples that might confuse the training (e.g. those including sound effects or unintelligible noises).
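The filtering step described above could be sketched roughly as follows. This is not the script actually used for this model: it assumes the transcripts are already in a dict keyed by clip ID, that the target is an LJSpeech-style metadata file with one `id|text` line per clip, and that non-speech audio was annotated in brackets or parentheses. The names `build_metadata`, `NON_SPEECH`, and the sample clip IDs are all hypothetical.

```python
import re

# Assumption: non-speech audio is marked in the transcript with bracketed or
# parenthesised annotations such as "[laughs]" or "(static)".
NON_SPEECH = re.compile(r"[\[(][^\])]*[\])]")


def build_metadata(transcripts, excluded=()):
    """Return LJSpeech-style 'id|text' lines for the clips worth keeping.

    transcripts: dict mapping clip ID to its (manually checked) transcript.
    excluded: clip IDs rejected by hand, e.g. clips with sound effects.
    """
    lines = []
    for clip_id, text in sorted(transcripts.items()):
        text = " ".join(text.split())  # collapse stray whitespace
        if clip_id in excluded or not text or NON_SPEECH.search(text):
            continue  # skip rejected, empty, or annotated clips
        lines.append(f"{clip_id}|{text}")
    return lines


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the real training set.
    sample = {
        "clip_001": "Hello!   This is the part where I help you.",
        "clip_002": "[unintelligible]",
        "clip_003": "Right, then.",
    }
    print("\n".join(build_metadata(sample, excluded={"clip_003"})))
```

Writing the returned lines to a metadata file next to the converted audio would give a dataset in the layout assumed here; the audio conversion itself is not shown.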
I ran into several problems while creating this, and overcoming them could result in a better model. Specifically:
- The training would not work unless I set the voice and starting model to en-US. Since this is an en-GB voice (specifically West Country/Bristol, UK), en-US is not an ideal starting point.
- I could not get Google Colab to keep training for more than about 3 hours before it timed out or failed for some other reason.
- I could not get the training visualisation to work in TensorBoard. It just showed a single dot on a graph, so it was hard to see whether the model was converging, improving, or plateauing.
- I could not get the 'resume' training option to work on any model. If training stopped, I had to start again from the beginning (hence the ~3 hour maximum).
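For anyone retrying the resume step, one small piece that can be checked independently is which checkpoint file to resume from. The sketch below assumes checkpoint files follow PyTorch Lightning's default `epoch=N-step=M.ckpt` naming; whether the notebook saves them under that name is not confirmed here. The helper name `latest_checkpoint` is hypothetical, and this does not by itself fix the resume failure described above.

```python
import re

# Assumption: checkpoints are named like "epoch=12-step=3400.ckpt".
CKPT = re.compile(r"epoch=(\d+)-step=(\d+)\.ckpt$")


def latest_checkpoint(paths):
    """Return the path with the highest (epoch, step), or None if none match."""
    best_key, best_path = None, None
    for path in paths:
        match = CKPT.search(path)
        if not match:
            continue  # ignore files that are not checkpoints
        key = (int(match.group(1)), int(match.group(2)))
        if best_key is None or key > best_key:
            best_key, best_path = key, path
    return best_path
```

Comparing epochs as integers matters here: a plain string sort would rank `epoch=9` above `epoch=10`.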
But it is usable and recognisable, so not bad for a first try!