A Piper voice model of the "Wheatley" character from the video game Portal 2.

The character's actual voice actor is Stephen Merchant.

Trained for about 3 hours by fine-tuning an existing model, using Piper's piper_multilingual_training_notebook.ipynb. The training data was MP3 files exported from the original game, with transcripts generated by Whisper and then manually checked and edited. I removed audio samples that might confuse the training (e.g. those including sound effects or unintelligible noises).
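The transcript clean-up and sample filtering described above could be sketched roughly as follows. This is a minimal illustration, not the script actually used for this model: it assumes the training notebook expects an LJSpeech-style `metadata.csv` with one `id|text` row per clip, and the function names, clip ids, and texts are all hypothetical.

```python
from pathlib import Path


def build_metadata(transcripts, skip_ids=()):
    """Return `id|text` rows for the clips that should be kept.

    transcripts: dict mapping a clip id (file name without extension)
                 to its manually checked transcript.
    skip_ids:    ids of clips rejected by hand, e.g. those containing
                 sound effects or unintelligible noises.
    """
    skip = set(skip_ids)
    rows = []
    for clip_id, text in sorted(transcripts.items()):
        if clip_id in skip:
            continue
        # "|" is the column separator, so it must not appear in the text.
        # Splitting and re-joining also collapses runs of whitespace.
        cleaned = " ".join(text.replace("|", " ").split())
        if not cleaned:
            continue  # nothing left to train on for this clip
        rows.append(f"{clip_id}|{cleaned}")
    return rows


def write_metadata(rows, out_path):
    """Write the rows to a UTF-8 text file, one clip per line."""
    Path(out_path).write_text("\n".join(rows) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical clip ids and texts, for illustration only.
    example = {
        "clip_001": "  Hello   there. ",
        "clip_002": "[alarm blaring]",
    }
    for row in build_metadata(example, skip_ids=["clip_002"]):
        print(row)
```

Keeping the reject list separate from the transcripts means a clip can be restored later without retyping its text.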

I ran into several problems while creating this; overcoming them could result in a better model. Specifically:

  • The training would not work unless I set the voice and starting model to en-US. Since this is an en-GB voice (specifically West Country/Bristol, UK), that is not the ideal starting point for training.
  • I could not get Google Colab to keep training for more than about 3 hours before it timed out or failed for some other reason.
  • I could not get the TensorBoard visualisation of the training to work. It showed only a single dot on a graph, so it was hard to see whether the model was converging, improving, or plateauing.
  • I could not get 'resume' training to work on any model. If training stopped, I had to start again from the beginning (hence the maximum of about 3 hours).

But it is usable and recognisable, so not bad for a first try!
