Revolab VITS — Multi-Speaker Bahasa Melayu TTS

VITS voice models trained on Revolab Malay speech datasets.

Speakers (2 production-quality)

Speaker ID	Name	Samples	CER	WER
sarah	sarah	27,792	0.0537	0.1835
paan	Paan	27,434	0.0681	0.1561

Both speakers score a CER below 0.10 (10%), the threshold this card uses for production quality.
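The card does not say how CER and WER were computed. As a reference point, here is a minimal sketch of the standard definition: Levenshtein edit distance between a reference transcript and a hypothesis, divided by the reference length (in characters for CER, in whitespace-separated words for WER). The function names and sample sentences are illustrative assumptions, not part of this repository.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance normalised by the number of reference units."""
    if not ref_units:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_units, hyp_units) / len(ref_units)


def cer(reference, hypothesis):
    """Character error rate (assumed definition, see lead-in)."""
    return error_rate(list(reference), list(hypothesis))


def wer(reference, hypothesis):
    """Word error rate over whitespace-separated tokens (assumed definition)."""
    return error_rate(reference.split(), hypothesis.split())


# Hypothetical transcripts, for illustration only.
print(cer("selamat pagi", "selamat pagi"))
print(wer("saya suka makan nasi", "saya suka nasi"))
```

Under this definition, the table values are fractions, so a CER of 0.0537 corresponds to 5.37%.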

Structure

speakers.json              # Speaker registry with eval metrics
speakers/
  <name>/
    model.onnx             # ONNX export for inference
    model.onnx.json        # Phoneme config
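The layout above can be turned into file paths with a small helper. This is a sketch that assumes a local copy of the repository; `speaker_files` and the `repo_root` argument are hypothetical names, and the internal schema of `speakers.json` is not documented here, so the helper only locates the files.

```python
from pathlib import Path


def speaker_files(repo_root, speaker):
    """Return the expected file paths for one speaker.

    Follows the directory layout listed in this card; it does not check
    that the files exist or read their contents.
    """
    root = Path(repo_root)
    speaker_dir = root / "speakers" / speaker
    return {
        "registry": root / "speakers.json",        # speaker registry with eval metrics
        "model": speaker_dir / "model.onnx",       # ONNX export for inference
        "config": speaker_dir / "model.onnx.json", # phoneme config
    }


# "repo" is a placeholder for wherever the repository was downloaded.
for name, path in speaker_files("repo", "sarah").items():
    print(name, path)
```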

Performance (CPU)

Metric	Value
Avg latency	~54ms
Avg RTF	0.030
Speed	33.6x realtime
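For readers unfamiliar with the metrics, real-time factor (RTF) is conventionally synthesis time divided by the duration of the audio produced, and the speed figure is its reciprocal. The sketch below assumes that definition; the 1.8 s clip length is a hypothetical value chosen so that a 54 ms synthesis gives an RTF of 0.030, and it is not a measurement from this card.

```python
def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = time spent synthesising / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return synthesis_seconds / audio_seconds


def speedup(rtf):
    """How many times faster than real time a given RTF is."""
    if rtf <= 0:
        raise ValueError("RTF must be positive")
    return 1.0 / rtf


# Hypothetical example: 54 ms of compute for a 1.8 s utterance.
rtf = real_time_factor(0.054, 1.8)
print(f"RTF {rtf:.3f}, {speedup(rtf):.1f}x realtime")
```

Note that 1 / 0.030 is about 33.3, close to the reported 33.6x. The small gap could come from rounding or from averaging per utterance; the card does not say.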

Training

All models were trained with:

Architecture: VITS
Phonemizer: espeak-ng (ms voice)
Sample rate: 22,050 Hz
GPU: NVIDIA H200 NVL
