xcodec2-25TPS-24k
An improved version of https://huggingface.co/HKUSTAudio/xcodec2 that halves the token rate from 50 TPS (tokens per second) to 25 TPS and upsamples the output to a 24 kHz sample rate.
Training logs are on WandB at https://wandb.ai/huseinzol05/xcodec2-24k-25tps. We also pushed all checkpoints under checkpoint.
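To make the token-rate change concrete, here is a small arithmetic sketch. It assumes the codec consumes 16 kHz audio (the rate used in the encode example of this card) and emits 24 kHz audio; the helper names are hypothetical, and the real code length may differ by a few tokens because of padding inside the model.

```python
# Nominal token-rate arithmetic; not part of the model's API.
INPUT_SR = 16_000   # assumed input sample rate (as in the encode example)
OUTPUT_SR = 24_000  # output sample rate stated in this card


def samples_per_token(sample_rate: int, tps: int) -> int:
    """Number of audio samples covered by one token at a given token rate."""
    return sample_rate // tps


def nominal_tokens(num_samples: int, sample_rate: int = INPUT_SR, tps: int = 25) -> int:
    """Nominal number of tokens for a clip, ignoring any padding."""
    return num_samples * tps // sample_rate


ten_seconds = 10 * INPUT_SR
print(nominal_tokens(ten_seconds, tps=25))   # tokens at 25 TPS
print(nominal_tokens(ten_seconds, tps=50))   # tokens at the original 50 TPS
print(samples_per_token(INPUT_SR, 25))       # input samples per token
print(samples_per_token(OUTPUT_SR, 25))      # output samples per token
```

Under these assumptions, halving the token rate halves the sequence length that a downstream model has to handle for the same audio duration.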
Dataset
- https://huggingface.co/datasets/malaysia-ai/common_voice_17_0, train set only.
- https://huggingface.co/datasets/mesolitica/Malaysian-STT-Whisper-Stage2, except noise and audioset_0.5s.
- https://huggingface.co/datasets/malaysia-ai/Multilingual-TTS, specific commit 2421a13e07226d96ac7009d5327d96a84672768c, except cml-tts and libritts_r_filtered.
- https://huggingface.co/datasets/mesolitica/Malaysian-Emilia-v2, only sg_podcast and malaysian_podcast.
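One possible way to fetch a dataset at a pinned commit while skipping or selecting subsets is `huggingface_hub.snapshot_download`. This is only a sketch: `pinned_download_kwargs` and `download` are hypothetical helpers, and the glob patterns assume the subset names appear as path prefixes inside the dataset repositories, which is not confirmed by this card.

```python
def pinned_download_kwargs(repo_id, revision=None, exclude=(), include=()):
    """Build keyword arguments for huggingface_hub.snapshot_download."""
    kwargs = {"repo_id": repo_id, "repo_type": "dataset"}
    if revision:
        kwargs["revision"] = revision  # pin to an exact commit
    if exclude:
        # Assumption: each excluded subset is a path prefix in the repo.
        kwargs["ignore_patterns"] = [f"{name}*" for name in exclude]
    if include:
        # Assumption: each selected subset is a path prefix in the repo.
        kwargs["allow_patterns"] = [f"{name}*" for name in include]
    return kwargs


def download(kwargs):
    """Download the snapshot and return its local path (needs network access)."""
    from huggingface_hub import snapshot_download
    return snapshot_download(**kwargs)


multilingual_tts = pinned_download_kwargs(
    "malaysia-ai/Multilingual-TTS",
    revision="2421a13e07226d96ac7009d5327d96a84672768c",
    exclude=("cml-tts", "libritts_r_filtered"),
)
emilia = pinned_download_kwargs(
    "mesolitica/Malaysian-Emilia-v2",
    include=("sg_podcast", "malaysian_podcast"),
)
# local_dir = download(multilingual_tts)  # uncomment to actually download
```

Check the file layout of each repository before relying on these patterns, since a mismatched prefix silently downloads the wrong set of files.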
How to
- Git clone,
git clone https://github.com/malaysia-ai/X-Codec-2.0-25TPS-24k
cd X-Codec-2.0-25TPS-24k
- Load the model,
from modeling_xcodec2 import XCodec2Model
model = XCodec2Model.from_pretrained("malaysia-ai/xcodec2-25TPS-24k")
- Encode,
import librosa
import torch

# Load the audio as 16 kHz mono
y, sr = librosa.load('259041.mp3', sr = 16000)
# Add a batch dimension: shape (1, num_samples)
wav_tensor = torch.from_numpy(y).float().unsqueeze(0)
codes = model.encode_code(wav_tensor)
- Decode,
import IPython.display as ipd
ipd.Audio(model.decode_code(codes)[0, 0].cpu(), rate = 24000)
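Outside a notebook, the decoded waveform can be written to disk instead of played with IPython. The sketch below uses only NumPy and the standard-library `wave` module; `save_wav` is a hypothetical helper, and it assumes the decoded audio is a mono float waveform in the range [-1, 1].

```python
import wave

import numpy as np


def save_wav(path, samples, sample_rate=24000):
    """Write a mono float waveform to a 16-bit PCM WAV file.

    Returns the number of frames written.
    """
    samples = np.asarray(samples, dtype=np.float32).reshape(-1)
    # Clip to [-1, 1], then scale to little-endian 16-bit integers.
    pcm = (np.clip(samples, -1.0, 1.0) * 32767.0).astype("<i2")
    with wave.open(path, "wb") as f:
        f.setnchannels(1)          # mono
        f.setsampwidth(2)          # 2 bytes = 16 bits per sample
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())
    return len(pcm)


# Usage with the decode step above (detach is only needed if gradients are tracked):
# audio = model.decode_code(codes)[0, 0].detach().cpu().numpy()
# save_wav("decoded.wav", audio, sample_rate=24000)
```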
Source code
Source code at https://github.com/malaysia-ai/X-Codec-2.0-25TPS-24k
Model tree for malaysia-ai/xcodec2-25TPS-24k
Base model
HKUSTAudio/xcodec2