How to use microsoft/wavlm-base-plus-sd with Transformers:
# Load model directly
from transformers import AutoProcessor, AutoModelForAudioFrameClassification

processor = AutoProcessor.from_pretrained("microsoft/wavlm-base-plus-sd")
model = AutoModelForAudioFrameClassification.from_pretrained("microsoft/wavlm-base-plus-sd")
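For readers wondering what comes out of the model once it is loaded, here is a minimal sketch of one forward pass. It assumes a mono waveform sampled at 16 kHz and passed in as a list or array of floats. The helper names `logits_to_activity` and `diarize` are hypothetical and used only for illustration. The checkpoint is a frame-classification (speaker diarization) head, so the output is a grid of per-frame, per-speaker scores rather than a single vector per utterance.

```python
import math


def logits_to_activity(frame_logits, threshold=0.5):
    """Turn per-frame logits (frames x speakers) into 0/1 activity flags.

    sigmoid(x) > threshold is equivalent to x > log(threshold / (1 - threshold)),
    so the comparison is done on the logit scale. Requires 0 < threshold < 1.
    """
    cutoff = math.log(threshold / (1.0 - threshold))
    return [[1 if x > cutoff else 0 for x in frame] for frame in frame_logits]


def diarize(waveform, sampling_rate=16000, model_id="microsoft/wavlm-base-plus-sd"):
    """Hypothetical helper: run the checkpoint on one waveform.

    Assumes `waveform` is mono audio at `sampling_rate` Hz.
    """
    # Heavy imports are deferred so logits_to_activity can be used on its own.
    import torch
    from transformers import AutoProcessor, AutoModelForAudioFrameClassification

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForAudioFrameClassification.from_pretrained(model_id)

    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (batch, frames, speakers)

    # One row per frame, one 0/1 flag per speaker slot.
    return logits_to_activity(logits[0].tolist())
```

Calling `diarize(waveform)` downloads the checkpoint on first use and returns one row of speaker-activity flags per output frame.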
Thanks for the work. I am looking for a tool that extracts a speaker embedding from speech to use as a voiceprint. Is this model's output the voiceprint I want?