GitHub topics: wavlm
wenet-e2e/wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Language: Python - Size: 6.23 MB - Last synced at: about 17 hours ago - Pushed at: 5 days ago - Stars: 904 - Forks: 138

yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Language: Python - Size: 131 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 5,737 - Forks: 551

s3prl/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Language: Python - Size: 135 MB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 2,396 - Forks: 501

SmoothKen/knn-svc
kNN-SVC: Robust Zero-Shot Singing Voice Conversion with Additive Synthesis and Concatenation Smoothness Optimization
Language: Python - Size: 874 KB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

bunyaminergen/WavLMMSDD
This repository combines `WavLM`, a powerful speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a state-of-the-art approach for speaker diarization from NVIDIA.
Language: Jupyter Notebook - Size: 1.8 MB - Last synced at: 12 days ago - Pushed at: 2 months ago - Stars: 6 - Forks: 3

lucadellalib/audiocodecs
A collection of audio codecs with a standardized API
Language: Python - Size: 805 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 11 - Forks: 2

lucadellalib/focalcodec
A low-bitrate single-codebook 16 kHz speech codec based on focal modulation
Language: Python - Size: 7.17 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 83 - Forks: 10

bunyaminergen/WavLMRawNetXSVBase
WavLM Large + RawNetX Speaker Verification Base: End-to-End Speaker Verification Architecture
Language: Python - Size: 1.17 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 3 - Forks: 0

lucadellalib/discrete-wavlm-codec
A neural speech codec based on discrete WavLM representations
Language: Python - Size: 403 KB - Last synced at: about 2 months ago - Pushed at: 9 months ago - Stars: 23 - Forks: 3

theolepage/wavlm_ssl_sv
SOTA method for self-supervised speaker verification leveraging a large-scale pretrained ASR model.
Language: Python - Size: 466 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 7 - Forks: 0

aitor-alvarez/acoustic-transformer-models
Acoustic Transformer Models for Audio Classification
Language: Python - Size: 51.8 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

mjhydri/Singing-Vocal-Beat-Tracking
This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages the WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations. It then uses an HMM decoder to infer singing beats and tempo.
Language: Python - Size: 122 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 27 - Forks: 4

zhu00121/Universal-representation-dynamics-of-deepfake-speech
This repo contains code used in the paper "Characterizing the temporal dynamics of universal speech representations for generalizable deepfake detection"
Language: Python - Size: 339 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

lucadellalib/cryceleb2023
CryCeleb2023 experiments
Language: Jupyter Notebook - Size: 4.88 KB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Sarasadeghii/Sharif-WavLM
This repository applies the WavLM model to high-quality and poor-quality data for the speaker verification task and uses the PyCM library for evaluation.
Language: Jupyter Notebook - Size: 744 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

alessandropec/data_driven_ai_voice_cloning
This repository contains the code for the main part of my master's thesis in Data Science & Engineering at Politecnico di Torino
Language: Python - Size: 268 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 1
