Topic: "speech-separation"
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
Language: Python - Size: 97.8 MB - Last synced at: 6 days ago - Pushed at: 11 days ago - Stars: 9,723 - Forks: 1,476

espnet/espnet
End-to-End Speech Processing Toolkit
Language: Python - Size: 1.12 GB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 9,013 - Forks: 2,250

modelscope/ClearerVoice-Studio
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Language: Python - Size: 259 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2,651 - Forks: 209

asteroid-team/asteroid
The PyTorch-based audio source separation toolkit for researchers
Language: Python - Size: 5.88 MB - Last synced at: 4 days ago - Pushed at: 4 months ago - Stars: 2,362 - Forks: 431

coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Size: 139 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 1,318 - Forks: 142

maum-ai/voicefilter
Unofficial PyTorch implementation of Google AI's VoiceFilter system
Language: Python - Size: 1.14 MB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 1,035 - Forks: 227

JusperLee/Speech-Separation-Paper-Tutorial
Must-read papers on neural-network-based speech separation
Size: 97.7 KB - Last synced at: 6 months ago - Pushed at: about 3 years ago - Stars: 756 - Forks: 137

kaituoxu/Conv-TasNet
A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permutation Invariant Training (PIT).
Language: Python - Size: 1.23 MB - Last synced at: 26 days ago - Pushed at: about 2 years ago - Stars: 697 - Forks: 156

Audio-WestlakeU/FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Language: Python - Size: 892 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 552 - Forks: 156

anicolson/DeepXi
Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.
Language: MATLAB - Size: 497 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 463 - Forks: 119

gemengtju/Tutorial_Separation
This repo summarizes tutorials, datasets, papers, code, and tools for speech separation and speaker extraction tasks. Pull requests are welcome.
Language: MATLAB - Size: 74.6 MB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 459 - Forks: 95

microsoft/UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech
Language: Python - Size: 72.4 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 455 - Forks: 74

JusperLee/Dual-Path-RNN-Pytorch
Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation, implemented in PyTorch
Language: Python - Size: 94.7 KB - Last synced at: 26 days ago - Pushed at: about 2 years ago - Stars: 434 - Forks: 66

funcwj/setk
Tools for Speech Enhancement integrated with Kaldi
Language: Python - Size: 36.3 MB - Last synced at: 26 days ago - Pushed at: almost 2 years ago - Stars: 410 - Forks: 91

double22a/speech_dataset
Datasets for speech recognition
Size: 70.3 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 392 - Forks: 76

posenhuang/deeplearningsourceseparation
Deep Recurrent Neural Networks for Source Separation
Language: MATLAB - Size: 500 MB - Last synced at: 22 days ago - Pushed at: almost 4 years ago - Stars: 368 - Forks: 133

speechbrain/speechbrain.github.io
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems for speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many other tasks.
Language: HTML - Size: 46.8 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 364 - Forks: 29

JusperLee/Conv-TasNet
A PyTorch implementation of "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation"
Language: Python - Size: 75.2 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 336 - Forks: 68

seanwood/gcc-nmf
Real-time GCC-NMF Blind Speech Separation and Enhancement
Language: Python - Size: 43.2 MB - Last synced at: 21 days ago - Pushed at: about 6 years ago - Stars: 318 - Forks: 134

AppleHolic/source_separation
Deep-learning-based speech source separation using PyTorch
Language: Jupyter Notebook - Size: 4.11 MB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 316 - Forks: 46

aishoot/LSTM_PIT_Speech_Separation
Two-talker speech separation with LSTM/BLSTM using Permutation Invariant Training (PIT).
Language: Jupyter Notebook - Size: 7.38 MB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 308 - Forks: 90

tky823/DNN-based_source_separation
A PyTorch implementation of DNN-based source separation.
Language: Python - Size: 293 MB - Last synced at: 20 days ago - Pushed at: about 3 years ago - Stars: 297 - Forks: 51

etzinis/sudo_rm_rf
Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features which enables a more efficient way of separating sources from mixtures.
Language: Jupyter Notebook - Size: 21 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 266 - Forks: 30

funcwj/conv-tasnet
A PyTorch implementation of "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" (see recipes in aps framework https://github.com/funcwj/aps)
Language: Python - Size: 181 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 198 - Forks: 60

eesungkim/Speech_Enhancement_DNN_NMF
Speech Enhancement based on DNN (Spectral-Mapping, TF-Masking), DNN-NMF, NMF
Language: Python - Size: 18.5 MB - Last synced at: 26 days ago - Pushed at: about 6 years ago - Stars: 184 - Forks: 61

meokz/looking-to-listen
Deep neural network (DNN) for noise reduction, removal of background music, and speech separation
Language: Python - Size: 33 MB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 170 - Forks: 19

KyleZhang1118/Voice-Separation-and-Enhancement
A framework for quick testing and comparing multi-channel speech enhancement and separation methods, such as DSB, MVDR, LCMV, GEVD beamforming and ICA, FastICA, IVA, AuxIVA, OverIVA, ILRMA, FastMNMF.
Language: MATLAB - Size: 35.5 MB - Last synced at: 22 days ago - Pushed at: over 3 years ago - Stars: 156 - Forks: 35

JusperLee/Looking-to-Listen-at-the-Cocktail-Party
Runnable code based on Google's "Looking to Listen at the Cocktail Party" paper
Language: Python - Size: 81.5 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 149 - Forks: 43

funcwj/aps
A personal toolkit for single/multi-channel speech recognition & enhancement & separation.
Language: Python - Size: 108 MB - Last synced at: 26 days ago - Pushed at: almost 2 years ago - Stars: 142 - Forks: 28

JusperLee/Deep-Clustering-for-Speech-Separation
A PyTorch implementation of "Deep Clustering: Discriminative Embeddings for Segmentation and Separation"
Language: Python - Size: 94.7 KB - Last synced at: 26 days ago - Pushed at: almost 5 years ago - Stars: 131 - Forks: 24

funcwj/deep-clustering
Deep clustering method for single-channel speech separation
Language: Python - Size: 23.4 KB - Last synced at: 26 days ago - Pushed at: almost 3 years ago - Stars: 109 - Forks: 34

funcwj/uPIT-for-speech-separation
Speech separation with utterance-level PIT experiments
Language: Python - Size: 38.1 KB - Last synced at: 26 days ago - Pushed at: almost 7 years ago - Stars: 104 - Forks: 39
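
Several entries in this list rely on Permutation Invariant Training (PIT). As a rough illustration only, and not taken from any repository listed here, utterance-level PIT can be sketched as follows: the loss is computed for every assignment of estimated sources to reference sources over the whole utterance, and the smallest value is kept. The function names `mse` and `upit_loss` are hypothetical, and mean squared error is an assumed stand-in for whatever objective a given implementation uses.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def upit_loss(estimates, targets):
    """Utterance-level PIT sketch (hypothetical helper).

    estimates, targets: lists of per-speaker signals, one list per speaker.
    Returns (lowest average loss, permutation achieving it), where
    permutation[i] is the index of the estimate matched to target i.
    """
    best_loss, best_perm = float("inf"), None
    # Try every assignment of estimates to targets for the whole utterance.
    for perm in permutations(range(len(targets))):
        loss = sum(mse(estimates[p], t) for p, t in zip(perm, targets)) / len(targets)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Two speakers whose estimates come out in swapped order.
targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
estimates = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
loss, perm = upit_loss(estimates, targets)
```

The exhaustive search grows factorially with the number of speakers, which is acceptable for the two- or three-talker setups most of these repositories target.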

kaituoxu/TasNet
A PyTorch implementation of Time-domain Audio Separation Network (TasNet) with Permutation Invariant Training (PIT) for speech separation.
Language: Python - Size: 1.15 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 96 - Forks: 30

JusperLee/Calculate-SNR-SDR
Script to calculate SNR and SDR using Python
Language: Python - Size: 8.79 KB - Last synced at: 26 days ago - Pushed at: almost 5 years ago - Stars: 90 - Forks: 26
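
For orientation, here is a minimal sketch of the kind of metric such a script computes. It is not the code from the repository above. `snr_db` is the plain signal-to-noise ratio in dB, and `si_sdr_db` is the scale-invariant SDR variant, swapped in here because it is short to state; full BSS Eval SDR is more involved. Both function names are hypothetical, and both assume the estimate is not a perfect copy of the reference (the noise energy would be zero).

```python
import math


def snr_db(reference, estimate):
    """SNR in dB: reference energy over the energy of (reference - estimate)."""
    signal = sum(r * r for r in reference)
    noise = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10 * math.log10(signal / noise)


def si_sdr_db(reference, estimate):
    """Scale-invariant SDR in dB (assumed variant, not full BSS Eval SDR)."""
    dot = sum(r * e for r, e in zip(reference, estimate))
    energy = sum(r * r for r in reference)
    # Project the estimate onto the reference to remove any gain mismatch.
    scale = dot / energy
    target = [scale * r for r in reference]
    target_energy = sum(t * t for t in target)
    residual = sum((e - t) ** 2 for e, t in zip(estimate, target))
    return 10 * math.log10(target_energy / residual)


reference = [1.0, 0.0, -1.0, 0.0]
estimate = [1.0, 0.1, -1.0, -0.1]
half_gain = [0.5 * e for e in estimate]  # same estimate at half the volume
```

Rescaling the estimate lowers the plain SNR but leaves the scale-invariant value unchanged, which is the reason the scale-invariant form is often preferred when comparing separation outputs.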

funcwj/voice-filter
An unofficial PyTorch implementation of Google's VoiceFilter
Language: Python - Size: 4.67 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 80 - Forks: 21

cyrta/awesome-speech-enhancement
A curated list of awesome Speech Enhancement papers, libraries, datasets, and other resources.
Size: 13.7 KB - Last synced at: 5 days ago - Pushed at: over 5 years ago - Stars: 66 - Forks: 15

JusperLee/UtterancePIT-Speech-Separation
Based on funcwj's uPIT, with training code that supports multiple GPUs and a rewritten DataLoader.
Language: Python - Size: 34.2 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 59 - Forks: 10

jacoxu/ASAM
Code and dataset for our paper "Modeling Attention and Memory for Auditory Selection in a Cocktail Party Environment" (AAAI 2018)
Language: Python - Size: 49.1 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 54 - Forks: 20

anton-jeran/MULTI-AUDIODEC
This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.
Language: Python - Size: 7.41 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 46 - Forks: 6

Totoketchup/Adaptive-MultiSpeaker-Separation
Adaptive and Focusing Neural Layers for Multi-Speaker Separation Problem
Language: Jupyter Notebook - Size: 18 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 45 - Forks: 18

JusperLee/Deep-Encoder-Decoder-Conv-TasNet
A PyTorch implementation of "An Empirical Study of Conv-TasNet"
Language: Python - Size: 3.91 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 34 - Forks: 9

hangtingchen/Beam-Guided-TasNet
Beam-guided TasNet
Language: Python - Size: 23.3 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 33 - Forks: 7

xuchenglin28/speech_separation
Constrained Permutation Invariant Training, Speech Separation
Language: Python - Size: 1.13 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 33 - Forks: 9

haoxiangsnr/SpEx
Implementation of "SpEx: Multi-Scale Time Domain Speaker Extraction Network".
Language: Python - Size: 18.6 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 26 - Forks: 8

chimechallenge/chime-utils
Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.
Language: Python - Size: 2.63 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 21 - Forks: 3

JusperLee/DANet-For-Speech-Separation
PyTorch implementation of DANet for speech separation
Language: Python - Size: 17.6 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 18 - Forks: 4

khanld/Dynamic-Mixing
Dynamic Mixing For Speech Processing (mix-on-the-fly)
Language: Python - Size: 12.5 MB - Last synced at: 3 months ago - Pushed at: almost 3 years ago - Stars: 17 - Forks: 2

chentuochao/Sound_Bubble
Project for speech bubble
Language: Python - Size: 12.9 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 14 - Forks: 2

mborsdorf/UniversalSpeakerExtraction
Language: Python - Size: 9.42 MB - Last synced at: 11 months ago - Pushed at: over 3 years ago - Stars: 14 - Forks: 4

HeliosX7/voice-filter
Unofficial TensorFlow/Keras implementation of Google AI's VoiceFilter
Language: Jupyter Notebook - Size: 3.69 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 3

anicolson/bidirectional_2018
A Deep Learning Approach to Ideal Binary Mask Estimation
Size: 8.8 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 9 - Forks: 3

mcw519/PureSound
Make the sound you hear pure and clean by deep learning.
Language: Python - Size: 138 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 8 - Forks: 0

hmartelb/avlit
Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model" (AVLIT)
Language: Python - Size: 422 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 1

dacson/Demo-of-Speech-Separation
Single-channel speech separation for separating vocals from musical accompaniment and for voice noise reduction
Size: 151 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 8 - Forks: 5

kwatcharasupat/directional-sparse-filtering-tf
Python implementation of Directional Sparse Filtering with TensorFlow/Keras
Language: Python - Size: 21.5 KB - Last synced at: 12 months ago - Pushed at: almost 4 years ago - Stars: 7 - Forks: 1

e13000/directional_sparse_filtering
Directional sparse filtering for blind speech separation
Language: MATLAB - Size: 16.1 MB - Last synced at: 12 months ago - Pushed at: almost 4 years ago - Stars: 7 - Forks: 3

ZhaZhaFon/demo-confusion
This is a demo for our paper 'Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches'
Size: 35.5 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 6

mborsdorf/TargetLanguageExtraction
Size: 21.5 KB - Last synced at: 11 months ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 0

mborsdorf/GlobalPhoneMS_Scripts
Language: MATLAB - Size: 5.44 MB - Last synced at: 11 months ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

shun60s/Blind-Speech-Separation
Separation of speech from a (mono) mixed signal of music and speech using U-Net
Language: Python - Size: 28.3 MB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 3 - Forks: 2

lukereichold/visual-speech-separation
Flask app to demo multimodal deep learning speech separation in videos via TensorFlow Serving
Language: Python - Size: 20.8 MB - Last synced at: 5 days ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

ZhaZhaFon/demo-samom
This is a demo for our paper 'Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction'.
Size: 4.08 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

ooshyun/Speech-Enhancement-Pytorch
PyTorch models for speech enhancement
Language: Python - Size: 3.34 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

ZhaZhaFon/demo-speakerseparation
This is a demo for my bachelor thesis 'Speaker Separation and Machine Auditory Perception for Dialogue Scene'.
Language: Shell - Size: 2.91 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

ZitengWang/uPIT-for-speech-separation (fork of funcwj/uPIT-for-speech-separation)
Target speaker separation using a short adaptation utterance
Language: Python - Size: 36.1 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 1

sayemomer/Speech-Separations-with-variable-number-of-sources
Audio source separation model with a Whisper ECAPA-TDNN counter and pre‑trained speechbrain/sepformer-libri3mix and speechbrain/sepformer-wsj02mix for speech separation, implemented with SpeechBrain.
Language: Jupyter Notebook - Size: 4.88 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

NikhilC2209/AVSpeech_Sep
Thesis project for Speech Separation using Deep Learning
Language: Jupyter Notebook - Size: 32 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

dangvansam/pyannote-onnx (fork of pyannote/pyannote-audio)
PyAnnote with ONNX model
Language: Jupyter Notebook - Size: 273 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Orelbenr/acoustic-fencing
Acoustic Fence Using Multi-Microphone Speaker Separation
Language: Python - Size: 8.16 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

RishiKakade/Speech-Separating-Hearing-Aid
Language: JavaScript - Size: 10.7 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

ZhaZhaFon/repo_asteroid (fork of asteroid-team/asteroid)
Speech front-end repository: a modified version of the Asteroid toolkit for speech front-end processing
Language: Python - Size: 5.68 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

SouppuoS/Multi-round-record
A simple implementation for multi-round recordings
Language: Python - Size: 4.88 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0
