Topic: "speech-enhancement"
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
Language: Python - Size: 98 MB - Last synced at: 18 days ago - Pushed at: 19 days ago - Stars: 9,838 - Forks: 1,492

espnet/espnet
End-to-End Speech Processing Toolkit
Language: Python - Size: 1.14 GB - Last synced at: 19 days ago - Pushed at: 22 days ago - Stars: 9,113 - Forks: 2,263

Rikorose/DeepFilterNet
Noise suppression using deep filtering
Language: Python - Size: 171 MB - Last synced at: 4 days ago - Pushed at: 8 months ago - Stars: 3,093 - Forks: 289

modelscope/ClearerVoice-Studio
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Language: Python - Size: 259 MB - Last synced at: 17 days ago - Pushed at: about 1 month ago - Stars: 2,792 - Forks: 217

asteroid-team/asteroid
The PyTorch-based audio source separation toolkit for researchers
Language: Python - Size: 5.88 MB - Last synced at: 18 days ago - Pushed at: 5 months ago - Stars: 2,378 - Forks: 433

resemble-ai/resemble-enhance
AI powered speech denoising and enhancement
Language: Python - Size: 23.4 KB - Last synced at: 17 days ago - Pushed at: 6 months ago - Stars: 1,782 - Forks: 205

haoheliu/voicefixer
General Speech Restoration
Language: Python - Size: 3.76 MB - Last synced at: 17 days ago - Pushed at: 4 months ago - Stars: 1,149 - Forks: 139

ictnlp/StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Language: Python - Size: 18.2 MB - Last synced at: 15 days ago - Pushed at: 10 months ago - Stars: 1,078 - Forks: 81

JusperLee/Speech-Separation-Paper-Tutorial
Must-read papers on speech separation based on neural networks
Size: 48.8 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 779 - Forks: 137

nanahou/Awesome-Speech-Enhancement
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
Language: MATLAB - Size: 25.2 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 762 - Forks: 151

k2kobayashi/sprocket
Voice Conversion Tool Kit
Language: Python - Size: 1.75 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 600 - Forks: 115

Audio-WestlakeU/FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Language: Python - Size: 892 KB - Last synced at: 7 months ago - Pushed at: almost 2 years ago - Stars: 552 - Forks: 156

breizhn/DTLN
TensorFlow 2.x implementation of the DTLN real-time speech denoising model. With TF-Lite, ONNX and real-time audio processing support.
Language: Python - Size: 25.5 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 501 - Forks: 143

anicolson/DeepXi
Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.
Language: MATLAB - Size: 497 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 463 - Forks: 119

double22a/speech_dataset
The dataset of Speech Recognition
Size: 74.2 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 413 - Forks: 77

funcwj/setk
Tools for Speech Enhancement integrated with Kaldi
Language: Python - Size: 36.3 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 410 - Forks: 91

schmiph2/pysepm
Python implementation of performance metrics in Loizou's Speech Enhancement book
Language: Python - Size: 1.85 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 404 - Forks: 88

yxlu-0102/MP-SENet
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement
Language: Python - Size: 498 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 386 - Forks: 60

jzi040941/PercepNet
Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech
Language: C++ - Size: 31.6 MB - Last synced at: 28 days ago - Pushed at: over 2 years ago - Stars: 344 - Forks: 94

shahules786/mayavoz
Pytorch based speech enhancement toolkit.
Language: Python - Size: 1.09 MB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 338 - Forks: 26

yongxuUSTC/sednn
Deep learning based speech enhancement using Keras or PyTorch, made easy to use
Language: Python - Size: 5.56 MB - Last synced at: 11 months ago - Pushed at: over 5 years ago - Stars: 334 - Forks: 126

seanwood/gcc-nmf
Real-time GCC-NMF Blind Speech Separation and Enhancement
Language: Python - Size: 43.2 MB - Last synced at: 19 days ago - Pushed at: about 6 years ago - Stars: 319 - Forks: 134

haoxiangsnr/A-Convolutional-Recurrent-Neural-Network-for-Real-Time-Speech-Enhancement
A minimum unofficial implementation of the "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN) using PyTorch
Language: Python - Size: 43.9 KB - Last synced at: 7 months ago - Pushed at: almost 5 years ago - Stars: 315 - Forks: 58

aishoot/LSTM_PIT_Speech_Separation
Two-talker Speech Separation with LSTM/BLSTM by Permutation Invariant Training method.
Language: Jupyter Notebook - Size: 7.38 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 308 - Forks: 90

fgnt/pb_bss
Collection of EM algorithms for blind source separation of audio signals
Language: Python - Size: 635 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 286 - Forks: 61

haoheliu/voicefixer_main
General Speech Restoration
Language: Python - Size: 21.5 MB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 278 - Forks: 56

haoxiangsnr/Wave-U-Net-for-Speech-Enhancement
Wave-U-Net implemented in PyTorch and adapted to speech enhancement.
Language: Python - Size: 511 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 278 - Forks: 64

Xiaobin-Rong/gtcrn
The official implementation of GTCRN, an ultra-lite speech enhancement model.
Language: Python - Size: 10.6 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 264 - Forks: 48

AkojimaSLP/Beamforming-for-speech-enhancement
Simple delay-and-sum, MVDR and CGMM-MVDR beamforming
Language: Python - Size: 3.18 MB - Last synced at: 2 months ago - Pushed at: over 6 years ago - Stars: 257 - Forks: 74

jtkim-kaist/Speech-enhancement
Deep neural network based speech enhancement toolkit
Language: MATLAB - Size: 187 MB - Last synced at: 2 months ago - Pushed at: almost 6 years ago - Stars: 213 - Forks: 62

huyanxin/phasen
An unofficial PyTorch implementation of Microsoft's PHASEN
Language: Python - Size: 2.07 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 198 - Forks: 47

echocatzh/MTFAA-Net
Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement
Language: Python - Size: 22.5 KB - Last synced at: 7 months ago - Pushed at: over 2 years ago - Stars: 195 - Forks: 56

audiolabs/torch-pesq
PyTorch implementation of the Perceptual Evaluation of Speech Quality for wideband audio
Language: Python - Size: 5.61 MB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 188 - Forks: 15

sekiguchi92/SoundSourceSeparation
The code for multi-channel source separation and dereverberation such as FastMNMF1, FastMNMF2, and AR-FastMNMF2.
Language: Python - Size: 31.6 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 170 - Forks: 30

skirdey/voicerestore
VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration
Language: Python - Size: 4.97 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 160 - Forks: 17

claritychallenge/clarity
Clarity Challenge toolkit - software for building Clarity Challenge systems
Language: Python - Size: 56.8 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 158 - Forks: 59

KyleZhang1118/Voice-Separation-and-Enhancement
A framework for quick testing and comparing multi-channel speech enhancement and separation methods, such as DSB, MVDR, LCMV, GEVD beamforming and ICA, FastICA, IVA, AuxIVA, OverIVA, ILRMA, FastMNMF.
Language: MATLAB - Size: 35.5 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 157 - Forks: 35

funcwj/aps
A personal toolkit for single/multi-channel speech recognition & enhancement & separation.
Language: Python - Size: 108 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 142 - Forks: 28

james34602/SpleeterRT
Real-time monaural source separation based on a fully convolutional neural network operating in the time-frequency domain.
Language: C - Size: 46 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 138 - Forks: 16

helianvine/fdndlp
A speech dereverberation algorithm, also called WPE
Language: Python - Size: 1.2 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 136 - Forks: 58

madhavmk/Noise2Noise-audio_denoising_without_clean_training_data
Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEECH 2021 conference. This paper tackles the problem of the heavy dependence of clean speech data required by deep learning based audio denoising methods by showing that it is possible to train deep speech denoising networks using only noisy speech samples.
Language: Jupyter Notebook - Size: 348 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 125 - Forks: 37

tech-podcasts/SpeechEnhancement
Speech Enhancement Tools Based on Deep Learning
Language: Go - Size: 1.35 MB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 121 - Forks: 21

Audio-WestlakeU/RealMAN
A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurIPS 2024]
Language: Python - Size: 62.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 119 - Forks: 13

haoxiangsnr/IRM-based-Speech-Enhancement-using-LSTM
Ideal Ratio Mask (IRM) Estimation based Speech Enhancement using LSTM
Language: Python - Size: 1.48 MB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 116 - Forks: 25

haamoon/mmtm
Implementation of CVPR 2020 paper "MMTM: Multimodal Transfer Module for CNN Fusion"
Language: Python - Size: 47.9 KB - Last synced at: 5 months ago - Pushed at: almost 5 years ago - Stars: 112 - Forks: 21

dansuh17/segan-pytorch
SEGAN pytorch implementation https://arxiv.org/abs/1703.09452
Language: Python - Size: 82 KB - Last synced at: about 2 months ago - Pushed at: about 6 years ago - Stars: 108 - Forks: 32

jkjaer/fastF0Nls
C++ and MATLAB code for fast and accurate fundamental frequency estimation
Language: C++ - Size: 5.02 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 100 - Forks: 24

RookieJunChen/Inter-SubNet
The official PyTorch implementation of "Inter-SubNet: Speech Enhancement with Subband Interaction", accepted by ICASSP 2023.
Language: Python - Size: 93.8 KB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 95 - Forks: 12

ConferencingSpeech/ConferencingSpeech2021
Conferencing Speech Challenge
Language: Python - Size: 3 MB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 94 - Forks: 33

haoheliu/torchsubband
Pytorch implementation of subband decomposition
Language: HTML - Size: 374 KB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 92 - Forks: 13

line/open-universe
Open implementation of UNIVERSE and UNIVERSE++ diffusion-based speech enhancement models.
Language: Python - Size: 6.42 MB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 91 - Forks: 10

haoxiangsnr/spiking-fullsubnet
Official repository of Spiking-FullSubNet, the Intel N-DNS Challenge Algorithmic Track Winner.
Language: Python - Size: 154 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 90 - Forks: 15

mikeroyal/NLP-Guide
Natural Language Processing (NLP). Covering topics such as Tokenization, Part Of Speech tagging (POS), Machine translation, Named Entity Recognition (NER), Classification, and Sentiment analysis.
Language: Python - Size: 315 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 86 - Forks: 15

Xiaobin-Rong/deepvqe
An unofficial implementation of DeepVQE proposed by Microsoft Corp.
Language: Python - Size: 164 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 84 - Forks: 21

Speech-Interaction-Technology-Aalto-U/itsp
Introduction to Speech Processing
Language: Jupyter Notebook - Size: 254 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 82 - Forks: 15

vipchengrui/traditional-speech-enhancement
Spectral Subtraction, Wiener Filtering, MMSE
Language: MATLAB - Size: 39.8 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 81 - Forks: 34

Xiaobin-Rong/SEtrain
A training code template for DNN-based speech enhancement.
Language: Python - Size: 48.8 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 76 - Forks: 27

Takaaki-Saeki/ssl_speech_restoration
SelfRemaster: SSL Speech Restoration
Language: Python - Size: 1.34 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 75 - Forks: 6

a-n-rose/Python-Sound-Tool
SoundPy (alpha stage) is a research-based python package for speech and sound. Applications include deep-learning, filtering, speech-enhancement, audio augmentation, feature extraction and visualization, dataset and audio file conversion, and beyond.
Language: Jupyter Notebook - Size: 180 MB - Last synced at: 27 days ago - Pushed at: 5 months ago - Stars: 74 - Forks: 7

nglehuy/semetrics
Speech Enhancement Metrics (PESQ, CSIG, CBAK, COVL)
Language: MATLAB - Size: 289 KB - Last synced at: 6 days ago - Pushed at: almost 5 years ago - Stars: 74 - Forks: 13

jhauret/eben
Repo for source code of EBEN: Extreme Bandwidth Extension Network
Language: Python - Size: 40.7 MB - Last synced at: 5 days ago - Pushed at: 17 days ago - Stars: 73 - Forks: 9

Picovoice/koala
On-device noise suppression powered by deep learning
Language: Python - Size: 28.9 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 70 - Forks: 4

Andong-Li-speech/GaGNet
This repo provides the network code and the processed samples of the manuscript "Glance and Gaze: A Collaborative Learning Framework for Single-channel Speech Enhancement", which was accepted by Elsevier Applied Acoustics.
Language: Python - Size: 127 MB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 67 - Forks: 8

cyrta/awesome-speech-enhancement
A curated list of awesome Speech Enhancement papers, libraries, datasets, and other resources.
Size: 13.7 KB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 67 - Forks: 15

eesungkim/Speech_Enhancement_MMSE-STSA
A statistical model-based Speech Enhancement Using MMSE-STSA
Language: Python - Size: 1.53 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 66 - Forks: 27

chenwj1989/python-speech-enhancement
a python library for speech enhancement
Language: Python - Size: 1.46 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 65 - Forks: 13

chanil1218/DCUnet.pytorch
Phase-Aware Speech Enhancement with Deep Complex U-Net
Language: Python - Size: 13.7 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 64 - Forks: 22

DiegoLeon96/Neural-Speech-Dereverberation
Machine and Deep Learning models for speech dereverberation
Language: Python - Size: 14.2 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 63 - Forks: 15

auspicious3000/deepbeam
Deep learning based Speech Beamforming
Language: Jupyter Notebook - Size: 33.5 MB - Last synced at: 2 months ago - Pushed at: about 7 years ago - Stars: 62 - Forks: 18

01Zhangbw/Speech-and-audio-papers-Top-Conference
Papers in the speech and audio field. Currently covers: ICLR2025-2023, ICML2025-2023, NeurIPS2024-2023, ACMMM2024, AAAI2025-2024, ACL2025-2024, EMNLP2024, NAACL2025, IJCAI2024, ECCV2024
Size: 290 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 61 - Forks: 1

seorim0/DCCRN-with-various-loss-functions
DCCRN with various loss functions
Language: Python - Size: 27.6 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 59 - Forks: 17

Audio-WestlakeU/McNet
The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023
Language: Python - Size: 54.3 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 53 - Forks: 5

Xiaobin-Rong/TRT-SE
An example of a speech enhancement model deployed with TensorRT.
Language: Python - Size: 16.7 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 50 - Forks: 8

Andong-Li-speech/EaBNet
This is the repo of the manuscript "Embedding and Beamforming: All-Neural Causal Beamformer for Multichannel Speech Enhancement", which was submitted to ICASSP2022.
Language: Python - Size: 101 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 50 - Forks: 8

supikiti/PNCC
An implementation of Power Normalized Cepstral Coefficients (PNCC)
Language: Python - Size: 25.4 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 47 - Forks: 10

anton-jeran/MULTI-AUDIODEC
This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.
Language: Python - Size: 7.41 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 46 - Forks: 6

archiki/Robust-E2E-ASR
This repository contains the code for our upcoming paper An Investigation of End-to-End Models for Robust Speech Recognition at ICASSP 2021.
Language: Python - Size: 141 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 46 - Forks: 10

ghunkins/Voice-Denoising-AN
A Conditional Generative Adversarial Network (cGAN) was adapted for the task of source de-noising of noisy voice auditory images. The base architecture is adapted from Pix2Pix.
Language: Python - Size: 224 MB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 46 - Forks: 8

jqi41/Pytorch-Tensor-Train-Network
Jun and Huck's PyTorch-Tensor-Train Network Toolbox
Language: Jupyter Notebook - Size: 306 MB - Last synced at: 8 months ago - Pushed at: about 2 years ago - Stars: 44 - Forks: 16

MartinMashalov/VoiceCloning
Generative voice cloning model using TTS synthesis with state-of-the-art Zero-Shot Multi-Speaker functionality. A web API built with the YourTTS TTS model to clone and generate realistic audio waves
Language: Python - Size: 15.6 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 44 - Forks: 7

Audio-WestlakeU/RVAE-EM
Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function" [ICASSP2024]
Language: Python - Size: 55.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 42 - Forks: 4

cogmhear/avse_challenge (fork of claritychallenge/clarity)
COG-MHEAR Audio-Visual Speech Enhancement Challenge
Language: Python - Size: 774 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 40 - Forks: 11

winddori2002/MANNER
MANNER: Multi-view Attention Network for Noise ERasure (Speech enhancement in time-domain)
Language: Python - Size: 2.02 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 40 - Forks: 7

haoxiangsnr/SNR-Based-Progressive-Learning-of-Deep-Neural-Network-for-Speech-Enhancement
Implementation of the paper "SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement."
Language: Python - Size: 33.2 KB - Last synced at: over 2 years ago - Pushed at: about 6 years ago - Stars: 38 - Forks: 11

yuzhouhe2000/OMLSA-IMCRA
Python implementation of OMLSA+IMCRA algorithm for speech enhancement.
Language: Python - Size: 8.08 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 37 - Forks: 16

ZitengWang/nn_mask
multichannel linear filters based on mask estimation neural networks for CHiME4
Language: Python - Size: 3.41 MB - Last synced at: 2 months ago - Pushed at: about 7 years ago - Stars: 37 - Forks: 19

will-rice/denoisers
Simple PyTorch Denoisers for Waveform Audio
Language: Python - Size: 842 KB - Last synced at: 13 days ago - Pushed at: about 2 months ago - Stars: 35 - Forks: 2

LXP-Never/TCNN
TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain
Language: Python - Size: 398 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 35 - Forks: 5

jerrygood0703/speech-enhancement-WGAN
speech enhancement GAN on waveform/log-power-spectrum data using Improved WGAN
Language: Python - Size: 43.9 KB - Last synced at: 10 months ago - Pushed at: about 7 years ago - Stars: 35 - Forks: 18

jhauret/vibravox
Speech to Phoneme, Bandwidth Extension and Speaker Verification using the Vibravox dataset.
Language: Python - Size: 2.25 MB - Last synced at: 5 days ago - Pushed at: 17 days ago - Stars: 34 - Forks: 2

aim-qmul/sdx23-aimless
Source Separation training codebase for the Sound Demixing Challenge 2023.
Language: Python - Size: 342 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 34 - Forks: 0

AkojimaSLP/Frame-by-frame-closed-form-update-for-mask-based-adaptive-MVDR-beamforming
speech-enhancement
Language: Python - Size: 2.41 MB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 34 - Forks: 15

RusselZHANG/Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement
This is the microphone array generalization investigation based on previous Narrow Band Deep Filtering methods.
Language: Python - Size: 131 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 33 - Forks: 7

hangtingchen/Beam-Guided-TasNet
Beam-guided TasNet
Language: Python - Size: 23.3 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 33 - Forks: 7

seorim0/DNN-based-Speech-Enhancement-in-the-frequency-domain
DNN-based SE in the frequency domain using Pytorch. You can test some state-of-the-art networks using T-F masking or spectral mapping method.
Language: Python - Size: 857 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 32 - Forks: 10

vipchengrui/MASG
Microphone array speech generator (MASG) for room acoustics
Language: Jupyter Notebook - Size: 21.1 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 31 - Forks: 10

Ryuk17/noise-xorcist
Single Channel Speech Enhancement Methods and Toolbox
Language: Python - Size: 56.6 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 29 - Forks: 8

JaeBinCHA7/Nested-U-Net-based-Real-time-Speech-Enhancement-Mobile-App
Real-time speech enhancement mobile app using Nested U-Net
Language: Python - Size: 31.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 28 - Forks: 4

linksense/ConvolutionaNeuralNetworksToEnhanceCodedSpeech
In this work we propose two postprocessing approaches applying convolutional neural networks (CNNs) either in the time domain or the cepstral domain to enhance the coded speech without any modification of the codecs. The time domain approach follows an end-to-end fashion, while the cepstral domain approach uses analysis-synthesis with cepstral domain features. The proposed postprocessors in both domains are evaluated for various narrowband and wideband speech codecs in a wide range of conditions. The proposed postprocessor improves speech quality (PESQ) by up to 0.25 MOS-LQO points for G.711, 0.30 points for G.726, 0.82 points for G.722, and 0.26 points for adaptive multirate wideband codec (AMR-WB). In a subjective CCR listening test, the proposed postprocessor on G.711-coded speech exceeds the speech quality of an ITU-T-standardized postfilter by 0.36 CMOS points, and obtains a clear preference of 1.77 CMOS points compared to G.711, even on par with uncoded speech.
Language: Python - Size: 597 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 27 - Forks: 11

Hguimaraes/SEWUNet
[Research] Monaural Speech Enhancement through Wave-U-Net (SEWUNet)
Language: Python - Size: 14.7 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 26 - Forks: 4
