An open API service providing repository metadata for many open source software ecosystems.

Topic: "speech-enhancement"

speechbrain/speechbrain

A PyTorch-based Speech Toolkit

Language: Python - Size: 98 MB - Last synced at: 18 days ago - Pushed at: 19 days ago - Stars: 9,838 - Forks: 1,492

espnet/espnet

End-to-End Speech Processing Toolkit

Language: Python - Size: 1.14 GB - Last synced at: 19 days ago - Pushed at: 22 days ago - Stars: 9,113 - Forks: 2,263

Rikorose/DeepFilterNet

Noise suppression using deep filtering

Language: Python - Size: 171 MB - Last synced at: 4 days ago - Pushed at: 8 months ago - Stars: 3,093 - Forks: 289

modelscope/ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Language: Python - Size: 259 MB - Last synced at: 17 days ago - Pushed at: about 1 month ago - Stars: 2,792 - Forks: 217

asteroid-team/asteroid

The PyTorch-based audio source separation toolkit for researchers

Language: Python - Size: 5.88 MB - Last synced at: 18 days ago - Pushed at: 5 months ago - Stars: 2,378 - Forks: 433

resemble-ai/resemble-enhance

AI powered speech denoising and enhancement

Language: Python - Size: 23.4 KB - Last synced at: 17 days ago - Pushed at: 6 months ago - Stars: 1,782 - Forks: 205

haoheliu/voicefixer

General Speech Restoration

Language: Python - Size: 3.76 MB - Last synced at: 17 days ago - Pushed at: 4 months ago - Stars: 1,149 - Forks: 139

ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Language: Python - Size: 18.2 MB - Last synced at: 15 days ago - Pushed at: 10 months ago - Stars: 1,078 - Forks: 81

JusperLee/Speech-Separation-Paper-Tutorial

Must-read papers for speech separation based on neural networks

Size: 48.8 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 779 - Forks: 137

nanahou/Awesome-Speech-Enhancement

A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.

Language: MATLAB - Size: 25.2 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 762 - Forks: 151

k2kobayashi/sprocket

Voice Conversion Tool Kit

Language: Python - Size: 1.75 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 600 - Forks: 115

Audio-WestlakeU/FullSubNet

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

Language: Python - Size: 892 KB - Last synced at: 7 months ago - Pushed at: almost 2 years ago - Stars: 552 - Forks: 156

breizhn/DTLN

Tensorflow 2.x implementation of the DTLN real time speech denoising model. With TF-lite, ONNX and real-time audio processing support.

Language: Python - Size: 25.5 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 501 - Forks: 143

anicolson/DeepXi

Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.

Language: MATLAB - Size: 497 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 463 - Forks: 119

double22a/speech_dataset

The dataset of Speech Recognition

Size: 74.2 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 413 - Forks: 77

funcwj/setk

Tools for Speech Enhancement integrated with Kaldi

Language: Python - Size: 36.3 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 410 - Forks: 91

schmiph2/pysepm

Python implementation of performance metrics in Loizou's Speech Enhancement book

Language: Python - Size: 1.85 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 404 - Forks: 88

yxlu-0102/MP-SENet

Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement

Language: Python - Size: 498 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 386 - Forks: 60

jzi040941/PercepNet

Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech

Language: C++ - Size: 31.6 MB - Last synced at: 28 days ago - Pushed at: over 2 years ago - Stars: 344 - Forks: 94

shahules786/mayavoz

Pytorch based speech enhancement toolkit.

Language: Python - Size: 1.09 MB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 338 - Forks: 26

yongxuUSTC/sednn

Deep learning based speech enhancement using Keras or PyTorch, made easy to use

Language: Python - Size: 5.56 MB - Last synced at: 11 months ago - Pushed at: over 5 years ago - Stars: 334 - Forks: 126

seanwood/gcc-nmf

Real-time GCC-NMF Blind Speech Separation and Enhancement

Language: Python - Size: 43.2 MB - Last synced at: 19 days ago - Pushed at: about 6 years ago - Stars: 319 - Forks: 134

haoxiangsnr/A-Convolutional-Recurrent-Neural-Network-for-Real-Time-Speech-Enhancement

A minimum unofficial implementation of the "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN) using PyTorch

Language: Python - Size: 43.9 KB - Last synced at: 7 months ago - Pushed at: almost 5 years ago - Stars: 315 - Forks: 58

aishoot/LSTM_PIT_Speech_Separation

Two-talker Speech Separation with LSTM/BLSTM by Permutation Invariant Training method.

Language: Jupyter Notebook - Size: 7.38 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 308 - Forks: 90

fgnt/pb_bss

Collection of EM algorithms for blind source separation of audio signals

Language: Python - Size: 635 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 286 - Forks: 61

haoheliu/voicefixer_main

General Speech Restoration

Language: Python - Size: 21.5 MB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 278 - Forks: 56

haoxiangsnr/Wave-U-Net-for-Speech-Enhancement

Wave-U-Net implemented in PyTorch and adapted to speech enhancement.

Language: Python - Size: 511 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 278 - Forks: 64

Xiaobin-Rong/gtcrn

The official implementation of GTCRN, an ultra-lite speech enhancement model.

Language: Python - Size: 10.6 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 264 - Forks: 48

AkojimaSLP/Beamforming-for-speech-enhancement

Simple delay-and-sum, MVDR and CGMM-MVDR

Language: Python - Size: 3.18 MB - Last synced at: 2 months ago - Pushed at: over 6 years ago - Stars: 257 - Forks: 74

jtkim-kaist/Speech-enhancement

Deep neural network based speech enhancement toolkit

Language: MATLAB - Size: 187 MB - Last synced at: 2 months ago - Pushed at: almost 6 years ago - Stars: 213 - Forks: 62

huyanxin/phasen

An unofficial PyTorch implementation of Microsoft's PHASEN

Language: Python - Size: 2.07 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 198 - Forks: 47

echocatzh/MTFAA-Net

Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement

Language: Python - Size: 22.5 KB - Last synced at: 7 months ago - Pushed at: over 2 years ago - Stars: 195 - Forks: 56

audiolabs/torch-pesq

PyTorch implementation of the Perceptual Evaluation of Speech Quality for wideband audio

Language: Python - Size: 5.61 MB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 188 - Forks: 15

sekiguchi92/SoundSourceSeparation

The code for multi-channel source separation and dereverberation such as FastMNMF1, FastMNMF2, and AR-FastMNMF2.

Language: Python - Size: 31.6 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 170 - Forks: 30

skirdey/voicerestore

VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration

Language: Python - Size: 4.97 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 160 - Forks: 17

claritychallenge/clarity

Clarity Challenge toolkit - software for building Clarity Challenge systems

Language: Python - Size: 56.8 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 158 - Forks: 59

KyleZhang1118/Voice-Separation-and-Enhancement

A framework for quick testing and comparing multi-channel speech enhancement and separation methods, such as DSB, MVDR, LCMV, GEVD beamforming and ICA, FastICA, IVA, AuxIVA, OverIVA, ILRMA, FastMNMF.

Language: MATLAB - Size: 35.5 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 157 - Forks: 35

funcwj/aps

A personal toolkit for single/multi-channel speech recognition & enhancement & separation.

Language: Python - Size: 108 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 142 - Forks: 28

james34602/SpleeterRT

Real-time monaural source separation based on a fully convolutional neural network operating in the time-frequency domain.

Language: C - Size: 46 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 138 - Forks: 16

helianvine/fdndlp

A speech dereverberation algorithm, also called WPE

Language: Python - Size: 1.2 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 136 - Forks: 58

madhavmk/Noise2Noise-audio_denoising_without_clean_training_data

Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise Approach". Paper accepted at the INTERSPEECH 2021 conference. This paper tackles the problem of the heavy dependence of clean speech data required by deep learning based audio denoising methods by showing that it is possible to train deep speech denoising networks using only noisy speech samples.

Language: Jupyter Notebook - Size: 348 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 125 - Forks: 37

tech-podcasts/SpeechEnhancement

Speech Enhancement Tools Based on Deep Learning

Language: Go - Size: 1.35 MB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 121 - Forks: 21

Audio-WestlakeU/RealMAN

A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurIPS 2024]

Language: Python - Size: 62.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 119 - Forks: 13

haoxiangsnr/IRM-based-Speech-Enhancement-using-LSTM

Ideal Ratio Mask (IRM) Estimation based Speech Enhancement using LSTM

Language: Python - Size: 1.48 MB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 116 - Forks: 25

haamoon/mmtm

Implementation of CVPR 2020 paper "MMTM: Multimodal Transfer Module for CNN Fusion"

Language: Python - Size: 47.9 KB - Last synced at: 5 months ago - Pushed at: almost 5 years ago - Stars: 112 - Forks: 21

dansuh17/segan-pytorch

SEGAN pytorch implementation https://arxiv.org/abs/1703.09452

Language: Python - Size: 82 KB - Last synced at: about 2 months ago - Pushed at: about 6 years ago - Stars: 108 - Forks: 32

jkjaer/fastF0Nls

C++ and MATLAB code for fast and accurate fundamental frequency estimation

Language: C++ - Size: 5.02 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 100 - Forks: 24

RookieJunChen/Inter-SubNet

The official PyTorch implementation of "Inter-SubNet: Speech Enhancement with Subband Interaction", accepted by ICASSP 2023.

Language: Python - Size: 93.8 KB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 95 - Forks: 12

ConferencingSpeech/ConferencingSpeech2021

Conferencing Speech Challenge

Language: Python - Size: 3 MB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 94 - Forks: 33

haoheliu/torchsubband

Pytorch implementation of subband decomposition

Language: HTML - Size: 374 KB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 92 - Forks: 13

line/open-universe

Open implementation of UNIVERSE and UNIVERSE++ diffusion-based speech enhancement models.

Language: Python - Size: 6.42 MB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 91 - Forks: 10

haoxiangsnr/spiking-fullsubnet

Official repository of Spiking-FullSubNet, the Intel N-DNS Challenge Algorithmic Track Winner.

Language: Python - Size: 154 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 90 - Forks: 15

mikeroyal/NLP-Guide

Natural Language Processing (NLP). Covering topics such as Tokenization, Part Of Speech tagging (POS), Machine translation, Named Entity Recognition (NER), Classification, and Sentiment analysis.

Language: Python - Size: 315 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 86 - Forks: 15

Xiaobin-Rong/deepvqe

An unofficial implementation of DeepVQE proposed by Microsoft Corp.

Language: Python - Size: 164 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 84 - Forks: 21

Speech-Interaction-Technology-Aalto-U/itsp

Introduction to Speech Processing

Language: Jupyter Notebook - Size: 254 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 82 - Forks: 15

vipchengrui/traditional-speech-enhancement

Spectral Subtraction, Wiener Filtering, MMSE

Language: MATLAB - Size: 39.8 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 81 - Forks: 34

Xiaobin-Rong/SEtrain

A training code template for DNN-based speech enhancement.

Language: Python - Size: 48.8 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 76 - Forks: 27

Takaaki-Saeki/ssl_speech_restoration

SelfRemaster: SSL Speech Restoration

Language: Python - Size: 1.34 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 75 - Forks: 6

a-n-rose/Python-Sound-Tool

SoundPy (alpha stage) is a research-based python package for speech and sound. Applications include deep-learning, filtering, speech-enhancement, audio augmentation, feature extraction and visualization, dataset and audio file conversion, and beyond.

Language: Jupyter Notebook - Size: 180 MB - Last synced at: 27 days ago - Pushed at: 5 months ago - Stars: 74 - Forks: 7

nglehuy/semetrics

Speech Enhancement Metrics (PESQ, CSIG, CBAK, COVL)

Language: MATLAB - Size: 289 KB - Last synced at: 6 days ago - Pushed at: almost 5 years ago - Stars: 74 - Forks: 13

jhauret/eben

Repo for source code of EBEN: Extreme Bandwidth Extension Network

Language: Python - Size: 40.7 MB - Last synced at: 5 days ago - Pushed at: 17 days ago - Stars: 73 - Forks: 9

Picovoice/koala

On-device noise suppression powered by deep learning

Language: Python - Size: 28.9 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 70 - Forks: 4

Andong-Li-speech/GaGNet

This repo provides the network code and the processed samples of the manuscript "Glance and Gaze: A Collaborative Learning Framework for Single-channel Speech Enhancement", which was accepted by Elsevier Applied Acoustics.

Language: Python - Size: 127 MB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 67 - Forks: 8

cyrta/awesome-speech-enhancement

A curated list of awesome Speech Enhancement papers, libraries, datasets, and other resources.

Size: 13.7 KB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 67 - Forks: 15

eesungkim/Speech_Enhancement_MMSE-STSA

A statistical model-based Speech Enhancement Using MMSE-STSA

Language: Python - Size: 1.53 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 66 - Forks: 27

chenwj1989/python-speech-enhancement

a python library for speech enhancement

Language: Python - Size: 1.46 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 65 - Forks: 13

chanil1218/DCUnet.pytorch

Phase-Aware Speech Enhancement with Deep Complex U-Net

Language: Python - Size: 13.7 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 64 - Forks: 22

DiegoLeon96/Neural-Speech-Dereverberation

Machine and Deep Learning models for speech dereverberation

Language: Python - Size: 14.2 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 63 - Forks: 15

auspicious3000/deepbeam

Deep learning based Speech Beamforming

Language: Jupyter Notebook - Size: 33.5 MB - Last synced at: 2 months ago - Pushed at: about 7 years ago - Stars: 62 - Forks: 18

01Zhangbw/Speech-and-audio-papers-Top-Conference

Papers in the speech and audio field. Currently covers: ICLR2025-2023, ICML2025-2023, NeurIPS2024-2023, ACMMM2024, AAAI2025-2024, ACL2025-2024, EMNLP2024, NAACL2025, IJCAI2024, ECCV2024

Size: 290 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 61 - Forks: 1

seorim0/DCCRN-with-various-loss-functions

DCCRN with various loss functions

Language: Python - Size: 27.6 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 59 - Forks: 17

Audio-WestlakeU/McNet

The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023

Language: Python - Size: 54.3 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 53 - Forks: 5

Xiaobin-Rong/TRT-SE

An example of a speech enhancement model deployed with TensorRT.

Language: Python - Size: 16.7 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 50 - Forks: 8

Andong-Li-speech/EaBNet

This is the repo of the manuscript "Embedding and Beamforming: All-Neural Causal Beamformer for Multichannel Speech Enhancement", which was submitted to ICASSP2022.

Language: Python - Size: 101 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 50 - Forks: 8

supikiti/PNCC

An implementation of Power Normalized Cepstral Coefficients: PNCC

Language: Python - Size: 25.4 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 47 - Forks: 10

anton-jeran/MULTI-AUDIODEC

This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.

Language: Python - Size: 7.41 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 46 - Forks: 6

archiki/Robust-E2E-ASR

This repository contains the code for our upcoming paper An Investigation of End-to-End Models for Robust Speech Recognition at ICASSP 2021.

Language: Python - Size: 141 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 46 - Forks: 10

ghunkins/Voice-Denoising-AN

A Conditional Generative Adversarial Network (cGAN) was adapted for the task of source de-noising of noisy voice auditory images. The base architecture is adapted from Pix2Pix.

Language: Python - Size: 224 MB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 46 - Forks: 8

jqi41/Pytorch-Tensor-Train-Network

Jun and Huck's PyTorch-Tensor-Train Network Toolbox

Language: Jupyter Notebook - Size: 306 MB - Last synced at: 8 months ago - Pushed at: about 2 years ago - Stars: 44 - Forks: 16

MartinMashalov/VoiceCloning

Generative voice cloning model using TTS synthesis with state-of-the-art Zero-Shot Multi-Speaker functionality. A web API built with the YourTTS TTS model to clone and generate realistic audio waves.

Language: Python - Size: 15.6 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 44 - Forks: 7

Audio-WestlakeU/RVAE-EM

Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function" [ICASSP2024]

Language: Python - Size: 55.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 42 - Forks: 4

cogmhear/avse_challenge (fork of claritychallenge/clarity)

COG-MHEAR Audio-Visual Speech Enhancement Challenge

Language: Python - Size: 774 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 40 - Forks: 11

winddori2002/MANNER

MANNER: Multi-view Attention Network for Noise ERasure (Speech enhancement in time-domain)

Language: Python - Size: 2.02 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 40 - Forks: 7

haoxiangsnr/SNR-Based-Progressive-Learning-of-Deep-Neural-Network-for-Speech-Enhancement

Implementation of the paper "SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement."

Language: Python - Size: 33.2 KB - Last synced at: over 2 years ago - Pushed at: about 6 years ago - Stars: 38 - Forks: 11

yuzhouhe2000/OMLSA-IMCRA

Python implementation of OMLSA+IMCRA algorithm for speech enhancement.

Language: Python - Size: 8.08 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 37 - Forks: 16

ZitengWang/nn_mask

multichannel linear filters based on mask estimation neural networks for CHiME4

Language: Python - Size: 3.41 MB - Last synced at: 2 months ago - Pushed at: about 7 years ago - Stars: 37 - Forks: 19

will-rice/denoisers

Simple PyTorch Denoisers for Waveform Audio

Language: Python - Size: 842 KB - Last synced at: 13 days ago - Pushed at: about 2 months ago - Stars: 35 - Forks: 2

LXP-Never/TCNN

TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

Language: Python - Size: 398 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 35 - Forks: 5

jerrygood0703/speech-enhancement-WGAN

speech enhancement GAN on waveform/log-power-spectrum data using Improved WGAN

Language: Python - Size: 43.9 KB - Last synced at: 10 months ago - Pushed at: about 7 years ago - Stars: 35 - Forks: 18

jhauret/vibravox

Speech to Phoneme, Bandwidth Extension and Speaker Verification using the Vibravox dataset.

Language: Python - Size: 2.25 MB - Last synced at: 5 days ago - Pushed at: 17 days ago - Stars: 34 - Forks: 2

aim-qmul/sdx23-aimless

Source Separation training codebase for the Sound Demixing Challenge 2023.

Language: Python - Size: 342 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 34 - Forks: 0

AkojimaSLP/Frame-by-frame-closed-form-update-for-mask-based-adaptive-MVDR-beamforming

speech-enhancement

Language: Python - Size: 2.41 MB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 34 - Forks: 15

RusselZHANG/Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement

This is the microphone array generalization investigation based on previous Narrow Band Deep Filtering methods.

Language: Python - Size: 131 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 33 - Forks: 7

hangtingchen/Beam-Guided-TasNet

Beam-guided TasNet

Language: Python - Size: 23.3 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 33 - Forks: 7

seorim0/DNN-based-Speech-Enhancement-in-the-frequency-domain

DNN-based SE in the frequency domain using Pytorch. You can test some state-of-the-art networks using T-F masking or spectral mapping method.

Language: Python - Size: 857 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 32 - Forks: 10

vipchengrui/MASG

Microphone array speech generator (MASG) for room acoustics

Language: Jupyter Notebook - Size: 21.1 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 31 - Forks: 10

Ryuk17/noise-xorcist

Single Channel Speech Enhancement Methods and Toolbox

Language: Python - Size: 56.6 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 29 - Forks: 8

JaeBinCHA7/Nested-U-Net-based-Real-time-Speech-Enhancement-Mobile-App

Real-time speech enhancement mobile app using Nested U-Net

Language: Python - Size: 31.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 28 - Forks: 4

linksense/ConvolutionaNeuralNetworksToEnhanceCodedSpeech

In this work we propose two postprocessing approaches applying convolutional neural networks (CNNs) either in the time domain or the cepstral domain to enhance the coded speech without any modification of the codecs. The time domain approach follows an end-to-end fashion, while the cepstral domain approach uses analysis-synthesis with cepstral domain features. The proposed postprocessors in both domains are evaluated for various narrowband and wideband speech codecs in a wide range of conditions. The proposed postprocessor improves speech quality (PESQ) by up to 0.25 MOS-LQO points for G.711, 0.30 points for G.726, 0.82 points for G.722, and 0.26 points for adaptive multirate wideband codec (AMR-WB). In a subjective CCR listening test, the proposed postprocessor on G.711-coded speech exceeds the speech quality of an ITU-T-standardized postfilter by 0.36 CMOS points, and obtains a clear preference of 1.77 CMOS points compared to G.711, even on par with uncoded speech.

Language: Python - Size: 597 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 27 - Forks: 11

Hguimaraes/SEWUNet

[Research] Monaural Speech Enhancement through Wave-U-Net (SEWUNet)

Language: Python - Size: 14.7 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 26 - Forks: 4

Related Topics
pytorch 45 deep-learning 44 speech-processing 40 speech 35 speech-recognition 27 speech-separation 24 audio 18 noise-reduction 18 python 16 audio-processing 14 tensorflow 11 source-separation 11 speech-synthesis 11 beamforming 11 deep-neural-networks 11 noise-suppression 11 speech-denoising 10 signal-processing 10 speech-to-text 10 machine-learning 9 denoising 9 denoise 9 asr 9 dereverberation 8 dataset 7 automatic-speech-recognition 7 keras 7 pytorch-lightning 6 unet 6 real-time 6 tts 6 text-to-speech 5 dsp 5 pesq 5 matlab 5 speech-analysis 5 multi-channel 5 speechenhancement 4 voice-conversion 4 generative-adversarial-network 4 dnn 4 chime-challenge 4 nested-unet 4 segan 4 cnn 4 audio-denoising 4 chime-7-udase 4 lightweight 4 python3 4 convolutional-neural-networks 4 wave-u-net 3 dns-challenge 3 end-to-end 3 audio-visual 3 awesome 3 wav 3 machine-translation 3 speaker-diarization 3 speech-translation 3 speech-restoration 3 u2net 3 pytorch-implementation 3 kaldi 3 lstm 3 speaker-recognition 3 bandwidth-extension 3 speaker-verification 3 multichannel-microphone-arrays 3 audio-visual-speech-separation 3 vocoder 2 fft 2 awesome-list 2 audio-enhancement 2 fastapi 2 generative-model 2 onnx 2 resnet 2 pretrained-models 2 codec 2 flask 2 unet-image-segmentation 2 complex-networks 2 deepxi 2 icassp2024 2 deep-xi 2 attention 2 hearing-aids 2 noise-cancellation 2 perceptual-losses 2 datasets 2 mvdr 2 speechrecognition 2 noise 2 mfcc 2 wavenet 2 real-time-processing 2 speaker-identification 2 rnn 2 audio-separation 2 reaper 2