Topic: "mfcc"
ddbourgin/numpy-ml
Machine learning, in numpy
Language: Python - Size: 10 MB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 16,067 - Forks: 3,801

aubio/aubio
a library for audio and music analysis
Language: C - Size: 11.3 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 3,432 - Forks: 391

libAudioFlux/audioFlux
A library for audio and music analysis, feature extraction.
Language: C - Size: 7.11 MB - Last synced at: 6 months ago - Pushed at: 12 months ago - Stars: 2,796 - Forks: 118

x4nth055/emotion-recognition-using-speech
Building and training a Speech Emotion Recognizer that predicts human emotions using Python, scikit-learn and Keras
Language: Python - Size: 944 MB - Last synced at: 29 days ago - Pushed at: over 1 year ago - Stars: 621 - Forks: 242

ar1st0crat/NWaves
.NET DSP library with a lot of audio processing functions
Language: C# - Size: 7.28 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 487 - Forks: 77

SuperKogito/spafe
:sound: spafe: Simplified Python Audio Features Extraction
Language: Python - Size: 20.7 MB - Last synced at: 8 days ago - Pushed at: about 2 months ago - Stars: 471 - Forks: 79

adamstark/Gist
A C++ Library for Audio Analysis
Language: C++ - Size: 938 KB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 378 - Forks: 76

sp-nitech/SPTK
A suite of speech signal processing tools
Language: C++ - Size: 5.57 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 232 - Forks: 27

jsingh811/pyAudioProcessing
Audio feature extraction and classification
Language: Python - Size: 22.9 MB - Last synced at: 2 days ago - Pushed at: almost 2 years ago - Stars: 225 - Forks: 39

gionanide/Speech_Signal_Processing_and_Classification
Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, in developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) could also be derived. The aforementioned, so to speak, traditional features will be tested against agnostic features extracted by convolutive neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian Mixture Model based classifiers, K-nearest neighbor classifiers, Bayes classifiers, as well as Deep Neural Networks. The Massachusetts Eye and Ear Infirmary Dataset (MEEI-Dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources, such as KALDI, will be used toward achieving our goal. Comparisons will be made against [6-8].
Language: Python - Size: 827 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 220 - Forks: 62

SuperKogito/Voice-based-gender-recognition
:sound: :boy: :girl:Voice based gender recognition using Mel-frequency cepstrum coefficients (MFCC) and Gaussian mixture models (GMM)
Language: Python - Size: 8.96 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 213 - Forks: 68

csukuangfj/kaldifeat
Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - Provide C++ & Python API
Language: C++ - Size: 10.3 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 197 - Forks: 38

sp-nitech/diffsptk
A differentiable version of SPTK
Language: Python - Size: 1.65 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 182 - Forks: 16

SuyashMore/MevonAI-Speech-Emotion-Recognition
Identify the emotion of multiple speakers in an Audio Segment
Language: C - Size: 63.6 MB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 169 - Forks: 48

tympanix/subsync
Synchronize your subtitles using machine learning
Language: Python - Size: 468 KB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 153 - Forks: 16

ewan-xu/LibrosaCpp
LibrosaCpp is a C++ implementation of librosa to compute short-time Fourier transform coefficients, mel spectrogram or MFCC
Language: C++ - Size: 2.55 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 139 - Forks: 39

amanbasu/speech-emotion-recognition
Detecting emotions using MFCC features of human speech using Deep Learning
Language: Jupyter Notebook - Size: 2.53 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 118 - Forks: 40

ZhuoZhuoCrayon/AcousticKeyBoard-Web
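Acoustic keyboard | ❓ A wild idea: build a "toy" that can tell which key was pressed from the sound of the keystroke, while learning signal processing / deep learning / Android / Django.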
Language: Python - Size: 68.3 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 86 - Forks: 5

GauravWaghmare/Speaker-Identification
A program for automatic speaker identification using deep learning techniques.
Language: Python - Size: 436 KB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 83 - Forks: 29

MycroftAI/sonopy
A simple audio feature extraction library
Language: Python - Size: 8.79 KB - Last synced at: 21 days ago - Pushed at: almost 6 years ago - Stars: 79 - Forks: 21

mathquis/node-personal-wakeword
Personal wake word detector
Language: JavaScript - Size: 103 KB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 63 - Forks: 8

ZitengWang/python_kaldi_features
Python code to extract MFCC and FBANK speech features for Kaldi
Language: Python - Size: 79.1 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 60 - Forks: 16

georgid/AlignmentDuration
Lyrics-to-audio alignment system. Based on machine learning algorithms: Hidden Markov Models with Viterbi forced alignment. The alignment is explicitly aware of the durations of musical notes. The phonetic models are classified with an MLP deep neural network.
Language: Python - Size: 342 MB - Last synced at: 12 months ago - Pushed at: about 5 years ago - Stars: 55 - Forks: 6

SuperKogito/Voice-based-speaker-identification
:sound: :boy: :girl: :woman: :man: Speaker identification using voice MFCCs and GMM
Language: Python - Size: 105 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 54 - Forks: 15

stefantaubert/mel-cepstral-distance
A Python library for computing the Mel-Cepstral Distance (Mel-Cepstral Distortion, MCD) between two inputs. This implementation is based on the method proposed by Robert F. Kubichek in "Mel-Cepstral Distance Measure for Objective Speech Quality Assessment".
Language: Python - Size: 59.8 MB - Last synced at: 3 days ago - Pushed at: 10 days ago - Stars: 53 - Forks: 10

aubio/vamp-aubio-plugins
aubio plugins for Vamp
Language: C++ - Size: 440 KB - Last synced at: 27 days ago - Pushed at: over 7 years ago - Stars: 48 - Forks: 12

zafarrafii/Zaf-Python
Zafar's Audio Functions in Python for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT spectrogram, CQT chromagram, DCT, DST, MDCT, inverse MDCT.
Language: Jupyter Notebook - Size: 116 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 47 - Forks: 11

supikiti/PNCC
An implementation of Power Normalized Cepstral Coefficients: PNCC
Language: Python - Size: 25.4 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 47 - Forks: 10

zafarrafii/Zaf-Matlab
Zafar's Audio Functions in Matlab for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT spectrogram, CQT chromagram, DCT, DST, MDCT, inverse MDCT.
Language: Jupyter Notebook - Size: 86 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 43 - Forks: 14

mechanicalsea/spectra
Spectra extraction tutorials based on torch and torchaudio.
Language: Jupyter Notebook - Size: 3.31 MB - Last synced at: 6 months ago - Pushed at: almost 2 years ago - Stars: 40 - Forks: 4

sheelabhadra/Emergency-Vehicle-Detection
Python implementation of papers on emergency vehicle detection using audio signals
Language: Jupyter Notebook - Size: 7.78 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 39 - Forks: 13

pulakk/Live-Audio-MFCC
Live Audio MFCC Visualization in the browser using Web Audio API - https://pulakk.github.io/Live-Audio-MFCC/tutorial
Language: JavaScript - Size: 928 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 34 - Forks: 6

FragJage/SpeakerVoiceIdentifier
SpeakerVoiceIdentifier can recognize the voice of a speaker by learning.
Language: C++ - Size: 20.3 MB - Last synced at: about 2 months ago - Pushed at: about 8 years ago - Stars: 33 - Forks: 14

skaws2003/pytorch-mfcc
A pytorch implementation of MFCC.
Language: Python - Size: 65.4 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 31 - Forks: 1

k-farruh/speech-accent-detection
Everyone speaks a language with an accent, and a particular accent reflects a person's linguistic background. The model identifies the accent from an audio recording. Its output could be used to determine accents and to help students learning English reduce or improve their accents through training.
Language: Python - Size: 21.7 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 31 - Forks: 8

alicex2020/Mandarin-Tone-Classification
Deep learning using CNN for Mandarin Chinese tone classification
Language: Jupyter Notebook - Size: 489 KB - Last synced at: 9 months ago - Pushed at: about 6 years ago - Stars: 31 - Forks: 7

alicex2020/Deep-Learning-Lie-Detection
Use machine learning models to detect lies based solely on acoustic speech information
Language: Jupyter Notebook - Size: 837 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 30 - Forks: 10

dydtjr1128/Speaker-Recognition-using-NN
Speaker Recognition using Neural Network & Linear Regression
Language: Jupyter Notebook - Size: 46.8 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 29 - Forks: 7

zhengyima/DTW_Digital_Voice_Recognition
Speech recognition of the digits 0-9 based on DTW and MFCC features. DTW, MFCC, speech recognition, Chinese and English data, endpoint detection, Digital Voice Recognition.
Language: Python - Size: 6.26 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 28 - Forks: 4

GuitarsAI/BasicsMusicalInstrumClassifi
Basics of Musical Instruments Classification using Machine Learning
Language: Jupyter Notebook - Size: 13.8 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 28 - Forks: 12

linksense/ConvolutionaNeuralNetworksToEnhanceCodedSpeech
In this work we propose two postprocessing approaches applying convolutional neural networks (CNNs) either in the time domain or the cepstral domain to enhance the coded speech without any modification of the codecs. The time domain approach follows an end-to-end fashion, while the cepstral domain approach uses analysis-synthesis with cepstral domain features. The proposed postprocessors in both domains are evaluated for various narrowband and wideband speech codecs in a wide range of conditions. The proposed postprocessor improves speech quality (PESQ) by up to 0.25 MOS-LQO points for G.711, 0.30 points for G.726, 0.82 points for G.722, and 0.26 points for adaptive multirate wideband codec (AMR-WB). In a subjective CCR listening test, the proposed postprocessor on G.711-coded speech exceeds the speech quality of an ITU-T-standardized postfilter by 0.36 CMOS points, and obtains a clear preference of 1.77 CMOS points compared to G.711, even on par with uncoded speech.
Language: Python - Size: 597 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 27 - Forks: 11

FragIt/fragit-main
FragIt main repository
Language: Python - Size: 529 KB - Last synced at: 27 days ago - Pushed at: 29 days ago - Stars: 26 - Forks: 12

nipunmanral/Spoken-Language-Identification
Implement a GRU/LSTM model using Keras, and train it to classify the languages using MFCC features
Language: Python - Size: 6.84 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 25 - Forks: 16

DataXujing/ASR-paper
:fire: ASR tutorial: https://dataxujing.github.io/ASR-paper/
Size: 1.07 GB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 24 - Forks: 6

IhabBendidi/Voice-authentification-API
A RESTful API implementation of an authentication system using voice fingerprints
Language: Python - Size: 5.97 MB - Last synced at: about 1 month ago - Pushed at: about 5 years ago - Stars: 24 - Forks: 2

Abhay0899193/Speaker-Recognition
Speaker Recognition System using MFCC and GMM.
Language: Python - Size: 7.98 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 24 - Forks: 12

zafarrafii/CQHC-Python
Constant-Q harmonic coefficients (CQHCs), a timbre feature designed for music signals.
Language: Jupyter Notebook - Size: 84.5 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 23 - Forks: 1

ringabout/scim
[WIP] Speech recognition toolbox written in Nim. Based on Arraymancer.
Language: Nim - Size: 354 KB - Last synced at: 7 months ago - Pushed at: over 5 years ago - Stars: 23 - Forks: 0

geekysethi/audio_classification
Language: Python - Size: 1010 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 21 - Forks: 4

JavierAntoran/tiger-costume-voice-conversion
Voice Alignment and Conversion with Neural Networks and the WORLD codec.
Language: Jupyter Notebook - Size: 63.5 MB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 20 - Forks: 1

dhruvesh13/Audio-Genre-Classification
Automatic music genre classification using machine learning algorithms such as Logistic Regression and K-Nearest Neighbours
Language: Python - Size: 11.7 KB - Last synced at: about 1 month ago - Pushed at: over 7 years ago - Stars: 19 - Forks: 11

zhengyima/GMM_Digital_Voice_Recognition
Speech recognition of the digits 0-9 based on GMM and MFCC features. GMM, MFCC, speech recognition, Chinese data, sklearn, Digital Voice Recognition.
Language: Python - Size: 532 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 16 - Forks: 2

orbxball/timit-preprocessor
Extract mfcc vectors and phones from TIMIT dataset
Language: Shell - Size: 6.84 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 15 - Forks: 0

LexicalStressDetection/lexical-stress-detection
Deep Learning model for lexical stress detection in spoken English
Language: Python - Size: 2.43 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 15 - Forks: 2

ShoYamanishi/AndroidMFCC
26-Point MFCC & 512-Point FFT Generator & Visualizer in Java, C++, and NEON intrinsics
Language: C++ - Size: 6.02 MB - Last synced at: 26 days ago - Pushed at: over 5 years ago - Stars: 15 - Forks: 2

kleinzcy/speech_signal_processing
Language: Python - Size: 15.5 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 15 - Forks: 2

HassanHayat08/Interpretable-CNN-for-Big-Five-Personality-Traits-using-Audio-Data
We developed an interpretable CNN for the Big Five personality traits using human speech data. This project discovers the different frequency patterns of a human voice with respect to each of the five personality traits. It will help us understand the apparent personality of a person from his/her voice.
Language: Python - Size: 16.6 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 14 - Forks: 3

zhengyima/HMM_Digital_Voice_Recognition
Speech recognition of the digits 0-9 based on HMM and MFCC features. HMM, GMMHMM, MFCC, speech recognition, sklearn, Digital Voice Recognition.
Language: Python - Size: 764 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 13 - Forks: 4

amitchone/ASR
A Python 2.7 implementation of Mel Frequency Cepstral Coefficients (MFCC) and Dynamic Time Warping (DTW) algorithms for Automated Speech Recognition (ASR).
Language: Python - Size: 13.6 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 13 - Forks: 4

baggepinnen/LPVSpectral.jl
Least-squares (sparse) spectral estimation and (sparse) LPV spectral decomposition.
Language: Julia - Size: 424 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 12 - Forks: 6

anicolson/matlab_feat
Functions for creating speech features in MATLAB.
Language: MATLAB - Size: 39.1 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 12 - Forks: 5

yashbhalgat/Emotion-from-speech-MFCC
MATLAB code for extraction of Mel Frequency Cepstral Coefficients
Language: Matlab - Size: 279 KB - Last synced at: 5 months ago - Pushed at: about 9 years ago - Stars: 12 - Forks: 8

lucko515/Speech-commands-recognition
Recognizing common speech commands using Keras and Tensorflow.
Language: Python - Size: 11 MB - Last synced at: 26 days ago - Pushed at: over 6 years ago - Stars: 11 - Forks: 3

NeuroByte-Consulting/Speech-Emotion-Recognition-in-Tensorflow-Using-CNNs
Speech Emotion Recognition (SER) in Tensorflow using CNNs and CRNNs Based on Mel Spectrograms and Mel Frequency Cepstral Coefficients (MFCCs)
Language: Jupyter Notebook - Size: 15.6 MB - Last synced at: 21 days ago - Pushed at: 22 days ago - Stars: 9 - Forks: 0

pdadial/Speech_Emotion_Recognition_CNN-LSTM
CNN-LSTM based SER model using RAVDESS database
Language: Jupyter Notebook - Size: 202 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 2

sarthak268/Audio-Classification-using-MFCC-and-Spectrogram
Audio classification using a simple SVM classifier making use of MFCC and Spectrogram features coded from scratch
Language: Python - Size: 240 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 9 - Forks: 1

RBGTOP/Music-Genre-Recognition
Music genre classification using deep learning
Size: 1.95 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 8 - Forks: 0

acen20/cnn-tf-keras-audio-classification
Feature extraction from sound signals along with a complete CNN model and evaluations using TensorFlow, Keras, and librosa for MFCC generation
Language: Jupyter Notebook - Size: 5.34 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 4

grimmdaniel/personality-trait-prediction
Big Five personality trait prediction on First Impressions V2 dataset
Language: Jupyter Notebook - Size: 729 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 8 - Forks: 7

sahilsharma884/Music-Genre-Classification
Perform three types of feature extraction: STFT, MFCC and MelSpectrogram. Apply CNN/VGG with or without RNN architecture. Able to achieve 95% accuracy.
Language: Python - Size: 4.53 MB - Last synced at: almost 2 years ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 3

Ralireza/spoken-digit-recognition
Classifying English spoken digit by Hidden Markov Model
Language: Python - Size: 6.92 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 8 - Forks: 3

miselaytes-anton/whospeaks
Speaker recognition using Mel Frequency Cepstral Coefficients (MFCC) and Linde-Buzo-Gray (LBG) clustering algorithm
Language: JavaScript - Size: 28.1 MB - Last synced at: about 1 year ago - Pushed at: about 6 years ago - Stars: 8 - Forks: 3

lincolnhard/sound-feature-extraction-C
Useful feature extraction for next step classification
Language: C - Size: 9.77 KB - Last synced at: 4 months ago - Pushed at: over 7 years ago - Stars: 8 - Forks: 7

msaintfelix/TensorFlow_MusicGenre_Classifier
Classifying wav files with their MFCC
Language: Jupyter Notebook - Size: 608 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 3

FedericaPaoli1/stm32-speech-recognition-and-traduction
stm32-speech-recognition-and-traduction is a project developed for the Advances in Operating Systems exam at the University of Milan (academic year 2020-2021). It implements a speech recognition and speech-to-text translation system using a pre-trained machine learning model running on the stm32f407vg microcontroller.
Language: C - Size: 41.7 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 1

DenaJGibbon/MFCC-Vocal-Fingerprinting
Code to do MFCC feature extraction on gibbon calls and use LDA/SVM for classification
Language: R - Size: 9.28 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 7 - Forks: 0

harshitkgupta/Classification-of-Autism-Spectrum-Disorder
Machine Learning Approaches for Classification of Children with Autism Spectrum Disorder
Language: Jupyter Notebook - Size: 283 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 7 - Forks: 2

ragibson/MFCC-speech-recognition
Real-time speech recognition via "Mel-Frequency Cepstral Coefficients" neural networks.
Language: Jupyter Notebook - Size: 1.05 MB - Last synced at: 22 days ago - Pushed at: almost 6 years ago - Stars: 7 - Forks: 0

PranavPutsa1006/Speaker-Diarization
Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python
Language: Jupyter Notebook - Size: 20.2 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 1

Sangramsingkayte/Audio-Feature-Extraction
In sound processing, the mel-frequency cepstrum (MFC) is a representation of the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency. Mel-frequency cepstral coefficients (MFCCs) are coefficients that collectively make up an MFC.
Language: Jupyter Notebook - Size: 1.24 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 6 - Forks: 6
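
As a rough illustration of the pipeline that the description above outlines (power spectrum, mel-scale filterbank, logarithm, cosine transform), here is a minimal pure-Python sketch for a single frame. It is not taken from any repository in this list. The function names, the Hamming window, the HTK-style mel formula, and the filter and coefficient counts are all assumptions chosen for illustration, and real libraries differ in these details.

```python
import cmath
import math


def hz_to_mel(hz):
    # HTK-style mel formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive O(N^2) DFT power spectrum for bins 0..N/2 (illustration only)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points, expressed as fractional FFT-bin positions.
    edges = [mel_to_hz(max_mel * i / (n_filters + 1)) * n_fft / sample_rate
             for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        left, center, right = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for b in range(n_bins):
            if left < b <= center:
                row.append((b - left) / (center - left))
            elif center < b < right:
                row.append((right - b) / (right - center))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def dct2(values, n_coeffs):
    """Unnormalized DCT-II: the linear cosine transform step."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def mfcc_frame(frame, sample_rate, n_filters=10, n_coeffs=5):
    """MFCCs of one frame: window -> power spectrum -> mel -> log -> DCT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    power = power_spectrum(windowed)
    bank = mel_filterbank(n_filters, n, sample_rate)
    # Floor the energies so that log() never sees zero.
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
                    for row in bank]
    return dct2(log_energies, n_coeffs)


# Usage: a 128-sample frame of a 440 Hz sine sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(128)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs))
```

A full extractor would slide this over overlapping frames and use an FFT instead of the naive DFT; the naive version is kept here only so that the sketch has no dependencies.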

Guan-JW/GMM-Isolated-Speech-Recognition
Isolated-word speech recognition of the digits 0-9 using single-component GMMs built on MFCC features. MFCC, GMM, sklearn, isolated word recognition.
Language: Python - Size: 8.75 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 1

jefflai108/LSTM
Voice activity detection of noisy speech files with LSTM. LSTM is implemented with Keras. Data processing is done with Python, MATLAB, and Bash. Experiments are done on Johns Hopkins CLSP GPUs.
Language: Python - Size: 14.7 MB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 6 - Forks: 4

piruty/voice_actor_recog
Extract MFCC from movie files and detect speaker using it
Language: Python - Size: 372 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 5 - Forks: 0

justanotherinternetguy/XSpeech
XSpeech: A Novel Deep Learning Approach to Classifying Stutters
Language: Jupyter Notebook - Size: 10.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 5 - Forks: 0

zafarrafii/Zaf-Julia
Zafar's Audio Functions in Julia for audio signal analysis: STFT, inverse STFT, CQT kernel, CQT spectrogram, CQT chromagram, MFCC, DCT, DST, MDCT, inverse MDCT.
Language: Jupyter Notebook - Size: 60.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 1

bumblebee26/Gender-Recognition-System
Gender recognition based on human speech signals.
Language: Python - Size: 21.5 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 5 - Forks: 2

JavierAntoran/moby_dick_whale_audio_detection
Feature extraction, HMMs, Neural Nets, and Boosting for Kaggle Cornell Whale detection challenge.
Language: Jupyter Notebook - Size: 36.1 MB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 5 - Forks: 1

nnarenraju/sound-classification
Classification of Sounds Using Convolutional Neural Networks
Language: Python - Size: 11.7 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 5 - Forks: 4

8g6-new/c_spectrogram
A high performance spectrogram with STFT Mel and MFCC support in pure C
Language: C - Size: 190 MB - Last synced at: 20 days ago - Pushed at: 21 days ago - Stars: 4 - Forks: 0

FandosA/Singer_Recognition_Keras_TF
This project was my final Bachelor's degree thesis. In it I decided to mix my passion, music, and the syllabus that I liked the most in my degree, deep learning.
Language: Python - Size: 3.26 GB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

vasudev-sharma/Music-Genre-Classificattion-Using-Deep-Learning-
Language: Python - Size: 13.3 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 2

Rumeysakeskin/dtw-compare-audio-files
Compute the MFCCs and measure (dis)similarity between two audio files using DTW
Language: Python - Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 0
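
For readers unfamiliar with the technique named in the entry above, the following is a minimal sketch of dynamic time warping (DTW) between two sequences of feature vectors, such as per-frame MFCCs. It is not the repository's code: the function name `dtw_distance`, the Euclidean frame distance, and the unconstrained step pattern are assumptions made for illustration.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic O(len_a * len_b) DTW cost between two sequences of vectors."""
    inf = float("inf")
    rows, cols = len(seq_a), len(seq_b)
    # cost[i][j] = best cumulative cost aligning seq_a[:i] with seq_b[:j].
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            # Euclidean distance between the two frames (an assumed choice).
            d = math.sqrt(sum((x - y) ** 2
                              for x, y in zip(seq_a[i - 1], seq_b[j - 1])))
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a frame of seq_b
                                 cost[i][j - 1],      # repeat a frame of seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[rows][cols]


# Usage: one-dimensional "features" stand in for MFCC vectors.
a = [[0.0], [1.0], [2.0]]
b = [[0.0], [0.0], [1.0], [2.0]]   # same shape as a, with a stretched start
c = [[1.0], [2.0], [3.0]]          # a shifted up by one
print(dtw_distance(a, b))  # 0.0
print(dtw_distance(a, c))  # 2.0
```

A stretched copy of a sequence aligns at zero cost, which is why DTW suits recordings spoken at different speeds; a smaller cost means the two inputs are more similar.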

HanSeokhyeon/Speech_recognition_for_English_and_Korean
A speech recognition LAS model that uses various features. (Korean support is planned.)
Language: Python - Size: 301 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 0

msnmkh/Spoken-Digit-Recognition
Recognize English spoken digits using Hidden Markov Model
Language: Python - Size: 11.5 MB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 4 - Forks: 6

vshantam/Gender-classification-Artificial-Neural-Nets
This classifies the gender of the person speaking in a single audio file using Artificial Neural Networks
Language: Python - Size: 319 MB - Last synced at: 9 months ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 2

oowais/Muses
Audio Comparison system for comparing mp3/wav audio using mfcc, rhythm and other features
Language: Python - Size: 30 MB - Last synced at: 10 days ago - Pushed at: over 6 years ago - Stars: 4 - Forks: 4

wildanka/ASRBP
Speech Recognition experiment using MFCC Feature Extraction + Feed Forward Neural Network (training with Backpropagation)
Language: Java - Size: 104 KB - Last synced at: 7 months ago - Pushed at: almost 8 years ago - Stars: 4 - Forks: 2

certainlyWrong/mfcc_bee
Implementation of the feature extraction algorithm in Dart.
Language: Dart - Size: 332 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

Ratndeepk/emotion_detection_in_audio
Speech Emotion Recognition (SER): the project aims to recognize emotion through speech. Databases used: RAVDESS, TESS and SAVEE. Feature extractors used: MFCC, STFT and Chroma; with these three extractors the model is able to recognize emotion with high accuracy.
Language: Jupyter Notebook - Size: 4.77 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 2

brucewlee/LAMA-Music-Genre-Dataset
.wav files, training dataset (MFCC), and graph plots (FFTs, MFCCs, waveforms) from Latin America, Asia, the Middle East, and Africa
Language: Python - Size: 23.8 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0
