Topic: "mel-spectrogram"
Sharad24/Neural-Voice-Cloning-with-Few-Samples 📦
Implementation of Neural Voice Cloning with Few Samples Research Paper by Baidu
Language: Python - Size: 57.7 MB - Last synced at: 10 months ago - Pushed at: about 4 years ago - Stars: 252 - Forks: 55

tiberiu44/TTS-Cube
End-2-end speech synthesis with recurrent neural networks
Language: Python - Size: 753 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 226 - Forks: 45

BShakhovsky/PolyphonicPianoTranscription
Recurrent Neural Network for generating piano MIDI-files from audio (MP3, WAV, etc.)
Language: Jupyter Notebook - Size: 7.06 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 195 - Forks: 40

Data-Science-kosta/Speech-Emotion-Classification-with-PyTorch
This repository contains PyTorch implementations of four different models for classifying emotions in speech.
Language: Jupyter Notebook - Size: 6.79 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 152 - Forks: 35

spotify/realbook
Easier audio-based machine learning with TensorFlow.
Language: Python - Size: 83 KB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 120 - Forks: 7

CVxTz/audio_classification
CNN 1D vs 2D audio classification
Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 95 - Forks: 25

MycroftAI/sonopy
A simple audio feature extraction library
Language: Python - Size: 8.79 KB - Last synced at: about 22 hours ago - Pushed at: almost 6 years ago - Stars: 79 - Forks: 21

echocatzh/torch-mfcc
A librosa STFT/Fbank/MFCC feature extraction written in PyTorch using 1D convolutions.
Language: Python - Size: 39.1 KB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 74 - Forks: 11

zzw922cn/LPC_for_TTS
Linear Prediction Coefficients estimation from mel-spectrogram implemented in Python based on Levinson-Durbin algorithm.
Language: Python - Size: 652 KB - Last synced at: 17 days ago - Pushed at: about 4 years ago - Stars: 68 - Forks: 10

rednafi/urban-sound-classification
Urban sound source tagging from an aggregation of four-second noisy audio clips via 1D and 2D CNN (Xception)
Language: Jupyter Notebook - Size: 36.9 MB - Last synced at: 19 days ago - Pushed at: about 2 years ago - Stars: 60 - Forks: 15

zafarrafii/Zaf-Python
Zafar's Audio Functions in Python for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT spectrogram, CQT chromagram, DCT, DST, MDCT, inverse MDCT.
Language: Jupyter Notebook - Size: 116 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 47 - Forks: 11

zafarrafii/Zaf-Matlab
Zafar's Audio Functions in Matlab for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT spectrogram, CQT chromagram, DCT, DST, MDCT, inverse MDCT.
Language: Jupyter Notebook - Size: 86 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 43 - Forks: 14

skanderhamdi/attention_cnn_lstm_covid_mel_spectrogram
Attention-based Hybrid CNN-LSTM and Spectral Data Augmentation for COVID-19 Diagnosis from Cough Sound
Language: Python - Size: 49.8 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 24 - Forks: 4

yoyolicoris/wavenet-like-vocoder
Basic wavenet and fftnet vocoder model.
Language: Python - Size: 46.9 KB - Last synced at: 5 days ago - Pushed at: about 3 years ago - Stars: 19 - Forks: 2

adasegroup/OSM-one-shot-multispeaker
Framework for one-shot multispeaker system based on Deep Learning
Language: Python - Size: 46 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 19 - Forks: 4

monetjoe/pianos
This study converts piano recordings to mel spectrograms and classifies them with fine-tuned SOTA pre-trained neural network backbones from computer vision. Comparative experiments show that SqueezeNet achieves the best classification accuracy, 92.37%.
Language: Python - Size: 292 KB - Last synced at: about 24 hours ago - Pushed at: 1 day ago - Stars: 16 - Forks: 0

baggepinnen/LPVSpectral.jl
Least-squares (sparse) spectral estimation and (sparse) LPV spectral decomposition.
Language: Julia - Size: 424 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 12 - Forks: 6

Friedrich-M/Audio-signal-classification-and-identification
Signal classification and identification based on mel spectrograms
Language: Python - Size: 12.5 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 11 - Forks: 0

KanikeSaiPrakash/Speech-Emotion-Recognition
Speech Emotion Recognition using Deep Learning
Language: Jupyter Notebook - Size: 2.03 MB - Last synced at: 8 months ago - Pushed at: almost 4 years ago - Stars: 11 - Forks: 2

ricardokleinklein/deepMultiSpeech
Deep Multi-Speech model
Language: Python - Size: 281 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 11 - Forks: 5

ddman1101/EDM-subgenre-classifier
Code for "Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Features" arXiv:2110.08862, 2021.
Language: Python - Size: 109 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 1

NeuroByte-Consulting/Speech-Emotion-Recognition-in-Tensorflow-Using-CNNs
Speech Emotion Recognition (SER) in TensorFlow using CNNs and CRNNs based on mel spectrograms and Mel Frequency Cepstral Coefficients (MFCCs)
Language: Jupyter Notebook - Size: 15.6 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 9 - Forks: 0

Keerthiraj-Nagaraj/cough-detection-with-transfer-learning
Cough detection with Log Mel Spectrogram, Wavelet Transform, Deep learning and Transfer learning techniques
Language: Python - Size: 778 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 9 - Forks: 3

RBGTOP/Music-Genre-Recognition
Music genre classification using deep learning
Size: 1.95 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 8 - Forks: 0

goepfert/audio_features
Speech Recognition and Voice Activity Detection using a Convolutional Neural Network Architecture built with Tensorflow.js
Language: JavaScript - Size: 197 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 3

mikex86/SonopyJava
Java Implementation of the Sonopy Audio Feature Extraction Library by MycroftAI
Language: Java - Size: 78.1 KB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 8 - Forks: 1

cschen1205/cs-mel-spectrogram
Convert audio file to melgram (that is, mel-spectrogram) in .NET
Language: C# - Size: 73.7 MB - Last synced at: 16 days ago - Pushed at: almost 7 years ago - Stars: 7 - Forks: 3

zafarrafii/Zaf-Julia
Zafar's Audio Functions in Julia for audio signal analysis: STFT, inverse STFT, CQT kernel, CQT spectrogram, CQT chromagram, MFCC, DCT, DST, MDCT, inverse MDCT.
Language: Jupyter Notebook - Size: 60.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 1

8g6-new/c_spectrogram
A high performance spectrogram with STFT Mel and MFCC support in pure C
Language: C - Size: 190 MB - Last synced at: about 5 hours ago - Pushed at: about 15 hours ago - Stars: 4 - Forks: 0

mariamkhmahran/gunshot-detection-system
This repository contains the Python code for an audio classification system designed to detect gunshots in urban settings.
Language: Jupyter Notebook - Size: 572 KB - Last synced at: 12 months ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 0

renesemela/masters-thesis-music-autotagging
Master's Thesis: Automatic Tagging of Musical Compositions Using Machine Learning Methods
Language: Python - Size: 545 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 3

Rumeysakeskin/dtw-compare-audio-files
Compute the MFCCs and measure (dis)similarity between two audio files using DTW
Language: Python - Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

ajayKumar99/Music-Genre-Classification
A TensorFlow application of a CNN-based music genre classifier that classifies an audio clip based on its mel spectrogram, with a REST API for inference using TensorFlow Serving
Language: Jupyter Notebook - Size: 5.94 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 4 - Forks: 0

andyharless/paces
Music Pace Compatibility Project
Language: Jupyter Notebook - Size: 35.7 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

filipmu/Kaggle-freesound-audio-tagging-2019
My best submission to this Kaggle contest
Language: Jupyter Notebook - Size: 1.43 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 2

riccardomalpiedi/musicgenreclassification
Music genre classification with CNN (exam project)
Language: Jupyter Notebook - Size: 1.19 GB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

DanielMPMatCom/Identifying-Colines.-JCE-MatCom
Research on the structure of the song choruses of the frog species Eleutherodactylus eileenae. Obtaining song sequences.
Language: Jupyter Notebook - Size: 23.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

SimpleKidd/Fault-Diagnosis-of-a-Rotor-Bearing-System-using-ML
Analyzing Vibrational Data of the System using Machine Learning
Language: Jupyter Notebook - Size: 7.69 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

to-schi/ASR-Deepspeech2-Tensorflow
An end-to-end speech recognition engine similar to DeepSpeech2
Language: Jupyter Notebook - Size: 2.19 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

poludmik/Music-Genre-Classification
A pet project on music genre classification. Assigning the correct genre to the provided audio track.
Language: Python - Size: 5.41 MB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

sh3r4zhassan/Sound-Prediction-and-Cancellation-Model
This model analyzes and predicts the input sound, then cancels it using pretrained ANC systems.
Language: Jupyter Notebook - Size: 409 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 2

bayuwira/Kendang-Tunggal-Classification-Using-Backpropagation-and-Onset-Detection
Bali has a diversity of arts recognized around the world, and one of the most famous is Karawitan, especially the Kendang Tunggal instrument. Notation documentation, more commonly known as music transcription, can make learning a song easier; in this research, it makes it easier to learn to play the Kendang Tunggal. The first step in documenting a Kendang Tunggal song is onset detection. An onset is the moment the signal enters its attack period, which helps segment the sound colors of the drum. The segmented sound colors are classified with the Backpropagation algorithm, using several frequency-domain and time-domain features to characterize each sound color. The song is then resynthesized with the Mel Log Spectrum Approximation filter. In the experiments, the optimal parameter for segmenting drum sound colors with onset detection is a hop size of 110 with normalization of the features in the onset detection function. The optimal backpropagation architecture, with a learning rate of 0.9, 10 neurons, and 2000 epochs, produces an accuracy of 60.85%. Synthesis with the Mel Log Spectrum Approximation filter can produce synthetic sounds similar to kendang songs with an accuracy of 83.33%.
Language: Python - Size: 678 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 0

Marini97/Audio-CNN
Project to classify wav audio files using a CNN.
Language: Jupyter Notebook - Size: 1.76 GB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Saurabh620/Voice-Signal-Processing-using-Python-GUI
In this project we processed the TESS voice dataset and performed emotion prediction.
Language: Jupyter Notebook - Size: 2.99 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

reece-iriye/Saxophone-Hero
Step onto the stage with Saxophone Hero, where your tenor saxophone is the key to unlocking a rhythmic adventure through a world of sheet music. In this game, your character scores points by hitting the right notes. Powered by machine learning, the game captures the pitch from your saxophone and translates it to player movement in real time.
Language: Jupyter Notebook - Size: 70.7 MB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

mtxslv/autoencoder-chiptune
Embed chiptunes in 2D with Convolutional Auto Encoder and Mel Spectrograms
Language: Jupyter Notebook - Size: 1.07 GB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

anirudhs123/Music-Instrument-Classification
In this project we use a Lightweight-CNN based model to classify instruments from the Freesound audio data set. We make use of Mel-Spectrogram features from the input audio data as the input to the CNN model. To add robustness to the model, we use a novel data augmentation technique based on the Cut-Mix algorithm.
Language: Jupyter Notebook - Size: 2.47 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

chandukasturi/Music-Genre-Classification
Implemented multiple classification models in Python on the GTZAN dataset, leading a team of three people. Compared the performance of K-NN, SVM, and CNN models and logged their prediction accuracies. The convolutional neural network model stood out with the highest prediction accuracy, 82%.
Language: Jupyter Notebook - Size: 570 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

COINS-SS21/moody-ser
Speech emotion recognition models for the Moody web application.
Language: Jupyter Notebook - Size: 144 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

dangpzanco/dcase-task1
Acoustic Scene Classification System (DCASE2018 Task 1)
Language: Python - Size: 26.4 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 3

Nickaine1/Music-Genre-Recognition
Music genre classification using deep learning
Size: 3.91 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

idaishe/Music-Genre-Recognition
Music genre classification using deep learning
Size: 2.93 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

LHPT2009/Music-Genre-Recognition
Music genre classification using deep learning
Size: 0 Bytes - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 0 - Forks: 0

dxspeeder/Music-Genre-Recognition
Music genre classification using deep learning
Size: 5.86 KB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

DevinWSoTuff/Music-Genre-Recognition
Music genre classification using deep learning
Size: 5.86 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0

pavlosdais/Music-Genre-Recognition
Music genre classification using deep learning
Language: Jupyter Notebook - Size: 1.98 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

mradovic38/dtw-speech-recognition
Speech recognition system that uses feature extraction and dynamic time warping (DTW) to identify words and to find the most similar speaker.
Language: Python - Size: 29.3 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

NajdBinrabah/Deep-Learning-with-TensorFlow-and-Keras
This project explores emotion recognition in audio data, focusing on feature extraction techniques while also comparing the performance of LSTM and 1D CNN models.
Language: Jupyter Notebook - Size: 855 KB - Last synced at: 14 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

neurlang/gomel
Golang Mel Spectrogram and Spectrogram inversion
Language: Go - Size: 43.9 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

UofTNeurology/masa-open-source
Open Source Repository for the MASA Project
Language: Jupyter Notebook - Size: 758 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

aaolcay/Some-Signal-Processing-Stuff
Different Signal Processing Tasks
Language: Jupyter Notebook - Size: 2.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Balajirvp/Dynamic-Time-Warping
Leveraged Dynamic Time Warping (DTW) to assess the similarity between specific audio tracks
Language: Jupyter Notebook - Size: 29.1 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

BScUniversityCollaborations/automatic-speech-recognition
Created an ASR (Automatic Speech Recognition) system that takes in individual recordings. Each recording represents a sentence composed of 5-10 English language digits, separated by adequate pauses. The system involves segmenting the sentence using a classifier, differentiating between background and foreground sounds.
Language: Python - Size: 8.3 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

imane-ayouni/Emotional-classification-through-voice-using-Backpropagation
Simple neural net to classify the emotion in an audio clip
Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Themiscodes/Music-Classification-Melgrams
Music genre recognition with Convolutional Neural Networks (CNN) using Mel Spectrograms
Language: Jupyter Notebook - Size: 450 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

papari1123/MAIC_VOICE_AI_Challenge_2021
MAIC VOICE AI competition. Diagnosis and classification of voice disorders using voice mel-spectrum data.
Size: 3.4 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Karthick47v2/mock-buddy-audio-server
Audio processing service for mock-buddy
Language: PureBasic - Size: 477 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

dvd125/Classification-of-musical-genres-and-music-retrival
Language: Jupyter Notebook - Size: 8.55 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

nouranHisham/speech_emotion_recognition
This project was for the pattern recognition course I took in college and was my first work with neural networks. Two CNN models were built, a 1D model and a 2D model, to handle the data as audio and as images, respectively.
Language: Jupyter Notebook - Size: 3.7 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

shinshoji01/AM_with_GAN_for_melspectrogram
This repository introduces the application of Activation Maximization to audio-domain data.
Language: Jupyter Notebook - Size: 25.3 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

purang2/MUSIC-TO-SENTIMENT
2021-1 Brain and Cognitive Engineering Term Project [👀🤜ing~ 06/24]
Language: Jupyter Notebook - Size: 1.23 GB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

hcy71o/Speech_Preprocessing
Overall process of speech signal processing (mel-spectrogram & MFCCs) and loading data using the PyTorch DataLoader
Language: Jupyter Notebook - Size: 3.53 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

vimlord/torchrosa-tts
A text-to-speech program using VAE on Mel spectrograms of phonemes.
Language: Python - Size: 877 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

OldBonhart/dsp_for_dl
Introduction to Digital Signal Processing for Machine Learning
Language: Jupyter Notebook - Size: 24.4 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0
