An open API service providing repository metadata for many open source software ecosystems.

Topic: "mel-spectrogram"

Sharad24/Neural-Voice-Cloning-with-Few-Samples 📦

Implementation of the "Neural Voice Cloning with Few Samples" research paper by Baidu

Language: Python - Size: 57.7 MB - Last synced at: 10 months ago - Pushed at: about 4 years ago - Stars: 252 - Forks: 55

tiberiu44/TTS-Cube

End-to-end speech synthesis with recurrent neural networks

Language: Python - Size: 753 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 226 - Forks: 45

BShakhovsky/PolyphonicPianoTranscription

Recurrent Neural Network for generating piano MIDI-files from audio (MP3, WAV, etc.)

Language: Jupyter Notebook - Size: 7.06 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 195 - Forks: 40

Data-Science-kosta/Speech-Emotion-Classification-with-PyTorch

This repository contains PyTorch implementations of four different models for classifying emotions in speech.

Language: Jupyter Notebook - Size: 6.79 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 152 - Forks: 35

spotify/realbook

Easier audio-based machine learning with TensorFlow.

Language: Python - Size: 83 KB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 120 - Forks: 7

CVxTz/audio_classification

CNN 1D vs 2D audio classification

Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 95 - Forks: 25

MycroftAI/sonopy

A simple audio feature extraction library

Language: Python - Size: 8.79 KB - Last synced at: about 22 hours ago - Pushed at: almost 6 years ago - Stars: 79 - Forks: 21

echocatzh/torch-mfcc

A librosa STFT/Fbank/MFCC feature extraction written in PyTorch using 1D convolutions.

Language: Python - Size: 39.1 KB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 74 - Forks: 11

zzw922cn/LPC_for_TTS

Linear Prediction Coefficients estimation from mel-spectrogram implemented in Python based on Levinson-Durbin algorithm.

Language: Python - Size: 652 KB - Last synced at: 17 days ago - Pushed at: about 4 years ago - Stars: 68 - Forks: 10

rednafi/urban-sound-classification

Urban sound source tagging from an aggregation of four-second noisy audio clips via 1D and 2D CNNs (Xception)

Language: Jupyter Notebook - Size: 36.9 MB - Last synced at: 19 days ago - Pushed at: about 2 years ago - Stars: 60 - Forks: 15

zafarrafii/Zaf-Python

Zafar's Audio Functions in Python for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT spectrogram, CQT chromagram, DCT, DST, MDCT, inverse MDCT.

Language: Jupyter Notebook - Size: 116 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 47 - Forks: 11

zafarrafii/Zaf-Matlab

Zafar's Audio Functions in Matlab for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT spectrogram, CQT chromagram, DCT, DST, MDCT, inverse MDCT.

Language: Jupyter Notebook - Size: 86 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 43 - Forks: 14

skanderhamdi/attention_cnn_lstm_covid_mel_spectrogram

Attention-based Hybrid CNN-LSTM and Spectral Data Augmentation for COVID-19 Diagnosis from Cough Sound

Language: Python - Size: 49.8 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 24 - Forks: 4

yoyolicoris/wavenet-like-vocoder

Basic wavenet and fftnet vocoder model.

Language: Python - Size: 46.9 KB - Last synced at: 5 days ago - Pushed at: about 3 years ago - Stars: 19 - Forks: 2

adasegroup/OSM-one-shot-multispeaker

Framework for one-shot multispeaker system based on Deep Learning

Language: Python - Size: 46 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 19 - Forks: 4

monetjoe/pianos

This study converts piano recordings to mel spectrograms and classifies them using fine-tuned SOTA pre-trained neural network backbones from computer vision. Comparative experiments show that SqueezeNet achieves the best classification accuracy, 92.37%.

Language: Python - Size: 292 KB - Last synced at: about 24 hours ago - Pushed at: 1 day ago - Stars: 16 - Forks: 0

baggepinnen/LPVSpectral.jl

Least-squares (sparse) spectral estimation and (sparse) LPV spectral decomposition.

Language: Julia - Size: 424 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 12 - Forks: 6

Friedrich-M/Audio-signal-classification-and-identification

Signal classification and identification based on mel spectrograms

Language: Python - Size: 12.5 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 11 - Forks: 0

KanikeSaiPrakash/Speech-Emotion-Recognition

Speech Emotion Recognition using Deep Learning

Language: Jupyter Notebook - Size: 2.03 MB - Last synced at: 8 months ago - Pushed at: almost 4 years ago - Stars: 11 - Forks: 2

ricardokleinklein/deepMultiSpeech

Deep Multi-Speech model

Language: Python - Size: 281 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 11 - Forks: 5

ddman1101/EDM-subgenre-classifier

Code for "Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Features" arXiv:2110.08862, 2021.

Language: Python - Size: 109 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 1

NeuroByte-Consulting/Speech-Emotion-Recognition-in-Tensorflow-Using-CNNs

Speech Emotion Recognition (SER) in Tensorflow using CNNs and CRNNs Based on Mel Spectrograms and Mel Frequency Cepstral Coefficients (MFCCs)

Language: Jupyter Notebook - Size: 15.6 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 9 - Forks: 0

Keerthiraj-Nagaraj/cough-detection-with-transfer-learning

Cough detection with Log Mel Spectrogram, Wavelet Transform, Deep learning and Transfer learning techniques

Language: Python - Size: 778 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 9 - Forks: 3

RBGTOP/Music-Genre-Recognition

Music genre classification using deep learning

Size: 1.95 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 8 - Forks: 0

goepfert/audio_features

Speech Recognition and Voice Activity Detection using a Convolutional Neural Network Architecture built with Tensorflow.js

Language: JavaScript - Size: 197 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 3

mikex86/SonopyJava

Java Implementation of the Sonopy Audio Feature Extraction Library by MycroftAI

Language: Java - Size: 78.1 KB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 8 - Forks: 1

cschen1205/cs-mel-spectrogram

Convert audio file to melgram (that is, mel-spectrogram) in .NET

Language: C# - Size: 73.7 MB - Last synced at: 16 days ago - Pushed at: almost 7 years ago - Stars: 7 - Forks: 3

zafarrafii/Zaf-Julia

Zafar's Audio Functions in Julia for audio signal analysis: STFT, inverse STFT, CQT kernel, CQT spectrogram, CQT chromagram, MFCC, DCT, DST, MDCT, inverse MDCT.

Language: Jupyter Notebook - Size: 60.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 1

8g6-new/c_spectrogram

A high-performance spectrogram with STFT, Mel, and MFCC support in pure C

Language: C - Size: 190 MB - Last synced at: about 5 hours ago - Pushed at: about 15 hours ago - Stars: 4 - Forks: 0

mariamkhmahran/gunshot-detection-system

This repository contains the Python code for an audio classification system designed to detect gunshots in urban settings.

Language: Jupyter Notebook - Size: 572 KB - Last synced at: 12 months ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 0

renesemela/masters-thesis-music-autotagging

Master's Thesis: Automatic Tagging of Musical Compositions Using Machine Learning Methods

Language: Python - Size: 545 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 3

Rumeysakeskin/dtw-compare-audio-files

Compute the MFCCs and measure (dis)similarity between two audio files using DTW

Language: Python - Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

ajayKumar99/Music-Genre-Classification

A TensorFlow implementation of a CNN-based music genre classifier that classifies an audio clip based on its Mel spectrogram, with a REST API for inference using TensorFlow Serving

Language: Jupyter Notebook - Size: 5.94 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 4 - Forks: 0

andyharless/paces

Music Pace Compatibility Project

Language: Jupyter Notebook - Size: 35.7 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

filipmu/Kaggle-freesound-audio-tagging-2019

My best submission to this Kaggle contest

Language: Jupyter Notebook - Size: 1.43 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 2

riccardomalpiedi/musicgenreclassification

Music genre classification with CNN (exam project)

Language: Jupyter Notebook - Size: 1.19 GB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

DanielMPMatCom/Identifying-Colines.-JCE-MatCom

Research on the structure of the song choruses of the frog species Eleutherodactylus eileenae. Obtaining song sequences.

Language: Jupyter Notebook - Size: 23.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

SimpleKidd/Fault-Diagnosis-of-a-Rotor-Bearing-System-using-ML

Analyzing Vibrational Data of the System using Machine Learning

Language: Jupyter Notebook - Size: 7.69 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

to-schi/ASR-Deepspeech2-Tensorflow

An end-to-end speech recognition engine similar to DeepSpeech2

Language: Jupyter Notebook - Size: 2.19 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

poludmik/Music-Genre-Classification

A pet project on music genre classification. Assigning the correct genre to the provided audio track.

Language: Python - Size: 5.41 MB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

sh3r4zhassan/Sound-Prediction-and-Cancellation-Model

This model analyzes and predicts the input sound, then cancels it using pretrained ANC systems.

Language: Jupyter Notebook - Size: 409 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 2

bayuwira/Kendang-Tunggal-Classification-Using-Backpropagation-and-Onset-Detection

Bali has a diversity of arts recognized around the world, and one of the most famous is Karawitan, especially the Kendang Tunggal instrument. Notation documentation, more commonly known as music transcription, can make learning a song easier; in this research, it makes it easier to learn to play the Kendang Tunggal. The first step in documenting a Kendang Tunggal song is onset detection. An onset is the point at which the signal enters its attack period, which helps segment the timbre (sound color) of the drum. The segmented Kendang Tunggal timbres are classified with the Backpropagation algorithm, using several frequency-domain and time-domain features as characteristics of the timbre. The Kendang Tunggal song is then resynthesized as a synthetic sound with the Mel Log Spectrum Approximation filter. Based on the research, the optimal parameter for segmenting drum timbre with onset detection is a hop size of 110 with normalization of the features in the onset detection function. The optimal backpropagation architecture, with a learning rate of 0.9, 10 neurons, and 2000 epochs, produces an accuracy of 60.85%. The synthesis method using the Mel Log Spectrum Approximation can produce synthetic sounds similar to Kendang songs with an accuracy of 83.33%.

Language: Python - Size: 678 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 0

Marini97/Audio-CNN

Project to classify wav audio files using a CNN.

Language: Jupyter Notebook - Size: 1.76 GB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Saurabh620/Voice-Signal-Processing-using-Python-GUI

In this project, we processed the TESS voice dataset and performed emotion prediction.

Language: Jupyter Notebook - Size: 2.99 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

reece-iriye/Saxophone-Hero

Step onto the stage with Saxophone Hero, where your tenor saxophone is the key to unlocking a rhythmic adventure through a world of sheet music. In this game, your character scores points by hitting the right notes. Powered by machine learning, the game captures the pitch from your saxophone and translates it to player movement in real time.

Language: Jupyter Notebook - Size: 70.7 MB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

mtxslv/autoencoder-chiptune

Embed chiptunes in 2D with Convolutional Auto Encoder and Mel Spectrograms

Language: Jupyter Notebook - Size: 1.07 GB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

anirudhs123/Music-Instrument-Classification

In this project we use a Lightweight-CNN based model to classify instruments from the Freesound audio data set. We make use of Mel-Spectrogram features from the input audio data as the input to the CNN model. To add robustness to the model, we use a novel data augmentation technique based on the Cut-Mix algorithm.

Language: Jupyter Notebook - Size: 2.47 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

chandukasturi/Music-Genre-Classification

Implemented multiple classification models in Python on the GTZAN dataset while leading a team of three people. Compared the performance of K-NN, SVM, and CNN models and logged their prediction accuracies. The Convolutional Neural Network model stood out with the highest prediction accuracy, 82%.

Language: Jupyter Notebook - Size: 570 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

COINS-SS21/moody-ser

Speech emotion recognition models for the Moody web application.

Language: Jupyter Notebook - Size: 144 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

dangpzanco/dcase-task1

Acoustic Scene Classification System (DCASE2018 Task 1)

Language: Python - Size: 26.4 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 3

Nickaine1/Music-Genre-Recognition

Music genre classification using deep learning

Size: 3.91 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

idaishe/Music-Genre-Recognition

Music genre classification using deep learning

Size: 2.93 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

LHPT2009/Music-Genre-Recognition

Music genre classification using deep learning

Size: 0 Bytes - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 0 - Forks: 0

dxspeeder/Music-Genre-Recognition

Music genre classification using deep learning

Size: 5.86 KB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

DevinWSoTuff/Music-Genre-Recognition

Music genre classification using deep learning

Size: 5.86 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0

pavlosdais/Music-Genre-Recognition

Music genre classification using deep learning

Language: Jupyter Notebook - Size: 1.98 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

mradovic38/dtw-speech-recognition

Speech recognition system that uses feature extraction and dynamic time warping (DTW) to identify words and to find the most similar speaker.

Language: Python - Size: 29.3 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

NajdBinrabah/Deep-Learning-with-TensorFlow-and-Keras

This project explores emotion recognition in audio data, focusing on feature extraction techniques while also comparing the performance of LSTM and 1D CNN models.

Language: Jupyter Notebook - Size: 855 KB - Last synced at: 14 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

neurlang/gomel

Golang Mel Spectrogram and Spectrogram inversion

Language: Go - Size: 43.9 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

UofTNeurology/masa-open-source

Open Source Repository for the MASA Project

Language: Jupyter Notebook - Size: 758 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

aaolcay/Some-Signal-Processing-Stuff

Different Signal Processing Tasks

Language: Jupyter Notebook - Size: 2.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Balajirvp/Dynamic-Time-Warping

Leveraged Dynamic Time Warping (DTW) to assess the similarity between specific audio tracks

Language: Jupyter Notebook - Size: 29.1 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

BScUniversityCollaborations/automatic-speech-recognition

Created an ASR (Automatic Speech Recognition) system that takes in individual recordings. Each recording represents a sentence composed of 5-10 English language digits, separated by adequate pauses. The system involves segmenting the sentence using a classifier, differentiating between background and foreground sounds.

Language: Python - Size: 8.3 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

imane-ayouni/Emotional-classification-through-voice-using-Backpropagation

Simple neural net to classify the emotion in an audio

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Themiscodes/Music-Classification-Melgrams

Music genre recognition with Convolutional Neural Networks (CNN) using Mel Spectrograms

Language: Jupyter Notebook - Size: 450 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

papari1123/MAIC_VOICE_AI_Challenge_2021

MAIC VOICE AI competition. Diagnosis and classification of voice disorders using voice mel-spectrum data.

Size: 3.4 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Karthick47v2/mock-buddy-audio-server

audio processing service for mock-buddy

Language: PureBasic - Size: 477 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

dvd125/Classification-of-musical-genres-and-music-retrival

Language: Jupyter Notebook - Size: 8.55 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

nouranHisham/speech_emotion_recognition

This project was for a pattern recognition course I took in college. It was my first work with neural networks: two CNN models were built, a 1D model and a 2D model, to handle the audio and image forms of the data, respectively.

Language: Jupyter Notebook - Size: 3.7 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

shinshoji01/AM_with_GAN_for_melspectrogram

This repository introduces the application of Activation Maximization to audio-domain data.

Language: Jupyter Notebook - Size: 25.3 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

purang2/MUSIC-TO-SENTIMENT

2021-1 Brain and Cognitive Engineering Term Project [👀🤜ing~ 06/24]

Language: Jupyter Notebook - Size: 1.23 GB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

hcy71o/Speech_Preprocessing

Overall process of speech signal processing (Mel-spectrogram & MFCCs) and loading data using Pytorch dataloader

Language: Jupyter Notebook - Size: 3.53 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

vimlord/torchrosa-tts

A text-to-speech program using VAE on Mel spectrograms of phonemes.

Language: Python - Size: 877 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

OldBonhart/dsp_for_dl

Introduction to Digital Signal Processing for Machine Learning

Language: Jupyter Notebook - Size: 24.4 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

Related Topics
machine-learning (21), audio-processing (21), mfcc (18), deep-learning (18), python (12), librosa (11), cnn (11), audio-classification (11), audio-analysis (10), pattern-recognition (8), genre-classification (8), tensorflow (8), pytorch (7), music-analysis (7), keras (7), music-genre-recognition (7), convolutional-neural-networks (6), mel-frequency-cepstral-coefficients (6), speech-emotion-recognition (6), stft (5), speech-recognition (4), short-time-fourier-transform (4), spectrogram (4), speech (4), cnn-keras (4), classification (4), discrete-cosine-transform (4), dct (4), cqt-spectrogram (4), cnn-classification (4), constant-q-transform (4), signal-processing (4), deep-neural-networks (4), audio (4), tts (3), modified-discrete-cosine-transform (3), mel-filterbank (3), mdct (3), dst (3), discrete-sine-transform (3), cqt-kernel (3), chromagram (3), audio-signal-processing (3), neural-network (3), dtw (3), vocoder (3), music (3), fourier-transform (3), jupyter-notebook (2), sound-classification (2), feature-extraction (2), keras-tensorflow (2), lstm (2), lpc (2), long-short-term-memory (2), text-to-speech (2), data-science (2), wavernn (2), numpy (2), resnet (2), emotion-detection (2), data-augmentation (2), melgram (2), sound (2), music-information-retrieval (2), recurrent-neural-network (2), synthesis (2), inverse-mdct (2), inverse-stft (2), convolutional-neural-network (2), speech-synthesis (2), speech-processing (2), voice-cloning (2), dynamic-programming (2), spectrograms (2), dynamic-time-warping (2), mfcc-features (2), tempogram (2), music-genre-classification (2), fastfouriertransform (2), wavenet (2), gtzan-dataset (1), k-nearest-neighbours (1), emotion-recognition (1), voice-recognition (1), python3 (1), rnn (1), voice (1), class-activation-maps (1), cutmix-augmentation (1), hyperparameter-tuning (1), pruning (1), encodings (1), speaker-embeddings (1), speaker-encodings (1), euclidean-distances (1), moviepy (1), audio-recognition (1), backpropagation (1), decibels (1)