Topic: "audio-classification"

microsoft/Semi-supervised-learning

A Unified Semi-Supervised Learning Codebase (NeurIPS'22)

Language: Python - Size: 1.64 MB - Last synced at: 2 days ago - Pushed at: 25 days ago - Stars: 1,475 - Forks: 196

YuanGongND/ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Jupyter Notebook - Size: 2.35 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 872 - Forks: 165

aqibsaeed/Urban-Sound-Classification

Urban sound classification using Deep Learning

Language: Jupyter Notebook - Size: 9.97 MB - Last synced at: 14 days ago - Pushed at: over 2 years ago - Stars: 517 - Forks: 244

towhee-io/examples

Analyze unstructured data with Towhee, with examples such as reverse image search, reverse video search, audio classification, question answering systems, and molecular search.

Language: Jupyter Notebook - Size: 289 MB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 491 - Forks: 118

seth814/Audio-Classification

Code for YouTube series: Deep Learning for Audio Classification

Language: Jupyter Notebook - Size: 97.7 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 461 - Forks: 172

HumanSignal/label-studio-frontend 📦

Data labeling react app that is backend agnostic and can be embedded into your applications — distributed as an NPM package

Language: JavaScript - Size: 102 MB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 431 - Forks: 321

RetroCirce/HTS-Audio-Transformer

The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"

Language: Python - Size: 896 KB - Last synced at: 14 days ago - Pushed at: 10 months ago - Stars: 410 - Forks: 68

ksanjeevan/crnn-audio-classification

UrbanSound classification using Convolutional Recurrent Networks in PyTorch

Language: Python - Size: 3.48 MB - Last synced at: 16 days ago - Pushed at: almost 4 years ago - Stars: 390 - Forks: 80

YuanGongND/whisper-at

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"

Language: Python - Size: 20.5 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 376 - Forks: 30

phurwicz/hover

🚤 Label data at scale. Fun and precision included.

Language: Python - Size: 294 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 327 - Forks: 19

YuanGongND/ssast

Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".

Language: Python - Size: 11.7 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 309 - Forks: 47

kkoutini/PaSST

Efficient Training of Audio Transformers with Patchout

Language: Python - Size: 610 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 278 - Forks: 47

drscotthawley/panotti

A multi-channel neural network audio classifier using Keras

Language: Python - Size: 1.39 MB - Last synced at: 17 days ago - Pushed at: almost 4 years ago - Stars: 269 - Forks: 69

IBM/MAX-Audio-Classifier

Identify sounds in short audio clips

Language: Python - Size: 38.2 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 154 - Forks: 53

cwx-worst-one/EAT

[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer

Language: Python - Size: 6.51 MB - Last synced at: 13 days ago - Pushed at: about 1 month ago - Stars: 153 - Forks: 8

micah5/pyAudioClassification

🎶 dead simple audio classification

Language: Python - Size: 19.5 MB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 134 - Forks: 22

luuil/Tensorflow-Audio-Classification

Audio classification with VGGish as feature extractor in TensorFlow

Language: Python - Size: 7.46 MB - Last synced at: 6 months ago - Pushed at: over 3 years ago - Stars: 127 - Forks: 29

SiavashShams/ssamba

[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model

Language: Python - Size: 1.88 MB - Last synced at: 20 days ago - Pushed at: 8 months ago - Stars: 118 - Forks: 9

YuanGongND/psla

Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".

Language: Python - Size: 7.09 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 118 - Forks: 14

sainathadapa/kaggle-freesound-audio-tagging 📦

8th place solution (on Kaggle) to the Freesound General-Purpose Audio Tagging Challenge (DCASE 2018 - Task 2)

Language: Python - Size: 31.3 KB - Last synced at: 5 days ago - Pushed at: over 4 years ago - Stars: 114 - Forks: 25

kaistmm/Audio-Mamba-AuM

Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning"

Language: Python - Size: 10.7 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 108 - Forks: 13

jonnor/ESC-CNN-microcontroller

Environmental Sound Classification on Microcontrollers using Convolutional Neural Networks

Language: Jupyter Notebook - Size: 32.5 MB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 102 - Forks: 20

CVxTz/audio_classification

CNN 1D vs 2D audio classification

Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 95 - Forks: 25

yeyupiaoling/AudioClassification-PaddlePaddle

Audio classification implemented with PaddlePaddle, supporting models such as EcapaTdnn, PANNS, TDNN, Res2Net, and ResNetSE, along with multiple preprocessing methods

Language: Python - Size: 541 KB - Last synced at: 12 days ago - Pushed at: about 1 month ago - Stars: 94 - Forks: 16

ashishpatel26/Best-Audio-Classification-Resources-with-Deep-learning

List of articles related to deep learning applied to music

Language: TeX - Size: 5.2 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 94 - Forks: 11

y-kawagu/dcase2020_task2_baseline

DCASE2020 Challenge Task 2 baseline system

Language: Python - Size: 45.9 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 93 - Forks: 46

Audio-WestlakeU/audiossl

A library built for easier audio self-supervised training and downstream task evaluation

Language: Python - Size: 13.1 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 86 - Forks: 9

JohannesBuchner/spoken-command-recognition

A large, free audio sample database (10M words pronounced), a test bed for voice activity detection algorithms and for single-syllable word recognition

Language: Python - Size: 63.5 KB - Last synced at: about 2 months ago - Pushed at: over 7 years ago - Stars: 69 - Forks: 31

ArmDeveloperEcosystem/ml-audio-classifier-example-for-pico

ML Audio Classifier Example for Pico 🔊🔥🔔

Language: Jupyter Notebook - Size: 42 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 67 - Forks: 24

sarthak268/Audio_Classification_using_LSTM

Classification of Urban Sound Audio Dataset using LSTM-based model.

Language: Python - Size: 1.61 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 67 - Forks: 18

Westlake-AI/SemiReward

[ICLR 2024] SemiReward: A General Reward Model for Semi-supervised Learning

Language: Python - Size: 1.13 MB - Last synced at: 24 days ago - Pushed at: 12 months ago - Stars: 66 - Forks: 2

Kardbord/hfapigo

Unofficial (Golang) Go bindings for the Hugging Face Inference API

Language: Go - Size: 3.35 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 62 - Forks: 5

YuanGongND/vocalsound

Dataset and baseline code for the VocalSound dataset (ICASSP2022).

Language: Jupyter Notebook - Size: 10.5 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 62 - Forks: 8

Caldarie/flutter_tflite_audio

Audio classification Tflite package for flutter (iOS & Android). Can support Google Teachable Machine models

Language: Java - Size: 12.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 61 - Forks: 23

CouncilDataProject/speakerbox

Speakerbox: Fine-tune Audio Transformers for speaker identification.

Language: Python - Size: 17.7 MB - Last synced at: 1 day ago - Pushed at: 6 months ago - Stars: 56 - Forks: 6

chen0040/mxnet-audio

Implementation of music genre classification, audio-to-vec, song recommender, and music search in mxnet

Language: Python - Size: 15.7 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 55 - Forks: 15

vishalshar/Audio-Classification-using-CNN-MLP

Multi class audio classification using Deep Learning (MLP, CNN): The objective of this project is to build a multi class classifier to identify sound of a bee, cricket or noise.

Language: Python - Size: 2.46 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 54 - Forks: 17

melihbodur/Text_and_Audio_classification_with_Bert

Text Classification in Turkish Texts with Bert

Language: Jupyter Notebook - Size: 412 KB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 43 - Forks: 5

MTG/DCASE-models

Python library for rapid prototyping of environmental sound analysis systems

Language: Jupyter Notebook - Size: 133 MB - Last synced at: 20 days ago - Pushed at: about 3 years ago - Stars: 42 - Forks: 5

pooya-mohammadi/audio-classification-pytorch

This project provides several approaches for training and fine-tuning an audio gender recognition model. The code can be reused for any other audio classification task by changing the number of classes and the input dataset.

Language: Jupyter Notebook - Size: 871 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 41 - Forks: 4

channelCS/Audio-Vision

Implementation and reviews of Audio & Computer vision related papers in python using keras and tensorflow.

Language: Python - Size: 1.81 MB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 40 - Forks: 12

mdfirman/CityNet

A neural network classifier for urban soundscapes

Language: Jupyter Notebook - Size: 54.2 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 30 - Forks: 5

sainathadapa/dcase2019-task5-urban-sound-tagging 📦

1st place solution to the DCASE 2019 - Task 5 - Urban Sound Tagging

Language: Python - Size: 2.98 MB - Last synced at: 5 days ago - Pushed at: about 4 years ago - Stars: 30 - Forks: 7

YuanGongND/uavm

Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".

Language: Python - Size: 3.28 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 29 - Forks: 0

DunnBC22/Vision_Audio_and_Multimodal_Projects

This repository includes all computer vision, audio, document AI, and multimodal projects.

Language: Jupyter Notebook - Size: 108 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 28 - Forks: 5

nextco/audio-classification

Audio Classification - Multilayer Neural Networks using TensorFlow

Language: Jupyter Notebook - Size: 30.4 MB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 28 - Forks: 4

otonomee/streamstem

Implements ML audio separation algorithm on audio from YouTube or Spotify resulting in "stems" for download (e.g. vocals, drums, bass) in MP3, WAV or FLAC.

Language: Python - Size: 186 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 27 - Forks: 3

y-kawagu/dcase2021_task2_baseline_ae

Autoencoder-based baseline system for DCASE2021 Challenge Task 2.

Language: Python - Size: 43.9 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 26 - Forks: 9

UmarIgan/Machine-Learning

A set of jupyter notebooks

Language: Jupyter Notebook - Size: 16.7 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 23 - Forks: 8

wikke/AudioRecognition

Google Speech Command Dataset Classification Neural Network, CNN, RNN

Language: Jupyter Notebook - Size: 27.3 KB - Last synced at: over 2 years ago - Pushed at: almost 8 years ago - Stars: 23 - Forks: 12

emuell/AFEC

Cross platform audio feature extraction and sound classification tool

Language: C++ - Size: 128 MB - Last synced at: 5 days ago - Pushed at: 12 months ago - Stars: 22 - Forks: 4

zhihanyang2022/gender-audio-classification

A speaker gender classifier. MFC feature engineering and a pre-trained ResNet-50. GradCAM interpretation.

Language: Jupyter Notebook - Size: 489 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 22 - Forks: 5

asif-hanif/palm

[EMNLP 2024] Official code repository for the paper "PALM: Few-Shot Prompt Learning for Audio Language Models", accepted at the EMNLP 2024 conference.

Language: Python - Size: 17.8 MB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 21 - Forks: 0

y-kawagu/dcase2021_task2_baseline_mobile_net_v2

MobileNetV2-based baseline system for DCASE2021 Challenge Task 2.

Language: Python - Size: 49.8 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 19 - Forks: 8

harmanpreet93/audio_classification

Multi-class audio classification with MFCC features using CNN

Language: Jupyter Notebook - Size: 11.8 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 19 - Forks: 4

anujdutt9/Audio-Scene-Classification

Scene Classification using Audio in the nearby Environment.

Language: Jupyter Notebook - Size: 129 MB - Last synced at: 2 days ago - Pushed at: almost 6 years ago - Stars: 19 - Forks: 3

MaxiDonkey/DelphiHuggingFace

The Hugging Face API wrapper for Delphi leverages cutting-edge models to deliver powerful features, including object detection, music generation, text classification, sentiment analysis, image segmentation, speech-to-text transcription, and text generation.

Language: Pascal - Size: 666 KB - Last synced at: 18 days ago - Pushed at: about 1 month ago - Stars: 18 - Forks: 4

johnmartinsson/differentiable-mel-spectrogram

The official implementation of DMEL, the method presented in the paper "DMEL: The differentiable log-Mel spectrogram as a trainable layer in neural networks".

Language: Python - Size: 2.1 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 18 - Forks: 0

MaryamBoneh/DeepLearning-Course

My exercises in the deep learning course

Language: Jupyter Notebook - Size: 215 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 2

ParitoshParmar/Piano-Skills-Assessment

Piano Skills Assessment [IEEE MMSP 2021]

Language: Python - Size: 854 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 17 - Forks: 2

sithu31296/audio-tagging

Easy to use Audio Tagging in PyTorch

Language: Python - Size: 3.22 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 17 - Forks: 4

agrija9/Avalinguo-Dataset-Speaker-Fluency-Level-Classification-Paper-

Code for paper "Speaker Fluency Level Classification using Machine Learning Techniques."

Language: Jupyter Notebook - Size: 24.7 MB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 17 - Forks: 7

koudounasalkis/voc2vec

This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025.

Language: Python - Size: 19.5 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 16 - Forks: 0

gibbona1/neal

NEAL (Nature+Energy Audio Labeller) is an open-source interactive audio data annotation tool.

Language: R - Size: 502 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 16 - Forks: 1

awsaf49/sonics

[ICLR 2025] SONICS: Synthetic Or Not - Identifying Counterfeit Songs

Language: Python - Size: 1.75 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 15 - Forks: 3

f0k/birdclef2018

BirdCLEF 2018 implementation

Language: Shell - Size: 69.3 KB - Last synced at: about 2 months ago - Pushed at: about 6 years ago - Stars: 15 - Forks: 4

sainathadapa/mediaeval-2019-moodtheme-detection

4th place solution to the MediaEval 2019 task "Emotion and Themes in Music using Jamendo"

Language: Jupyter Notebook - Size: 2.37 MB - Last synced at: 5 days ago - Pushed at: over 5 years ago - Stars: 14 - Forks: 4

ZenvilleErasmus/RAVDESS-emotions-speech-audio-only

1,440 audio files (.wav), i.e. speech files, from 24 actors that are categorized into 8 separate emotions.

Language: Python - Size: 209 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 14 - Forks: 2

zabir-nabil/audioperm

A python library for generating different permutations of audible segments from audio files.

Language: Jupyter Notebook - Size: 7.55 MB - Last synced at: 13 days ago - Pushed at: almost 3 years ago - Stars: 13 - Forks: 2

hernanrazo/human-voice-detection

Binary classification problem that aims to classify human voices from audio recordings. Implemented using PyTorch and Librosa.

Language: Python - Size: 98.6 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 13 - Forks: 0

koudounasalkis/Audio-Speech-Tutorial

This repository contains a short introduction on the topic of audio and speech processing -- from basics to applications.

Language: Jupyter Notebook - Size: 44.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 1

Anwarvic/CNN-for-Raw-Waveforms

This is my PyTorch implementation of the "Very Deep Convolutional Neural Networks For Raw Waveforms" research paper published in 2016.

Language: Jupyter Notebook - Size: 6.87 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 12 - Forks: 5

satvik-venkatesh/audio-seg-data-synth

Artificially synthesising data for audio segmentation to improve music-speech detection

Language: Jupyter Notebook - Size: 20.9 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 12 - Forks: 3

chen0040/java-tensorflow-samples

Java sample code showing how to integrate with TensorFlow

Language: Java - Size: 136 MB - Last synced at: 2 months ago - Pushed at: about 7 years ago - Stars: 12 - Forks: 4

Labbeti/SSLH

Deep Semi-Supervised Learning with Holistic methods for audio classification.

Language: Jupyter Notebook - Size: 3.02 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 10 - Forks: 1

tomfran/urban-sound-classification

UrbanSound8K dataset classification using MLP and CNN

Language: Jupyter Notebook - Size: 8.11 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 2

yas-sim/openvino-sound-classification-demo-rt

Real-time version of sound_classification_demo in the OpenVINO toolkit. Captures audio from the microphone, classifies it, and displays the result on screen with an illustration.

Language: Python - Size: 1.42 MB - Last synced at: about 2 months ago - Pushed at: almost 4 years ago - Stars: 10 - Forks: 0

chen0040/java-audio-embedding

Audio classifier, encoder, and search engine in Java

Language: Java - Size: 175 MB - Last synced at: 2 months ago - Pushed at: about 7 years ago - Stars: 10 - Forks: 1

RBGTOP/Music-Genre-Recognition

Music genre classification using deep learning

Size: 1.95 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 9 - Forks: 0

gitmehrdad/FACE

Urban Sound Annotation and Classification

Language: Jupyter Notebook - Size: 55.7 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 9 - Forks: 4

ksasso1028/KCN-AUDIO-CLASSIFICATION

Classification of drum samples using dilated convolutions

Language: Python - Size: 208 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 3

ShreeshaN/AlcoAudio

Detect alcohol induced intoxication level from a voice sample

Language: Python - Size: 8.99 MB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 9 - Forks: 3

sarthak268/Audio-Classification-using-MFCC-and-Spectrogram

Audio classification using a simple SVM classifier making use of MFCC and Spectrogram features coded from scratch

Language: Python - Size: 240 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 9 - Forks: 1

SpringerNLP/Chapter4

Chapter 4: Basics of Deep Learning

Language: Jupyter Notebook - Size: 633 KB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 9 - Forks: 5

agrija9/Avalinguo-Audio-Set

Avalinguo Audio Dataset: Dataset for Speaker Fluency Level Classification

Size: 277 MB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 9 - Forks: 4

mgoltzsche/essentia-container

Docker container to retrieve musical information from audio data using Essentia extractors

Language: Dockerfile - Size: 9.77 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 1

FilipTirnanic96/mfcc_extraction

Implementation of Mel-Frequency Cepstral Coefficients (MFCC) extraction

Language: Python - Size: 46.7 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 2

SughoshKulkarni/WildWav

Bird sound identification web application

Language: HTML - Size: 15 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 1

abishek-as/Audio-Classification-Deep-Learning

This repository explores audio classification with deep learning models: Artificial Neural Networks (ANN), 1D Convolutional Neural Networks (CNN1D), and 2D Convolutional Neural Networks (CNN2D). Basic data preprocessing and feature extraction are performed on the audio sources before the models are built, and the accuracy, training time, and prediction time of each model are then compared. The repository also covers model deployment, which lets users load the desired sound output for each successfully deployed model.

Language: Jupyter Notebook - Size: 47.1 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 8 - Forks: 6