Topic: "speech-classification"
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
Language: Jupyter Notebook - Size: 2.35 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 872 - Forks: 165

YuanGongND/ssast
Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
Language: Python - Size: 11.7 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 309 - Forks: 47

m3hrdadfi/soxan
Wav2Vec for speech recognition, classification, and audio classification
Language: Jupyter Notebook - Size: 3.57 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 197 - Forks: 28

kaistmm/Audio-Mamba-AuM
Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning"
Language: Python - Size: 10.7 MB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 108 - Forks: 13

HoseinAzad/Transformer-based-SER
Transformer-based model for Speech Emotion Recognition (SER), implemented in PyTorch
Language: Python - Size: 25.4 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 24 - Forks: 2

felixchenfy/Speech-Commands-Classification-by-LSTM-PyTorch
Classification of 11 types of audio clips using MFCC features and an LSTM. Pretrained on the Speech Command Dataset with intensive data augmentation.
Language: Jupyter Notebook - Size: 33.9 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 21 - Forks: 13

Rumeysakeskin/Speech-Command-Recognition
Classifies input audio segments into categories for keyword spotting with MatchboxNet. Covers training, exporting an ONNX model, and accelerating inference via TensorRT.
Language: Python - Size: 349 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 13 - Forks: 0

anik8gupta/Toxic_Speech_Classification
A full-fledged web application. Based on sentiment classification with the NLTK library, it predicts how strongly a piece of speech falls into each category: toxic, severe toxic, insult, obscene, and threat.
Language: Python - Size: 202 KB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 12 - Forks: 2

Sreyan88/Toxicity-Detection-in-Spoken-Utterances
This repository contains the code for the paper: "DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances"
Language: Jupyter Notebook - Size: 976 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 5

Jason-Oleana/speech-classification
In this challenge, the goal is to learn to recognize which of several English words is pronounced in an audio recording. This is a multiclass classification task.
Language: Jupyter Notebook - Size: 7.03 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 2

Rumeysakeskin/Speech-Data-Augmentation
Speech dataset processing and augmentation (add background noise and change speech pitch) for speech recognition
Language: Jupyter Notebook - Size: 22.3 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

EmanuelAlogna/Gender-Classification-using-ML
Gender Classification with different Machine Learning models, using the LibriSpeech ASR dataset.
Language: Jupyter Notebook - Size: 146 KB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 3

Rayyan9477/speech_emotion_classification
This project implements a speech emotion classification system using neural networks and genetic algorithms for optimization. The system classifies emotions such as calm, happy, sad, angry, fearful, surprise, and disgust from speech audio using the RAVDESS dataset.
Language: HTML - Size: 20.3 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2 - Forks: 0

deep-spin/speech-continuous-attention
Speech Classification using Continuous Attention Mechanisms
Language: Python - Size: 191 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

Mubarekethio/Voice-Recognition-Qafaraf-and-Amharic
Qafar-af and Amharic voice command recognition project to control the movement of a wheelchair
Language: Jupyter Notebook - Size: 137 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Chris-Winnard/Speech-Gender-Classifier
A convolutional neural network for gender classification, which achieved an F1-score of 94.3% when tested on the RAVDESS dataset. Created as postgraduate coursework; the report is included. The report also discusses Sodiq Adebiy's CNN, which I'd recommend to anyone interested in emotion classification.
Language: Jupyter Notebook - Size: 561 KB - Last synced at: 10 months ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

sarthak268/Multimedia-Computing-and-Applications
This repository contains code for all assignments in the Multimedia Computing and Applications (CSE563) course.
Language: Python - Size: 3.68 MB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

OgeNI/BVC_Challenging_Voice_Set
A database of challenging voice utterances collected by the Biometrics Vision and Computing (BVC) group.
Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Amir-Hofo/Speech-commands-Classification
In this notebook, we aim to recognize speech commands using classification. For this purpose, we used the SPEECHCOMMANDS dataset and the deep convolutional model M5. The code is written in Python and designed for the PyTorch platform.
Language: Jupyter Notebook - Size: 827 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

KrajShuffle/Classifying_SpeechAudio_CNN
CNN-based approach for audio file classification. Contains notebooks illustrating the data preprocessing, feature extraction, model training, and model inference workflows, as well as the overall pipeline.
Language: Jupyter Notebook - Size: 37.1 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

MilanaShhanukova/uni-research-dementia-detection
This project represents my research on dementia classification using audio data.
Language: Jupyter Notebook - Size: 5.62 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

ryanquinnnelson/CMU-11685-Utterance-to-Phoneme-Mapping
Fall 2021 Introduction to Deep Learning - Homework 3 Part 2 (RNN-based phoneme recognition)
Language: Python - Size: 954 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

manashpratim/Frame-Level-Classification-of-Speech
Language: Jupyter Notebook - Size: 94.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

vishaal27/IFN-Python
A Python implementation of the Iterative Feature Normalization algorithm
Language: Jupyter Notebook - Size: 416 KB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 1
