An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: voice-activity-detection

modelscope/FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Language: Python - Size: 100 MB - Last synced at: about 3 hours ago - Pushed at: about 4 hours ago - Stars: 9,912 - Forks: 995

xaionaro-go/audio

A package for Go to play back, record and process audio

Language: Go - Size: 168 KB - Last synced at: about 5 hours ago - Pushed at: about 6 hours ago - Stars: 3 - Forks: 0

ZygoteCode/VadSharp

Enterprise VAD (Voice Activity Detection) in C#.NET (.NET 6.0+) with Microsoft.ML.Net, ONNXRuntime and DirectML. The easiest, most efficient, and most performant Silero VAD implementation! Always open for PRs.

Language: C# - Size: 354 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 12 - Forks: 1

noisetorch/NoiseTorch

Real-time microphone noise suppression on Linux.

Language: Go - Size: 5.87 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 9,584 - Forks: 227

mvalancy-mt/logitech_bcc950

A talking eyeball on a stick - Logitech BCC950 PTZ camera control scripts

Language: Python - Size: 15.6 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

KarthikaRajagopal44/Beyond-Voice-Activity-Detection-Advanced-Turn-End-Prediction-in-Conversational-Agents

Moving Past VAD: Smarter Turn-Taking in Voice Assistants

Language: Python - Size: 1.95 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

techAli1996/wakeword

ESP32S3 Wakeword/Keyword Spotting starter project with ready to go ML model

Language: C - Size: 4.69 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

pyannote/pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook - Size: 252 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 7,277 - Forks: 866

webyneter/speech-to-console

A voice-controlled tool that converts spoken commands to text in your terminal

Language: Python - Size: 267 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

juanmc2005/diart

A python package to build AI-powered real-time audio applications

Language: Python - Size: 34.8 MB - Last synced at: 7 days ago - Pushed at: 2 months ago - Stars: 1,240 - Forks: 96

egorsmkv/pyannote-onnx-rust

Run Voice Activity Detection model PyAnnote using ONNX and Rust

Language: Rust - Size: 5.17 MB - Last synced at: 3 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

gtreshchev/RuntimeAudioImporter 📦

Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime.

Language: C++ - Size: 10.1 MB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 382 - Forks: 79

snakers4/silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python - Size: 100 MB - Last synced at: 9 days ago - Pushed at: 29 days ago - Stars: 5,550 - Forks: 536

nianlonggu/WhisperSeg

Code for ICASSP 2024 paper WhisperSeg: Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection

Language: Python - Size: 243 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 29 - Forks: 9

ggeop/Python-ai-assistant

Python AI assistant 🧠

Language: Python - Size: 2.99 MB - Last synced at: 9 days ago - Pushed at: 5 months ago - Stars: 966 - Forks: 250

pykeio/earshot

Ridiculously fast voice activity detection in pure #[no_std] Rust

Language: Rust - Size: 879 KB - Last synced at: 6 days ago - Pushed at: 6 months ago - Stars: 15 - Forks: 1

Picovoice/cobra

On-device voice activity detection (VAD) powered by deep learning

Language: Python - Size: 42.7 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 206 - Forks: 14

shashikg/WhisperS2T

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines

Language: Jupyter Notebook - Size: 1.16 MB - Last synced at: 10 days ago - Pushed at: 8 months ago - Stars: 388 - Forks: 49

mgonzs13/whisper_ros

Speech-to-Text based on SileroVAD + whisper.cpp (GGML Whisper) for ROS 2

Language: C++ - Size: 1.89 MB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 73 - Forks: 17

k2-fsa/sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A etc.

Language: C++ - Size: 2.04 MB - Last synced at: 12 days ago - Pushed at: 4 months ago - Stars: 1,262 - Forks: 172

ina-foss/inaSpeechSegmenter

CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Language: Python - Size: 36.6 MB - Last synced at: 10 days ago - Pushed at: 4 months ago - Stars: 797 - Forks: 136

gkonovalov/android-vad

Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.

Language: C - Size: 5.18 MB - Last synced at: 13 days ago - Pushed at: 3 months ago - Stars: 323 - Forks: 69

BingLingGroup/autosub Fork of iWangJiaxiang/autosub

Command-line utility to transcribe/translate from video/audio/subtitles to subtitles

Language: Python - Size: 1.29 MB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 1,986 - Forks: 245

Paradeluxe/Praditor

Praditor: A DBSCAN-Based Automation for Speech Onset Detection

Language: Python - Size: 109 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 2 - Forks: 0

jim-schwoebel/voice_gender_detection

♂️♀️ Detect a person's gender from a voice file (90.7% +/- 1.3% accuracy).

Language: Python - Size: 9.57 MB - Last synced at: 18 days ago - Pushed at: 10 months ago - Stars: 82 - Forks: 25

bigcash/awesome-vad

A curated list of awesome voice activity detection

Size: 9.77 KB - Last synced at: 11 days ago - Pushed at: 5 months ago - Stars: 48 - Forks: 2

rohanprichard/fastrtc-demo

A simple POC of FastRTC, a framework to use voice mode in Python!

Language: TypeScript - Size: 89.8 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 24 - Forks: 9

jim-schwoebel/voicebook

🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).

Language: Python - Size: 299 MB - Last synced at: 16 days ago - Pushed at: over 2 years ago - Stars: 380 - Forks: 85

duj12/ASR-2Pass

ASR 2Pass onnxruntime and websocket server, based on FunASR (https://github.com/alibaba-damo-academy/FunASR).

Language: HTML - Size: 86.9 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 64 - Forks: 8

zhenghuatan/rVADfast

This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.

Language: Python - Size: 3.62 MB - Last synced at: 10 days ago - Pushed at: 4 months ago - Stars: 137 - Forks: 23

baochuquan/ios-vad

iOS Voice Activity Detection (VAD). Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.

Language: Swift - Size: 4.5 MB - Last synced at: 21 days ago - Pushed at: 5 months ago - Stars: 10 - Forks: 0

DictationDaddy/VAD_WEB_DEMO

In this repository, I show you how to use SILERO VAD with ONNX-WEB runtime to run the VAD completely in the browser.

Language: JavaScript - Size: 2 MB - Last synced at: 11 days ago - Pushed at: 4 months ago - Stars: 20 - Forks: 1

baxtree/subaligner

Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/

Language: Python - Size: 103 MB - Last synced at: 9 days ago - Pushed at: about 2 months ago - Stars: 467 - Forks: 18

coqui-ai/open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

Size: 139 KB - Last synced at: 27 days ago - Pushed at: 11 months ago - Stars: 1,318 - Forks: 142

jim-schwoebel/voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

Size: 136 KB - Last synced at: 27 days ago - Pushed at: 11 months ago - Stars: 1,875 - Forks: 237

jsvir/vad

[Tiny VAD] SG-VAD: Stochastic Gates Based Speech Activity Detection

Language: Python - Size: 1.71 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 26 - Forks: 3

aidayang/FunASR-OneClick

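FunASR real-time speech recognition edition: recognizes audio from the microphone and audio played on the computer; voice typing software for PC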

Size: 5.86 KB - Last synced at: 17 days ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

Ave-Sergeev/Dictator

Speech-to-Text translation service (Rust, Tonic) (2025)

Language: Rust - Size: 49.3 MB - Last synced at: 19 days ago - Pushed at: 29 days ago - Stars: 6 - Forks: 0

bunyaminergen/Callytics

Callytics is an advanced call analytics solution that leverages speech recognition and large language models (LLMs) technologies to analyze phone conversations from customer service and call centers.

Language: Python - Size: 23.9 MB - Last synced at: 19 days ago - Pushed at: about 1 month ago - Stars: 65 - Forks: 10

idiap/zff_vad

Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering

Language: Python - Size: 631 KB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 19 - Forks: 1

amsehili/auditok

An audio/acoustic activity detection and audio segmentation tool

Language: Python - Size: 3.68 MB - Last synced at: 12 days ago - Pushed at: 4 months ago - Stars: 769 - Forks: 95

daanzu/py-silero-vad-lite

Lightweight wrapper for Silero VAD using internal ONNX Runtime and with no python package dependencies

Language: Python - Size: 1.9 MB - Last synced at: 2 days ago - Pushed at: 5 months ago - Stars: 10 - Forks: 1

ina-foss/InaGVAD

Voice activity detection and speaker gender segmentation audiovisual corpus

Language: Jupyter Notebook - Size: 1.4 MB - Last synced at: 22 days ago - Pushed at: 3 months ago - Stars: 10 - Forks: 1

oadultradeepfield/healthhack-vad

A VAD service analyzing factors linked to cognitive decline, developed for the HealthHack 2025 project.

Size: 0 Bytes - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

numq/voice-activity-detection

JVM library for voice activity detection written in Kotlin based on C library fvad and Silero

Language: Kotlin - Size: 2.94 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

alexnaughtonjr/Real-Time-Voice-Cloning Fork of CorentinJ/Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Size: 352 MB - Last synced at: 17 days ago - Pushed at: almost 4 years ago - Stars: 7 - Forks: 0

Speech-Interaction-Technology-Aalto-U/itsp

Introduction to Speech Processing

Language: Jupyter Notebook - Size: 254 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 82 - Forks: 15

spokestack/spokestack-android 📦

Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!

Language: Java - Size: 1.25 MB - Last synced at: 16 days ago - Pushed at: over 3 years ago - Stars: 72 - Forks: 8

ganlvtech/bing-stt

Rust implementation of bing "Search using voice" button speech recognition API (similar to Azure real-time speech to text API)

Language: Rust - Size: 17.6 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

ElmiraGhorbani/gpt-speaker-diarization

Conversational speaker diarization using OpenAI language models (GPT-4) and OpenAI Whisper.

Language: Jupyter Notebook - Size: 39.1 KB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 0

smacke/ffsubsync

Automagically synchronize subtitles with video.

Language: Python - Size: 3.69 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 6,994 - Forks: 290

panmasuo/voice-activity-detection

Voice activity detection algorithm written in C

Language: C - Size: 43.9 KB - Last synced at: 11 days ago - Pushed at: about 1 year ago - Stars: 13 - Forks: 3

bannawandoor27/Teletrix

Smart audio mixer that automatically mutes test tones when you speak - perfect for audio testing and development.

Language: Go - Size: 2.36 MB - Last synced at: 19 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

itmo-mbss-lab/sr_lectures_book

The project is related to the development of Basics of Voice Biometrics lecture book for the ITMO Speaker Recognition Course.

Language: TeX - Size: 1.15 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

spokestack/react-native-spokestack 📦

Spokestack: give your React Native app a voice interface!

Language: TypeScript - Size: 6.52 MB - Last synced at: about 9 hours ago - Pushed at: almost 3 years ago - Stars: 60 - Forks: 13

thurti/vad-audio-worklet

Voice Activity Detection (VAD) AudioWorklet

Language: JavaScript - Size: 762 KB - Last synced at: 2 days ago - Pushed at: 11 months ago - Stars: 13 - Forks: 4

jim-schwoebel/pauses

🎤 quick library to extract pause lengths from audio files.

Language: Python - Size: 2.13 MB - Last synced at: 10 days ago - Pushed at: almost 6 years ago - Stars: 31 - Forks: 7

nyumaya/libnyumaya_esp32

Experimental support for nyumaya audio recognition on ESP32

Language: C++ - Size: 5.35 MB - Last synced at: 16 days ago - Pushed at: about 2 years ago - Stars: 4 - Forks: 1

HolgerBovbjerg/SSL-PVAD

A repository for code used to produce the results of the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIVITY DETECTION IN ADVERSE CONDITIONS"

Language: Python - Size: 5.3 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 6 - Forks: 0

tomchang25/whisper-auto-transcribe

Auto transcribe tool based on whisper

Language: Python - Size: 169 MB - Last synced at: 5 months ago - Pushed at: almost 2 years ago - Stars: 220 - Forks: 15

filippogiruzzi/voice_activity_detection

Voice Activity Detection based on Deep Learning & TensorFlow

Language: Python - Size: 238 KB - Last synced at: 5 months ago - Pushed at: about 2 years ago - Stars: 355 - Forks: 69

kristofferv98/SemanthaVoiceAssistant

A comprehensive AI companion leveraging advanced semantic analysis, sentiment detection, and voice processing to provide personalized and context-aware interactions using Autogen, semantic-router, and VoiceProcessingToolkit.

Language: Python - Size: 85 KB - Last synced at: 23 days ago - Pushed at: 11 months ago - Stars: 5 - Forks: 0

jtkim-kaist/VAD

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

Language: MATLAB - Size: 261 MB - Last synced at: 5 months ago - Pushed at: almost 4 years ago - Stars: 842 - Forks: 235

spokestack/spokestack-ios 📦

Spokestack: give your iOS app a voice interface!

Language: Swift - Size: 9.94 MB - Last synced at: 19 days ago - Pushed at: over 3 years ago - Stars: 42 - Forks: 8

NickWilkinson37/voxseg

A python library for voice activity detection (VAD) for speech/non-speech segmentation.

Language: Python - Size: 98.1 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 83 - Forks: 12

RicherMans/GPV

Repository for our Interspeech2020 general-purpose voice activity detection (GPVAD) paper

Language: Python - Size: 8.85 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 142 - Forks: 29

eesungkim/Voice_Activity_Detector

A statistical model-based Voice Activity Detection

Language: Jupyter Notebook - Size: 168 KB - Last synced at: 5 months ago - Pushed at: over 6 years ago - Stars: 189 - Forks: 41

nicklashansen/voice-activity-detection

Voice Activity Detection (VAD) using deep learning.

Language: Jupyter Notebook - Size: 2.41 MB - Last synced at: 5 months ago - Pushed at: over 5 years ago - Stars: 191 - Forks: 32

itmo-mbss-lab/sr_labs_book

The project is related to the development of labs for the ITMO Speaker Recognition Course.

Language: Jupyter Notebook - Size: 3.25 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 8

krithicswaroopan/AI-Voice-Assistance-Pipeline

A real-time voice-to-text and text-to-speech AI pipeline using Whisper, an LLM, and Edge-TTS with tunable parameters for low-latency audio processing and response generation.

Language: Python - Size: 80.2 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

sooftware/End-to-End-Speech-Recognition-Models

PyTorch implementation of automatic speech recognition models.

Language: Python - Size: 84 KB - Last synced at: 12 days ago - Pushed at: over 4 years ago - Stars: 38 - Forks: 5

mechanicalsea/spectra

Spectra extraction tutorials based on torch and torchaudio.

Language: Jupyter Notebook - Size: 3.31 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 40 - Forks: 4

IntendedConsequence/vadc

Uses the excellent silero VAD with onnxruntime C api for fast detection of audio segments with speech

Language: C++ - Size: 8.45 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 4 - Forks: 0

developerdaya/MyFirstJarvisApp

Experience the power of a personal assistant with MyFirstJarvisApp! Perform tasks, get information, and manage your daily life seamlessly with voice commands. Perfect for anyone looking to simplify their routine and harness AI technology.

Language: Kotlin - Size: 24.7 MB - Last synced at: 8 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

Yifei-ZHAO96/Tr-VAD

Tr-VAD: An Efficient Transformer based Voice Activity Detection Model

Language: Python - Size: 5.55 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

binglel/vad_lrt_hmm

A statistical model-based Voice Activity Detector

Language: Python - Size: 725 KB - Last synced at: 9 months ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 1

jim-schwoebel/nala

🦁 Nala is an agile open-source voice assistant framework (20+ actions).

Language: Python - Size: 40.7 MB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 35 - Forks: 15

OpenVoiceOS/ovos-vad-plugin-silero

ovos plugin for voice activity detection using silero vad

Language: Python - Size: 1.78 MB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 2

neemiasbsilva/datascience-portfolio

Hello guys, welcome to my Data Science Portfolio. It includes some of the knowledge I have gained on my journey, along with case studies, papers, and code. Please check the README.

Language: Jupyter Notebook - Size: 53.4 MB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

tez3998/realtime-vad-sample

Sample code of real-time voice activity detection using webrtcvad.

Language: Python - Size: 9.77 KB - Last synced at: 1 day ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

sshh12/Conv-VAD

A packaged convolutional voice activity detector for noisy environments.

Language: Python - Size: 15.6 KB - Last synced at: 19 days ago - Pushed at: almost 6 years ago - Stars: 14 - Forks: 2

Mohamedhany99/Voice-Frequency-Extraction-Signal-Processing-

This script extracts the frequency of the voice detected in an audio file (preferably in ".wav" format)

Language: Python - Size: 94.7 KB - Last synced at: 6 days ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 1

zhenghuatan/rVAD

Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.

Language: MATLAB - Size: 1.05 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 123 - Forks: 29

egorsmkv/audio-katana

A tool to slice your audio files into chunks using the Voice Activity Detection technique

Language: Python - Size: 4.42 MB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 11 - Forks: 2

pranshurastogi29/uis_rnn_for_speaker_diarization

Speaker diarization done on a toy dataset and tested on the TIMIT dataset

Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: 3 days ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 0

gooofy/py-nltools

A collection of basic python modules for spoken natural language processing

Language: Python - Size: 413 KB - Last synced at: 8 days ago - Pushed at: over 5 years ago - Stars: 56 - Forks: 15

bbc/bbc-speech-segmenter

A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.

Language: Shell - Size: 62.6 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 22 - Forks: 2

bincrafters/conan-libfvad

Conan.io package for libfvad project

Language: Python - Size: 33.2 KB - Last synced at: 2 days ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Js-Mim/wagner_vad

Language: HTML - Size: 7.13 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 1

AmirHHasani/Automatic-Volume-Adjustment-AVA-

A project with the goal of controlling audio systems automatically

Language: Jupyter Notebook - Size: 187 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

kdffdwsfgdw43331/iidia

Size: 1.95 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

adscmksdfdasf9/changer

Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

voicechange1/changer

Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

itmo-mbss-lab/dsp_labs_book

The project is related to the development of labs for the ITMO Digital Signal Processes

Language: Jupyter Notebook - Size: 3.1 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 13

PiotrTa/Huawei-Challenge-Speaker-Identification

Trained speaker embedding deep learning models and evaluation pipelines in PyTorch and TensorFlow for speaker recognition.

Language: Jupyter Notebook - Size: 33.3 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 36 - Forks: 10

pdadial/Speech_Emotion_Recognition_CNN-LSTM

CNN-LSTM based SER model using RAVDESS database

Language: Jupyter Notebook - Size: 202 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 2

sypai/co-oCCur

co-oCCur: High-speed synchronization tool

Language: C++ - Size: 35.9 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

DevExpert0101/SpeechDoctor

Analyze an audio file and count words, sentences and timestamps, filler words

Language: Jupyter Notebook - Size: 5.15 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

usc-sail/mica-speech-activity-detection

Robust Speech Activity Detection (SAD) in movie audio

Language: Python - Size: 61.9 MB - Last synced at: 11 months ago - Pushed at: about 4 years ago - Stars: 25 - Forks: 10

voithru/voice-activity-detection

Pytorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021

Language: Python - Size: 11.3 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 125 - Forks: 21

Related Keywords
voice-activity-detection (155), vad (46), speech-recognition (34), voice-recognition (26), speech-processing (21), voice (20), python (20), voice-assistant (20), machine-learning (19), speech-to-text (19), voice-control (17), deep-learning (16), speech (16), pytorch (15), voice-commands (14), audio (12), asr (11), speech-activity-detection (10), audio-processing (10), speaker-diarization (8), whisper (8), voice-synthesis (8), voice-chat (8), speech-detection (7), text-to-speech (7), speaker-recognition (7), silero-vad (6), stt (6), tensorflow (6), speech-segmentation (6), voice-detection (6), webrtc (6), transcription (5), silero (5), subtitles (5), android (5), deep-neural-networks (5), voice-computing (5), neural-networks (5), tts (5), speaker-verification (5), speech-api (4), voice-conversion (4), audio-segmentation (4), cpp (4), wakeword (4), speaker-identification (4), speech-synthesis (4), onnx (4), voice-activity-detector (4), onnxruntime (4), natural-language-processing (4), mfcc (4), hacktoberfest (4), openai (4), offline (3), dnn (3), voice-changer (3), mfcc-features (3), rust (3), keras (3), gmm (3), real-time (3), fastapi (3), voice-changer-download (3), speaker-embedding (3), noise-robust (3), pulseaudio (3), signal-processing (3), automatic-speech-recognition (3), csharp (3), speech-emotion-recognition (3), gender-classification (3), dataset (3), acoustic-features (3), ios (3), wake-word-detection (3), lstm (3), cnn (3), speech-analysis (3), ovos (3), data (3), c (3), audio-analysis (3), gender (2), gender-equality (2), on-device-ai (2), embedding-models (2), domain-adaptation (2), decision-theory (2), calibration (2), noise (2), speaker-gender (2), automatic-voice-activity-detection (2), mlp (2), vosk (2), diarization (2), forced-alignment (2), voice-datasets (2), voice-dataset (2)