GitHub topics: voice-activity-detection
gbibbo/vad_benchmark
Privacy‑preserving VAD benchmark on domestic audio (CHiME‑Home): 8 models, accuracy vs efficiency.
Language: Python - Size: 18.4 MB - Last synced at: about 15 hours ago - Pushed at: about 15 hours ago - Stars: 0 - Forks: 0

pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Language: Jupyter Notebook - Size: 252 MB - Last synced at: about 15 hours ago - Pushed at: about 16 hours ago - Stars: 8,225 - Forks: 934

connorlyon10/MScProject
The repo for my MSc Data Science project, "Speech Processing in Real-Time with Convolutional Neural Networks".
Language: Jupyter Notebook - Size: 121 MB - Last synced at: about 18 hours ago - Pushed at: about 21 hours ago - Stars: 0 - Forks: 0

snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Language: Python - Size: 104 MB - Last synced at: 1 day ago - Pushed at: 11 days ago - Stars: 6,749 - Forks: 627

ricky0123/vad
Voice activity detector (VAD) for the browser with a simple API
Language: TypeScript - Size: 4.9 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,571 - Forks: 220

pmbstyle/EchoTap
Transcribe anything locally, fast and safe.
Language: Python - Size: 360 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

juanmc2005/diart
A python package to build AI-powered real-time audio applications
Language: Python - Size: 34.8 MB - Last synced at: 2 days ago - Pushed at: 7 months ago - Stars: 1,440 - Forks: 105

iamsrikanthnani/pluely
The Open Source Alternative to Cluely - A lightning-fast, privacy-first AI assistant that works seamlessly during meetings, interviews, and conversations without anyone knowing. Built with Tauri for native performance, just 10MB. Completely undetectable in video calls, screen shares, and recordings.
Language: TypeScript - Size: 113 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 551 - Forks: 65

stefanwebb/open-voice-activity-detection
Fully open-source and state-of-the-art Voice Activity Detection (VAD) models for academic research and commercial applications.
Language: Python - Size: 27.3 KB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

k2-fsa/sherpa-ncnn
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A etc.
Language: C++ - Size: 2.11 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,474 - Forks: 194

zhenghuatan/rVADfast
This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.
Language: Python - Size: 3.63 MB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 145 - Forks: 24

smacke/ffsubsync
Automagically synchronize subtitles with video.
Language: Python - Size: 3.7 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 7,332 - Forks: 297

baxtree/subaligner
Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/
Language: Python - Size: 103 MB - Last synced at: 1 day ago - Pushed at: about 1 month ago - Stars: 483 - Forks: 20

noisetorch/NoiseTorch
Real-time microphone noise suppression on Linux.
Language: Go - Size: 5.87 MB - Last synced at: 7 days ago - Pushed at: 8 months ago - Stars: 9,814 - Forks: 242

FluidInference/FluidAudio
Native Swift and CoreML SDK for local speaker diarization, VAD, and speech-to-text for real-time workloads. Works on iOS and macOS.
Language: Swift - Size: 13.7 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 560 - Forks: 65

dangvansam/livekit-plugins-tenvad
LiveKit plugin for TEN VAD: low-latency voice activity detection for real-time streaming, integrated with livekit-agents
Language: Python - Size: 3.45 MB - Last synced at: 1 day ago - Pushed at: 11 days ago - Stars: 3 - Forks: 1

modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Language: Python - Size: 100 MB - Last synced at: 7 days ago - Pushed at: 22 days ago - Stars: 12,305 - Forks: 1,233

amsehili/auditok
An audio/acoustic activity detection and audio segmentation tool
Language: Python - Size: 3.68 MB - Last synced at: 8 days ago - Pushed at: 9 months ago - Stars: 798 - Forks: 98

mgonzs13/whisper_ros
Speech-to-Text based on SileroVAD + whisper.cpp (GGML Whisper) for ROS 2
Language: C++ - Size: 1.94 MB - Last synced at: 7 days ago - Pushed at: 2 months ago - Stars: 81 - Forks: 19

pykeio/earshot
Ridiculously fast voice activity detection in pure #[no_std] Rust
Language: Rust - Size: 879 KB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 22 - Forks: 1

pmbstyle/Alice
Alice is a smart desktop AI assistant application built with Vue.js, Vite, and Electron. Advanced memory system, function calling, MCP support and more.
Language: TypeScript - Size: 79.5 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 162 - Forks: 17

techAli1996/wakeword
ESP32S3 Wakeword/Keyword Spotting starter project with ready to go ML model
Language: C - Size: 4.68 MB - Last synced at: about 12 hours ago - Pushed at: 11 days ago - Stars: 2 - Forks: 0

TEN-framework/ten-vad
Voice Activity Detector (VAD) from TEN: low-latency, high-performance and lightweight
Language: C - Size: 9.6 MB - Last synced at: 11 days ago - Pushed at: 26 days ago - Stars: 1,350 - Forks: 114

duj12/ASR-2Pass
ASR 2Pass onnxruntime and websocket server, based on FunASR (https://github.com/alibaba-damo-academy/FunASR).
Language: HTML - Size: 86.9 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 76 - Forks: 10

ina-foss/inaSpeechSegmenter
CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
Language: Python - Size: 36.6 MB - Last synced at: 9 days ago - Pushed at: 8 months ago - Stars: 832 - Forks: 141

PranavMishra17/Streaming-Digit-Detector
A real-time audio digit classification system that recognizes spoken numbers (0-9) through live microphone streaming. Features multiple classification approaches including TTS APIs, Fourier analysis, MFCC, and MEL features with performance benchmarking and inference time tracking.
Language: Python - Size: 279 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

bigcash/awesome-vad
A curated list of awesome voice activity detection
Size: 9.77 KB - Last synced at: about 24 hours ago - Pushed at: 10 months ago - Stars: 62 - Forks: 3

Picovoice/cobra
On-device voice activity detection (VAD) powered by deep learning
Language: Python - Size: 43.2 MB - Last synced at: 15 days ago - Pushed at: 24 days ago - Stars: 227 - Forks: 14

spokestack/react-native-spokestack 📦
Spokestack: give your React Native app a voice interface!
Language: TypeScript - Size: 6.52 MB - Last synced at: 15 days ago - Pushed at: over 3 years ago - Stars: 62 - Forks: 13

RimAmarat/RealTimeSpeechRec
Real Time Speech Recognition with Voice Activity Detection using Pytorch
Language: Python - Size: 37.1 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0

bincrafters/conan-libfvad
Conan.io package for libfvad project
Language: Python - Size: 33.2 KB - Last synced at: 16 days ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

Paradeluxe/Praditor
Praditor: A DBSCAN-Based Automation for Speech Onset Detection
Language: Python - Size: 195 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 0

RustedBytes/pyannote-onnx-rust
Run Voice Activity Detection model PyAnnote using ONNX and Rust
Language: Rust - Size: 5.17 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

gtreshchev/RuntimeAudioImporter 📦
Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime.
Language: C++ - Size: 10.1 MB - Last synced at: 5 days ago - Pushed at: 6 months ago - Stars: 392 - Forks: 82

gooofy/py-nltools
A collection of basic python modules for spoken natural language processing
Language: Python - Size: 413 KB - Last synced at: 11 days ago - Pushed at: almost 6 years ago - Stars: 55 - Forks: 15

baochuquan/ios-vad
iOS Voice Activity Detection (VAD). Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
Language: Swift - Size: 4.5 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 21 - Forks: 2

jim-schwoebel/voice_gender_detection
♂️♀️ Detect a person's gender from a voice file (90.7% +/- 1.3% accuracy).
Language: Python - Size: 9.57 MB - Last synced at: 21 days ago - Pushed at: about 1 year ago - Stars: 87 - Forks: 25

nianlonggu/WhisperSeg
Code for ICASSP 2024 paper WhisperSeg: Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection
Language: Python - Size: 243 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 33 - Forks: 12

Saga9103/t2yLLM
A voice assistant with local LLM as a backend
Language: Python - Size: 213 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 6 - Forks: 0

spokestack/spokestack-ios 📦
Spokestack: give your iOS app a voice interface!
Language: Swift - Size: 9.94 MB - Last synced at: 26 days ago - Pushed at: about 4 years ago - Stars: 45 - Forks: 9

gkonovalov/android-vad
Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
Language: C - Size: 5.16 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 374 - Forks: 79

shashikg/WhisperS2T
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
Language: Jupyter Notebook - Size: 1.16 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 435 - Forks: 59

daanzu/py-silero-vad-lite
Lightweight wrapper for Silero VAD using internal ONNX Runtime and with no python package dependencies
Language: Python - Size: 1.9 MB - Last synced at: about 11 hours ago - Pushed at: 9 months ago - Stars: 15 - Forks: 1

Speech-Interaction-Technology-Aalto-U/itsp
Introduction to Speech Processing
Language: Jupyter Notebook - Size: 254 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 97 - Forks: 16

tomchang25/whisper-auto-transcribe
Auto transcribe tool based on whisper
Language: Python - Size: 169 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 226 - Forks: 16

OpenVoiceOS/ovos-vad-plugin-silero
ovos plugin for voice activity detection using silero vad
Language: Python - Size: 1.81 MB - Last synced at: about 7 hours ago - Pushed at: about 9 hours ago - Stars: 0 - Forks: 2

OpenVoiceOS/ovos-vad-plugin-webrtcvad
ovos plugin for voice activity detection using webrtcvad
Language: Python - Size: 27.3 KB - Last synced at: 19 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

thurti/vad-audio-worklet
Voice Activity Detection (VAD) AudioWorklet
Language: JavaScript - Size: 762 KB - Last synced at: 25 days ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 5

ZygoteCode/VadSharp
Enterprise VAD (Voice Activity Detection) in C#.NET (.NET 6.0+) with Microsoft.ML.Net, ONNXRuntime and DirectML. The easiest, efficient, and performant Silero VAD implementation! Always open for PRs.
Language: C# - Size: 354 KB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 1

pranjal-pravesh/stt-silero-whisper
Real-time speech to text using voice activity detection (with silero-VAD) and transcriptions using faster-whisper model
Language: Python - Size: 35.2 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

aidayang/FunASR-OneClick
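FunASR real-time speech recognition edition: recognizes sound from the microphone and audio played on the computer; voice typing software for PC.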
Size: 22.5 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0

Swanand-Wagh/Socraitive
Language: TypeScript - Size: 14 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 2

idiap/zff_vad
Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering
Language: Python - Size: 631 KB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 21 - Forks: 2

edyamza/Voice-Activity-Detection-WebRTC-Silero
This is a python project. We compare the metrics of 2 already trained AI models - WebRTC & Silero.
Language: Python - Size: 11.7 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

ggeop/Python-ai-assistant
Python AI assistant 🧠
Language: Python - Size: 2.99 MB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 977 - Forks: 247

jim-schwoebel/voicebook
🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).
Language: Python - Size: 299 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 381 - Forks: 86

egorsmkv/marblenet-inference
Inference code for Frame MarbleNet (VAD from NeMo)
Language: Python - Size: 57.6 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

itmo-mbss-lab/sr_labs_book
The project is related to the development of labs for the ITMO Speaker Recognition Course.
Language: Jupyter Notebook - Size: 3.25 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 10 - Forks: 8

sepnic/litevad
Speech-end detection library, based on WebRTC's VAD engine
Language: C - Size: 453 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 22 - Forks: 5

krithicswaroopan/AI-Voice-Assistance-Pipeline
A real-time voice-to-text and text-to-speech AI pipeline using Whisper, an LLM, and Edge-TTS with tunable parameters for low-latency audio processing and response generation.
Language: Python - Size: 80.2 MB - Last synced at: 3 months ago - Pushed at: 12 months ago - Stars: 4 - Forks: 1

RicherMans/GPV
Repository for our Interspeech2020 general-purpose voice activity detection (GPVAD) paper
Language: Python - Size: 8.85 MB - Last synced at: 4 months ago - Pushed at: about 2 years ago - Stars: 142 - Forks: 29

ina-foss/InaGVAD
Voice activity detection and speaker gender segmentation audiovisual corpus
Language: Jupyter Notebook - Size: 1.4 MB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 13 - Forks: 1

Ave-Sergeev/Dictator
Speech-to-Text translation service (Rust, Tonic) (2025)
Language: Rust - Size: 49.3 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 6 - Forks: 0

spokestack/spokestack-android 📦
Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Language: Java - Size: 1.25 MB - Last synced at: 2 months ago - Pushed at: almost 4 years ago - Stars: 74 - Forks: 10

sudydtdtgxdjchdyfghxyfgjcj/MLP-From-Scratch
A C++ implementation of a Multilayer Perceptron (MLP) neural network using Eigen, supporting multiple activation and loss functions, mini-batch gradient descent, and backpropagation for training.
Language: C++ - Size: 23.4 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

filippogiruzzi/voice_activity_detection
Voice Activity Detection based on Deep Learning & TensorFlow
Language: Python - Size: 238 KB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 363 - Forks: 69

nicklashansen/voice-activity-detection
Voice Activity Detection (VAD) using deep learning.
Language: Jupyter Notebook - Size: 2.41 MB - Last synced at: 4 months ago - Pushed at: almost 6 years ago - Stars: 196 - Forks: 33

nosoy77/logitech_bcc950
A talking eyeball on a stick - Logitech BCC950 PTZ camera control scripts
Language: Python - Size: 14.6 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

xaionaro-go/audio
A package for Go to playback, record and process audio
Language: Go - Size: 168 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

mvalancy-mt/logitech_bcc950
A talking eyeball on a stick - Logitech BCC950 PTZ camera control scripts
Language: Python - Size: 15.6 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

KarthikaRajagopal44/Beyond-Voice-Activity-Detection-Advanced-Turn-End-Prediction-in-Conversational-Agents
Moving Past VAD: Smarter Turn-Taking in Voice Assistants
Language: Python - Size: 1.95 KB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

webyneter/speech-to-console
A voice-controlled tool that converts spoken commands to text in your terminal
Language: Python - Size: 267 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

jtkim-kaist/VAD
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Language: MATLAB - Size: 261 MB - Last synced at: 4 months ago - Pushed at: about 4 years ago - Stars: 854 - Forks: 234

BingLingGroup/autosub Fork of iWangJiaxiang/autosub
Command-line utility to transcribe/translate from video/audio/subtitles to subtitles
Language: Python - Size: 1.29 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 1,986 - Forks: 245

eesungkim/Voice_Activity_Detector
A statistical model-based Voice Activity Detection
Language: Jupyter Notebook - Size: 168 KB - Last synced at: 4 months ago - Pushed at: almost 7 years ago - Stars: 192 - Forks: 41

rohanprichard/fastrtc-demo
A simple POC of FastRTC, a framework to use voice mode in python!
Language: TypeScript - Size: 89.8 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 24 - Forks: 9

DictationDaddy/VAD_WEB_DEMO
In this repository, I show you how to use SILERO VAD with ONNX-WEB runtime to run the VAD completely in the browser.
Language: JavaScript - Size: 2 MB - Last synced at: 5 months ago - Pushed at: 8 months ago - Stars: 20 - Forks: 1

coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Size: 139 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 1,318 - Forks: 142

jim-schwoebel/voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
Size: 136 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 1,875 - Forks: 237

jsvir/vad
[Tiny VAD] SG-VAD: Stochastic Gates Based Speech Activity Detection
Language: Python - Size: 1.71 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 26 - Forks: 3

bunyaminergen/Callytics
Callytics is an advanced call analytics solution that leverages speech recognition and large language models (LLMs) technologies to analyze phone conversations from customer service and call centers.
Language: Python - Size: 23.9 MB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 65 - Forks: 10

numq/voice-activity-detection
JVM library for voice activity detection written in Kotlin based on C library fvad and Silero
Language: Kotlin - Size: 2.94 MB - Last synced at: 7 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

oadultradeepfield/healthhack-vad
A VAD service analyzing factors linked to cognitive decline, developed for the HealthHack 2025 project.
Size: 0 Bytes - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

ganlvtech/bing-stt
Rust implementation of bing "Search using voice" button speech recognition API (similar to Azure real-time speech to text API)
Language: Rust - Size: 17.6 KB - Last synced at: 4 months ago - Pushed at: 7 months ago - Stars: 3 - Forks: 1

alexnaughtonjr/Real-Time-Voice-Cloning Fork of CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Size: 352 MB - Last synced at: 5 months ago - Pushed at: about 4 years ago - Stars: 7 - Forks: 0

ElmiraGhorbani/gpt-speaker-diarization
Conversational Speaker Diarization using OpenAI Language Models (gpt-4) and OpenAI Whisper.
Language: Jupyter Notebook - Size: 39.1 KB - Last synced at: 5 months ago - Pushed at: about 2 years ago - Stars: 12 - Forks: 0

panmasuo/voice-activity-detection
Voice activity detection algorithm written in C
Language: C - Size: 43.9 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 13 - Forks: 3

itmo-mbss-lab/sr_lectures_book
The project is related to the development of Basics of Voice Biometrics lecture book for the ITMO Speaker Recognition Course.
Language: TeX - Size: 1.28 MB - Last synced at: 4 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

bannawandoor27/Teletrix
Smart audio mixer that automatically mutes test tones when you speak - perfect for audio testing and development.
Language: Go - Size: 2.36 MB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

jim-schwoebel/pauses
🎤 quick library to extract pause lengths from audio files.
Language: Python - Size: 2.13 MB - Last synced at: 5 months ago - Pushed at: over 6 years ago - Stars: 31 - Forks: 7

nyumaya/libnyumaya_esp32
Experimental support for nyumaya audio recognition on ESP32
Language: C++ - Size: 5.35 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 1

HolgerBovbjerg/SSL-PVAD
A repository for code used to produce the results of the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIVITY DETECTION IN ADVERSE CONDITIONS"
Language: Python - Size: 5.3 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 6 - Forks: 0

kristofferv98/SemanthaVoiceAssistant
A comprehensive AI companion leveraging advanced semantic analysis, sentiment detection, and voice processing to provide personalized and context-aware interactions using Autogen, semantic-router, and VoiceProcessingToolkit.
Language: Python - Size: 85 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

NickWilkinson37/voxseg
A python library for voice activity detection (VAD) for speech/non-speech segmentation.
Language: Python - Size: 98.1 MB - Last synced at: 9 months ago - Pushed at: almost 3 years ago - Stars: 83 - Forks: 12

sooftware/End-to-End-Speech-Recognition-Models
PyTorch implementation of automatic speech recognition models.
Language: Python - Size: 84 KB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 38 - Forks: 5

mechanicalsea/spectra
Spectra extraction tutorials based on torch and torchaudio.
Language: Jupyter Notebook - Size: 3.31 MB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 40 - Forks: 4

IntendedConsequence/vadc
Uses the excellent silero VAD with onnxruntime C api for fast detection of audio segments with speech
Language: C++ - Size: 8.45 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

developerdaya/MyFirstJarvisApp
Experience the power of a personal assistant with MyFirstJarvisApp! Perform tasks, get information, and manage your daily life seamlessly with voice commands. Perfect for anyone looking to simplify their routine and harness AI technology.
Language: Kotlin - Size: 24.7 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

Yifei-ZHAO96/Tr-VAD
Tr-VAD: An Efficient Transformer based Voice Activity Detection Model
Language: Python - Size: 5.55 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

binglel/vad_lrt_hmm
A statistical model-based Voice Activity Detector
Language: Python - Size: 725 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 5 - Forks: 1
