Topic: "voice-activity-detection"
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Language: Python - Size: 100 MB - Last synced at: 2 days ago - Pushed at: 21 days ago - Stars: 11,014 - Forks: 1,110

noisetorch/NoiseTorch
Real-time microphone noise suppression on Linux.
Language: Go - Size: 5.87 MB - Last synced at: 2 days ago - Pushed at: 5 months ago - Stars: 9,710 - Forks: 239

pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Language: Jupyter Notebook - Size: 252 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 7,671 - Forks: 889

smacke/ffsubsync
Automagically synchronize subtitles with video.
Language: Python - Size: 3.7 MB - Last synced at: 8 days ago - Pushed at: 4 months ago - Stars: 7,188 - Forks: 295

snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Language: Python - Size: 100 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 6,032 - Forks: 574

BingLingGroup/autosub Fork of iWangJiaxiang/autosub
Command-line utility to transcribe/translate from video/audio/subtitles to subtitles
Language: Python - Size: 1.29 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 1,986 - Forks: 245

jim-schwoebel/voice_datasets
A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
Size: 136 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 1,875 - Forks: 237

k2-fsa/sherpa-ncnn
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A etc.
Language: C++ - Size: 2.07 MB - Last synced at: 28 days ago - Pushed at: 29 days ago - Stars: 1,330 - Forks: 180

juanmc2005/diart
A python package to build AI-powered real-time audio applications
Language: Python - Size: 34.8 MB - Last synced at: 2 days ago - Pushed at: 4 months ago - Stars: 1,325 - Forks: 103

coqui-ai/open-speech-corpora
A list of accessible speech corpora for ASR, TTS, and other speech technologies
Size: 139 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 1,318 - Forks: 142

ggeop/Python-ai-assistant
Python AI assistant
Language: Python - Size: 2.99 MB - Last synced at: 20 days ago - Pushed at: 7 months ago - Stars: 977 - Forks: 247

jtkim-kaist/VAD
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Language: MATLAB - Size: 261 MB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 854 - Forks: 234

ina-foss/inaSpeechSegmenter
CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
Language: Python - Size: 36.6 MB - Last synced at: 27 days ago - Pushed at: 5 months ago - Stars: 802 - Forks: 138

amsehili/auditok
An audio/acoustic activity detection and audio segmentation tool
Language: Python - Size: 3.68 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 777 - Forks: 96

TEN-framework/ten-vad
TEN VAD: low-latency high-performance Voice Activity Detector
Language: C - Size: 9.59 MB - Last synced at: 5 days ago - Pushed at: 12 days ago - Stars: 508 - Forks: 42

baxtree/subaligner
Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/
Language: Python - Size: 103 MB - Last synced at: 24 days ago - Pushed at: about 2 months ago - Stars: 475 - Forks: 19

shashikg/WhisperS2T
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
Language: Jupyter Notebook - Size: 1.16 MB - Last synced at: 25 days ago - Pushed at: 10 months ago - Stars: 416 - Forks: 54

gtreshchev/RuntimeAudioImporter
Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime.
Language: C++ - Size: 10.1 MB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 388 - Forks: 80

jim-schwoebel/voicebook
A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).
Language: Python - Size: 299 MB - Last synced at: 23 days ago - Pushed at: over 2 years ago - Stars: 381 - Forks: 86

filippogiruzzi/voice_activity_detection
Voice Activity Detection based on Deep Learning & TensorFlow
Language: Python - Size: 238 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 363 - Forks: 69

gkonovalov/android-vad
Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
Language: C - Size: 5.18 MB - Last synced at: 25 days ago - Pushed at: 5 months ago - Stars: 342 - Forks: 76

tomchang25/whisper-auto-transcribe
Auto transcribe tool based on whisper
Language: Python - Size: 169 MB - Last synced at: 7 months ago - Pushed at: about 2 years ago - Stars: 220 - Forks: 15

Picovoice/cobra
On-device voice activity detection (VAD) powered by deep learning
Language: Python - Size: 43 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 217 - Forks: 15

nicklashansen/voice-activity-detection
Voice Activity Detection (VAD) using deep learning.
Language: Jupyter Notebook - Size: 2.41 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 196 - Forks: 33

eesungkim/Voice_Activity_Detector
A statistical model-based Voice Activity Detection
Language: Jupyter Notebook - Size: 168 KB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 192 - Forks: 41

RicherMans/GPV
Repository for our Interspeech2020 general-purpose voice activity detection (GPVAD) paper
Language: Python - Size: 8.85 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 142 - Forks: 29

zhenghuatan/rVADfast
This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.
Language: Python - Size: 3.62 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 140 - Forks: 24

voithru/voice-activity-detection
PyTorch implementation of Self-Attentive VAD, ICASSP 2021
Language: Python - Size: 11.3 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 125 - Forks: 21

zhenghuatan/rVAD
Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.
Language: MATLAB - Size: 1.05 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 123 - Forks: 29

RicherMans/Datadriven-GPVAD
The codebase for Data-driven general-purpose voice activity detection.
Language: Python - Size: 20.7 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 84 - Forks: 21

NickWilkinson37/voxseg
A python library for voice activity detection (VAD) for speech/non-speech segmentation.
Language: Python - Size: 98.1 MB - Last synced at: 6 months ago - Pushed at: almost 3 years ago - Stars: 83 - Forks: 12

Speech-Interaction-Technology-Aalto-U/itsp
Introduction to Speech Processing
Language: Jupyter Notebook - Size: 254 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 82 - Forks: 15

jim-schwoebel/voice_gender_detection
Detect a person's gender from a voice file (90.7% +/- 1.3% accuracy).
Language: Python - Size: 9.57 MB - Last synced at: 2 months ago - Pushed at: 12 months ago - Stars: 82 - Forks: 25

mgonzs13/whisper_ros
Speech-to-Text based on SileroVAD + whisper.cpp (GGML Whisper) for ROS 2
Language: C++ - Size: 1.92 MB - Last synced at: 15 days ago - Pushed at: 20 days ago - Stars: 76 - Forks: 18

Ankit-Kumar-Saini/Coursera_Deep_Learning_Specialization
Implementation of Logistic Regression, MLP, CNN, RNN & LSTM from scratch in python. Training of deep learning models for image classification, object detection, and sequence processing (including transformers implementation) in TensorFlow.
Language: Jupyter Notebook - Size: 208 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 74 - Forks: 54

spokestack/spokestack-android
Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS. Easily add voice to any Android app!
Language: Java - Size: 1.25 MB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 72 - Forks: 8

duj12/ASR-2Pass
ASR 2Pass onnxruntime and websocket server, based on FunASR (https://github.com/alibaba-damo-academy/FunASR).
Language: HTML - Size: 86.9 MB - Last synced at: 11 days ago - Pushed at: 3 months ago - Stars: 69 - Forks: 9

bunyaminergen/Callytics
Callytics is an advanced call analytics solution that leverages speech recognition and large language models (LLMs) technologies to analyze phone conversations from customer service and call centers.
Language: Python - Size: 23.9 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 65 - Forks: 10

spokestack/react-native-spokestack
Spokestack: give your React Native app a voice interface!
Language: TypeScript - Size: 6.52 MB - Last synced at: 1 day ago - Pushed at: about 3 years ago - Stars: 61 - Forks: 13

gooofy/py-nltools
A collection of basic python modules for spoken natural language processing
Language: Python - Size: 413 KB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 57 - Forks: 15

bigcash/awesome-vad
A curated list of awesome voice activity detection
Size: 9.77 KB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 50 - Forks: 2

spokestack/spokestack-ios
Spokestack: give your iOS app a voice interface!
Language: Swift - Size: 9.94 MB - Last synced at: 10 days ago - Pushed at: almost 4 years ago - Stars: 43 - Forks: 8

mechanicalsea/spectra
Spectra extraction tutorials based on torch and torchaudio.
Language: Jupyter Notebook - Size: 3.31 MB - Last synced at: 7 months ago - Pushed at: almost 2 years ago - Stars: 40 - Forks: 4

sooftware/End-to-End-Speech-Recognition-Models
PyTorch implementation of automatic speech recognition models.
Language: Python - Size: 84 KB - Last synced at: 2 months ago - Pushed at: over 4 years ago - Stars: 38 - Forks: 5

PiotrTa/Huawei-Challenge-Speaker-Identification
Trained speaker embedding deep learning models and evaluation pipelines in PyTorch and TensorFlow for speaker recognition.
Language: Jupyter Notebook - Size: 33.3 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 36 - Forks: 10

jim-schwoebel/nala
Nala is an agile open-source voice assistant framework (20+ actions).
Language: Python - Size: 40.7 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 35 - Forks: 15

SEPIA-Framework/sepia-web-audio
Create modular, cross-browser, web audio pipelines to record and process audio in background threads. Comes with modules for VAD, ASR, resampling and much more...
Language: JavaScript - Size: 9.69 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 35 - Forks: 3

jim-schwoebel/pauses
Quick library to extract pause lengths from audio files.
Language: Python - Size: 2.13 MB - Last synced at: 2 months ago - Pushed at: about 6 years ago - Stars: 31 - Forks: 7
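
Pause-length extraction of the kind this entry describes can be sketched on top of any frame-level VAD output. The snippet below is a minimal illustration, not the code from jim-schwoebel/pauses: the function name `pause_lengths`, the 20 ms frame size, and the rule that only silences between two speech runs count as pauses are all assumptions made for this example.

```python
def pause_lengths(labels, frame_ms=20):
    """Return durations (in ms) of non-speech runs that lie between two speech runs.

    `labels` is a sequence of booleans, one per frame (True = speech).
    Leading and trailing silence is ignored (an assumption of this sketch).
    """
    pauses = []
    run = 0               # length of the current silent run, in frames
    seen_speech = False   # becomes True once the first speech frame is seen
    for is_speech in labels:
        if is_speech:
            if seen_speech and run > 0:
                pauses.append(run * frame_ms)
            seen_speech = True
            run = 0
        elif seen_speech:
            run += 1
    return pauses


# Hypothetical frame labels: speech, a 3-frame pause, speech, a 1-frame pause, speech.
demo_labels = [False, True, True, False, False, False, True, False, True, True, False]
demo_pauses = pause_lengths(demo_labels)
print(demo_pauses)
```

The frame labels would normally come from a real detector (for example an energy threshold or a neural VAD); here they are hard-coded so the sketch stays self-contained.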

nianlonggu/WhisperSeg
Code for ICASSP 2024 paper WhisperSeg: Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection
Language: Python - Size: 243 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 29 - Forks: 9

jsvir/vad
[Tiny VAD] SG-VAD: Stochastic Gates Based Speech Activity Detection
Language: Python - Size: 1.71 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 26 - Forks: 3

usc-sail/mica-speech-activity-detection
Robust Speech Activity Detection (SAD) in movie audio
Language: Python - Size: 61.9 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 25 - Forks: 10

rohanprichard/fastrtc-demo
A simple POC of FastRTC, a framework to use voice mode in python!
Language: TypeScript - Size: 89.8 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 24 - Forks: 9

sepnic/litevad
Speech-end detection library, based on WebRTC's VAD engine
Language: C - Size: 453 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 22 - Forks: 5
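
Speech-end detection is commonly built by adding a "hangover" rule to per-frame VAD decisions: the utterance is declared finished only after a fixed number of consecutive non-speech frames. The sketch below shows that idea in Python under stated assumptions; it is not litevad's API, and `find_speech_end` and `hangover_frames` are hypothetical names chosen for illustration.

```python
def find_speech_end(labels, hangover_frames=3):
    """Return the index of the first frame of the silence that ends the utterance.

    `labels` holds one boolean per frame (True = speech). The end is reported
    once `hangover_frames` consecutive non-speech frames follow some speech.
    Returns None if no speech end is found.
    """
    silent = 0        # consecutive non-speech frames since the last speech frame
    started = False   # True once any speech frame has been observed
    for i, is_speech in enumerate(labels):
        if is_speech:
            started = True
            silent = 0
        elif started:
            silent += 1
            if silent >= hangover_frames:
                return i - hangover_frames + 1
    return None


# Hypothetical labels: a short dip at frame 3 is bridged, the real end starts at frame 5.
demo_labels = [False, True, True, False, True, False, False, False, True]
demo_end = find_speech_end(demo_labels)
print(demo_end)
```

A longer hangover tolerates short intra-word gaps at the cost of extra latency before the end is reported.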

bbc/bbc-speech-segmenter
A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.
Language: Shell - Size: 62.6 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 22 - Forks: 2

DictationDaddy/VAD_WEB_DEMO
In this repository, I show you how to use Silero VAD with the ONNX Runtime Web to run the VAD completely in the browser.
Language: JavaScript - Size: 2 MB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 20 - Forks: 1

idiap/zff_vad
Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering
Language: Python - Size: 631 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 19 - Forks: 1

pykeio/earshot
Ridiculously fast voice activity detection in pure #[no_std] Rust
Language: Rust - Size: 879 KB - Last synced at: 8 days ago - Pushed at: 8 months ago - Stars: 17 - Forks: 1

thurti/vad-audio-worklet
Voice Activity Detection (VAD) AudioWorklet
Language: JavaScript - Size: 762 KB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 5

xashru/robust-vad
Lightweight CNN for Robust Voice Activity Detection
Language: Python - Size: 18.6 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 16 - Forks: 1

CoEDL/vad-sli-asr
A pipeline to isolate and transcribe one language in mixed-language speech
Language: Python - Size: 350 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 16 - Forks: 3

baochuquan/ios-vad
iOS Voice Activity Detection (VAD). Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
Language: Swift - Size: 4.5 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 15 - Forks: 0

daanzu/py-silero-vad-lite
Lightweight wrapper for Silero VAD using internal ONNX Runtime and with no python package dependencies
Language: Python - Size: 1.9 MB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 14 - Forks: 1

sshh12/Conv-VAD
A packaged convolutional voice activity detector for noisy environments.
Language: Python - Size: 15.6 KB - Last synced at: 3 months ago - Pushed at: about 6 years ago - Stars: 14 - Forks: 2

wahibhaq/android-speaker-audioanalysis
This is my Masters thesis project titled "Speaker Detection and Conversation Analysis on Mobile Devices".
Language: Java - Size: 125 MB - Last synced at: almost 2 years ago - Pushed at: about 8 years ago - Stars: 14 - Forks: 16

ina-foss/InaGVAD
Voice activity detection and speaker gender segmentation audiovisual corpus
Language: Jupyter Notebook - Size: 1.4 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 13 - Forks: 1

panmasuo/voice-activity-detection
Voice activity detection algorithm written in C
Language: C - Size: 43.9 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 13 - Forks: 3

ZygoteCode/VadSharp
Enterprise VAD (Voice Activity Detection) in C#.NET (.NET 6.0+) with Microsoft.ML.Net, ONNXRuntime and DirectML. The easiest, efficient, and performant Silero VAD implementation! Always open for PRs.
Language: C# - Size: 354 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 12 - Forks: 1

ElmiraGhorbani/gpt-speaker-diarization
Conversational Speaker Diarization using OpenAI language models (GPT-4) and OpenAI Whisper.
Language: Jupyter Notebook - Size: 39.1 KB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 12 - Forks: 0

gooofy/py-vad-mh
Cython implementation of Moattar and Homayounpour's Voice Activity Detection (VAD) algorithm fast enough for real-time on an RPi 3.
Language: Python - Size: 73.2 KB - Last synced at: 2 months ago - Pushed at: almost 7 years ago - Stars: 12 - Forks: 2

egorsmkv/audio-katana
A tool to slice your audio files into chunks using the Voice Activity Detection technique
Language: Python - Size: 4.42 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 11 - Forks: 2

olami-developers/olami-android-hotword-detect-sdk
Hotword Detection (Wake Word Detection) Android library and sample codes
Size: 120 MB - Last synced at: 7 days ago - Pushed at: about 7 years ago - Stars: 11 - Forks: 2

itmo-mbss-lab/sr_labs_book
The project is related to the development of labs for the ITMO Speaker Recognition Course.
Language: Jupyter Notebook - Size: 3.25 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 10 - Forks: 8

dbklim/WebRTCVAD_Wrapper
A simple Python wrapper to simplify working with WebRTC VAD and its rougher analogue based on RMS and ZCR (useful for processing audio recordings before using them with neural networks).
Language: Python - Size: 625 KB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 9 - Forks: 3

pdadial/Speech_Emotion_Recognition_CNN-LSTM
CNN-LSTM based SER model using RAVDESS database
Language: Jupyter Notebook - Size: 202 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 2

pranshurastogi29/uis_rnn_for_speaker_diarization
Speaker diarization done on a toy dataset and tested on the TIMIT dataset
Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 0

goepfert/audio_features
Speech Recognition and Voice Activity Detection using a Convolutional Neural Network Architecture built with Tensorflow.js
Language: JavaScript - Size: 197 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 3

MorrisXu-Driving/Speech-Augmentation-and-Endpoint-Detection
This repository is developed in MATLAB. Speech Augmentation is based on Adaptive Filtering while Endpoint Detection is based on Voice Activity Detection (VAD).
Language: MATLAB - Size: 3.3 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 8 - Forks: 3

menardai/chromecast_vad
RNN implementation of a voice activity detector to control Chromecast device volume.
Language: Jupyter Notebook - Size: 195 MB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 8 - Forks: 0

guozhonghao1994/Voice_Activity_Detection_V2
2018 Lenovo AI Lab Summer Intern
Language: Python - Size: 18.5 MB - Last synced at: over 2 years ago - Pushed at: almost 7 years ago - Stars: 8 - Forks: 4

Mohamedhany99/Voice-Frequency-Extraction-Signal-Processing-
This script extracts the frequency of the voice detected in an audio file (preferably in ".wav" format).
Language: Python - Size: 94.7 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 1

guozhonghao1994/Voice_Activity_Detection_V1
2018 Lenovo AI Lab Summer Intern
Language: C - Size: 46.9 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 1

alexnaughtonjr/Real-Time-Voice-Cloning Fork of CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Size: 352 MB - Last synced at: 2 months ago - Pushed at: almost 4 years ago - Stars: 7 - Forks: 0

Ave-Sergeev/Dictator
Speech-to-Text translation service (Rust, Tonic) (2025)
Language: Rust - Size: 49.3 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 6 - Forks: 0

HolgerBovbjerg/SSL-PVAD
A repository for the code used to produce the results of the ICASSP 2024 paper: "Self-Supervised Pretraining for Robust Personalized Voice Activity Detection in Adverse Conditions"
Language: Python - Size: 5.3 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 6 - Forks: 0

PranavPutsa1006/Speaker-Diarization
Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python
Language: Jupyter Notebook - Size: 20.2 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 1

viswa5427/JARVIS-Personal_AI_Voice_Assistant
JARVIS-Personal_AI_Voice_Assistant
Language: Python - Size: 1.19 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 6 - Forks: 1

Saga9103/t2yLLM
A voice assistant with local LLM as a backend
Language: Python - Size: 342 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 5 - Forks: 0

kristofferv98/SemanthaVoiceAssistant
A comprehensive AI companion leveraging advanced semantic analysis, sentiment detection, and voice processing to provide personalized and context-aware interactions using Autogen, semantic-router, and VoiceProcessingToolkit.
Language: Python - Size: 85 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 0

AlexKly/Simple-Voice-Activity-Detector-using-MFCC-based-on-FPGA-Kintex
Voice Activity Detector based on MFCC features and DNN model
Language: VHDL - Size: 132 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 3

derrick56007/getsub
Download and sync subtitles automatically using Voice Activity Detection
Language: Python - Size: 53.7 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

rimshasaeed/Voice-Activity-Detection
Voice Activity Detection in speech signals using short time energy and zero-crossings rate
Language: MATLAB - Size: 6 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 5 - Forks: 0
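
The short-time energy and zero-crossing-rate approach named in this entry can be sketched in a few lines. The listed repository is written in MATLAB; the Python version below is an independent minimal illustration, not a port of it. The function names, the frame length of 160 samples, and both thresholds are assumptions chosen for the synthetic demo signal.

```python
import math


def frame_features(samples, frame_len):
    """Yield (short-time energy, zero-crossing rate) for each full frame."""
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Count sign changes between neighbouring samples.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        yield energy, crossings / (frame_len - 1)


def simple_vad(samples, frame_len=160, energy_thresh=0.01, zcr_thresh=0.25):
    """Mark a frame as speech when its energy is high and its ZCR is low.

    Thresholds are illustrative assumptions; real systems tune or adapt them.
    """
    return [
        energy > energy_thresh and zcr < zcr_thresh
        for energy, zcr in frame_features(samples, frame_len)
    ]


# Synthetic demo at 8 kHz: 100 ms of silence followed by 100 ms of a 200 Hz tone
# standing in for voiced speech.
sample_rate = 8000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
vad_labels = simple_vad(silence + tone)
print(vad_labels)
```

High energy separates sound from silence, while a low zero-crossing rate helps distinguish voiced speech from broadband noise, which tends to cross zero far more often.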

Kanium/flappinghead
A voice-activated puppet application
Language: Lua - Size: 2.24 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0

BenjaminNechicattu/Image-Editing-Using-Voice-Commands
Image Editing is made easier by Voice Commands!
Language: Python - Size: 21 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 3

binglel/vad_lrt_hmm
A statistical model-based Voice Activity Detector
Language: Python - Size: 725 KB - Last synced at: 11 months ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 1

IIT-PAVIS/Voice-Activity-Detection
A Real-world dataset and a new method for voice activity detection
Size: 6.92 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 1

Asteriskx/Kisaragi
Kisaragi is a TimeSignal application.
Language: C# - Size: 835 KB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 5 - Forks: 0

aidayang/FunASR-OneClick
FunASR real-time speech recognition edition: recognizes sound from the microphone and audio played on the computer; speech-to-text software for PC.
Size: 22.5 KB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 4 - Forks: 0

krithicswaroopan/AI-Voice-Assistance-Pipeline
A real-time voice-to-text and text-to-speech AI pipeline using Whisper, an LLM, and Edge-TTS with tunable parameters for low-latency audio processing and response generation.
Language: Python - Size: 80.2 MB - Last synced at: 13 days ago - Pushed at: 9 months ago - Stars: 4 - Forks: 1

IntendedConsequence/vadc
Uses the excellent silero VAD with onnxruntime C api for fast detection of audio segments with speech
Language: C++ - Size: 8.45 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

helemanc/ambient-intelligence
Application for Disruptive Situations Detection in public transport through Speech Emotion Recognition.
Language: Jupyter Notebook - Size: 998 MB - Last synced at: 8 months ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 1
