GitHub topics: audio-processing-with-python
mwasifanwar/VoicePrint-ID
Advanced speaker identification and verification system using deep learning. Features emotion recognition, language detection, and anti-spoofing capabilities for secure voice authentication applications.
Language: Python - Size: 60.5 KB - Last synced at: about 8 hours ago - Pushed at: about 10 hours ago - Stars: 0 - Forks: 0
NotAbhinavGamerz/emotion-aware-automatic-speech-recognition
Enhance speech recognition by detecting emotions in spoken language, combining OpenAI's Whisper and emotion analysis for deeper insights.
Language: Python - Size: 1.46 MB - Last synced at: about 17 hours ago - Pushed at: about 19 hours ago - Stars: 0 - Forks: 1
DevArqf/VoiceGuard
🛡️ Advanced Voice Authentication System using OpenAI ChatGPT & Whisper APIs. Secure voice biometric identification with AI-powered analysis, multi-sample enrollment, and enterprise-grade authentication logging. Python-based with SQLite database.
Language: Python - Size: 66.4 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 3 - Forks: 0
krushna4141/VoiceGuard
Enhance security with VoiceGuard, an AI-driven voice authentication system powered by OpenAI's ChatGPT and Whisper for reliable voice identification.
Language: Python - Size: 38.1 KB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0
antarades/emotion-aware-automatic-speech-recognition
An intelligent speech recognition system that combines OpenAI's Whisper for accurate transcription with dual emotion detection models. Analyzes both audio characteristics (tone, pitch, intensity) and textual content to provide comprehensive emotional context alongside transcriptions.
Language: Python - Size: 199 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0
Amin-moniry-pr7/Speech-to-Text-Transcription
This project automates audio processing by removing silence, transcribing speech to text, and storing the output in an SQLite database. It supports multiple audio formats and leverages Google Speech Recognition for high accuracy.
Language: Python - Size: 2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0
HealthTrack-app/custom_KWS
Custom KWS pipeline for single-speaker keyword spotting with CNN and MFCCs, 1 s 16 kHz audio, rich augmentations, and TensorFlow exports (.h5, .tflite) 📱💻
Language: Python - Size: 313 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0
moego0/custom_KWS
End-to-end pipeline for training a custom keyword detection model with TensorFlow & TFLite export
Language: Python - Size: 86.9 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0
akspa0/The-Machine
Multimedia context generation tool using off-the-shelf components. Leverages several local ML/AI tools to accomplish transcription, context clues, and LLM-driven tasks. Designed with extensibility in mind. Dataset preparation tool. Adds context to video and audio inputs.
Language: Python - Size: 2.5 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0
ap-atul/Audio-Denoising
Noise removal and reduction for audio files in Python. Denoising uses wavelets, with thresholding done by the VisuShrink technique.
Language: Python - Size: 26.4 KB - Last synced at: 7 months ago - Pushed at: over 2 years ago - Stars: 190 - Forks: 21
JanWilczek/dspyplot
Convenience functions for commonly used digital signal processing plots.
Language: Python - Size: 409 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0
RubisetCie/spectrogram-converter (fork of muhdhuz/audio2spec)
Scripts to convert audio files to spectrograms and back.
Language: Python - Size: 17.6 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0
glau-bd/audio-fingerprinting-fyp
FYP project of Gerald Lau, submitted to the Nanyang Technological University in partial fulfillment of the requirements for the Degree of Bachelor of Engineering (Computer Science). An application to embed links into the audio track of videos, using audio watermarking and audio fingerprinting technology.
Language: Vue - Size: 3.6 MB - Last synced at: 9 months ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 3
Thato-Mot/bird-species-classification-app
A sophisticated web application that identifies bird species from audio recordings using deep learning. Features multiple neural network models (MobileNetV2/VGG16), real-time audio visualization, and window-based analysis system. Built with Flask, TensorFlow, and Librosa.
Language: Python - Size: 179 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0
Chandana-20/Conversation-Mixer-Tool
Conversation-Mixer-Tool: A Python utility to merge two audio files (caller and receiver) into a seamless, conversation-like output. Features include speech detection, bandpass filtering, noise reduction, and smooth audio transitions.
Language: Python - Size: 4.84 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0
dhiogoboza/audio-convolution
Audio convolution with Python
Language: Python - Size: 14.7 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1
PavlosIsaris/music-file-transformer
A simple Python script that recursively searches for files and converts them to MP3 using ffmpeg.
Language: Python - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0
davidtkeane/Jervis-ChatGPT
This Python script is a voice-interface chatbot named Jervis. It uses OpenAI's GPT-3.5-turbo-instruct model to respond to user input, and replies using ElevenLabs voices. Conversations are saved to MongoDB and to a local MP3 file, and can be emailed if needed.
Language: Python - Size: 928 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
migperfer/harmonic_compatibility
Repo for my master's thesis at Pompeu Fabra University: Harmonic Compatibility for Loops in Electronic Music (the demo website might take a little while to load)
Language: Python - Size: 2.08 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 1
konradmaciejczyk/audio-signal-preprocessing-for-ml-classification-models
Language: Jupyter Notebook - Size: 4.67 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0
makerportal/rpi_i2s
Raspberry Pi I2S Stereo Microphone Analyses in Python
Language: Python - Size: 985 KB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 15 - Forks: 5
rahuls98/audio-tagging
Audio tagging is the process of inferring descriptive labels from audio clips (a multi-label classification task). This repository contains exploratory code and scripts for audio preprocessing and model fitting for audio tagging and its applications.
Language: Jupyter Notebook - Size: 7 MB - Last synced at: 9 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0
makerportal/quadmic
QuadMic Python Scripts for 4-microphone array audio analysis
Language: Python - Size: 10.7 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0
DebdutBiswas/audio-processing-with-python
Audio processing with Python for the e-Yantra robotics competition
Language: Python - Size: 5.32 MB - Last synced at: over 2 years ago - Pushed at: almost 9 years ago - Stars: 1 - Forks: 1