Topic: "speech-to-text"
Picovoice/leopard
On-device speech-to-text engine powered by deep learning
Language: Python - Size: 419 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 454 - Forks: 28

edenai/edenai-apis
Eden AI: simplify the use and deployment of AI technologies by providing a unique API that connects to the best possible AI engines
Language: Python - Size: 158 MB - Last synced at: about 10 hours ago - Pushed at: about 11 hours ago - Stars: 448 - Forks: 67

jonatasgrosman/huggingsound
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
Language: Python - Size: 598 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 447 - Forks: 45

AdolfVonKleist/Phonetisaurus
Phonetisaurus G2P
Language: Shell - Size: 2.24 MB - Last synced at: 11 months ago - Pushed at: 12 months ago - Stars: 440 - Forks: 122

toverainc/willow-inference-server
Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS
Language: Python - Size: 3.27 MB - Last synced at: 6 days ago - Pushed at: 12 months ago - Stars: 440 - Forks: 47

ccoreilly/vosk-browser
A speech recognition library running in the browser thanks to a WebAssembly build of Vosk
Language: JavaScript - Size: 707 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 423 - Forks: 66

OpenNewsLabs/autoEdit_2
Fast text-based video editing: a Node.js Electron OS X desktop app with a Backbone front end.
Language: JavaScript - Size: 111 MB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 421 - Forks: 56

double22a/speech_dataset
The dataset of Speech Recognition
Size: 74.2 KB - Last synced at: 8 days ago - Pushed at: 5 months ago - Stars: 413 - Forks: 77

DeutscheKI/tevr-asr-tool
State-of-the-art (ranked #1 Aug 2022) German Speech Recognition in 284 lines of C++. This is a 100% private 100% offline 100% free CLI tool.
Language: C - Size: 289 KB - Last synced at: 30 days ago - Pushed at: almost 3 years ago - Stars: 413 - Forks: 18

VolcanicArts/VRCOSC
Modular OSC program creator, toolkit, and router made for VRChat. Show your heartrate, time, hardware stats, speech to text, control Spotify, and more! Includes drag-and-drop prefabs for your avatar.
Language: C# - Size: 8.49 MB - Last synced at: 1 day ago - Pushed at: 22 days ago - Stars: 408 - Forks: 31

haydenbleasel/orate
The AI toolkit for speech.
Language: TypeScript - Size: 4.49 MB - Last synced at: 19 days ago - Pushed at: about 1 month ago - Stars: 407 - Forks: 23

ElishaAz/Sayboard
An open-source on-device voice IME (keyboard) for Android using the Vosk library.
Language: Kotlin - Size: 271 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 406 - Forks: 24

shashikg/WhisperS2T
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
Language: Jupyter Notebook - Size: 1.16 MB - Last synced at: 5 days ago - Pushed at: 9 months ago - Stars: 406 - Forks: 49

rtzr/Awesome-Korean-Speech-Recognition
A list of Korean speech recognition (STT) APIs, with performance benchmarks for each.
Size: 86.9 KB - Last synced at: 11 days ago - Pushed at: 19 days ago - Stars: 403 - Forks: 22

revdotcom/reverb
Open source inference code for Rev's model
Language: Python - Size: 507 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 399 - Forks: 25

modelscope/FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.
Language: Python - Size: 1.46 MB - Last synced at: 28 days ago - Pushed at: over 1 year ago - Stars: 396 - Forks: 33

ArthurFDLR/whisper-youtube
🔉 YouTube Video Transcription with OpenAI's Whisper
Language: Jupyter Notebook - Size: 124 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 393 - Forks: 114

bugbakery/transcribee
open source audio and video transcription software
Language: TypeScript - Size: 5.25 MB - Last synced at: 24 days ago - Pushed at: about 1 month ago - Stars: 392 - Forks: 27

egorsmkv/speech-recognition-uk
🇺🇦 Speech Recognition & Synthesis for Ukrainian
Language: Python - Size: 2.42 MB - Last synced at: about 3 hours ago - Pushed at: about 4 hours ago - Stars: 385 - Forks: 22

Evil0ctal/Fast-Powerful-Whisper-AI-Services-API
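⚡ A high-performance asynchronous API for automatic speech recognition (ASR) and translation. No need to pay for the Whisper API: inference runs on a locally hosted Whisper model, with multi-GPU concurrency and a design aimed at distributed deployment. It also includes built-in crawlers for social media platforms such as TikTok and Douyin, enabling seamless media processing across multiple platforms and providing a powerful, scalable solution for automated processing of media content.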
Language: Python - Size: 1.21 MB - Last synced at: 4 days ago - Pushed at: 2 months ago - Stars: 368 - Forks: 42

speechbrain/speechbrain.github.io
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Language: HTML - Size: 46.8 MB - Last synced at: 8 days ago - Pushed at: 5 months ago - Stars: 365 - Forks: 29

echogarden-project/echogarden
Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.
Language: TypeScript - Size: 1.72 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 359 - Forks: 39

mailong25/self-supervised-speech-recognition
speech to text with self-supervised learning based on wav2vec 2.0 framework
Language: Python - Size: 13.1 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 359 - Forks: 113

alphacep/vosk
VOSK Speech Recognition Toolkit
Language: C - Size: 42 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 355 - Forks: 43

oliverguhr/wav2vec2-live
Live speech recognition using Facebook's wav2vec 2.0 model.
Language: Python - Size: 2.84 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 348 - Forks: 56

Nikorasu/LiveWhisper
A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.
Language: Python - Size: 54.7 KB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 345 - Forks: 47

daanzu/kaldi-active-grammar
Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time
Language: Python - Size: 579 KB - Last synced at: 7 days ago - Pushed at: almost 2 years ago - Stars: 343 - Forks: 51

Xewdy444/Playwright-reCAPTCHA
A Python library for solving reCAPTCHA v2 and v3 with Playwright
Language: Python - Size: 440 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 341 - Forks: 43

Carleslc/AudioToText
Transcribe and translate audio to text using Whisper and DeepL.
Language: Jupyter Notebook - Size: 19.4 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 323 - Forks: 44

SergeyShk/Speech-to-Text-Russian
A project for Russian speech recognition based on pykaldi.
Language: Python - Size: 72.5 MB - Last synced at: 6 months ago - Pushed at: 9 months ago - Stars: 322 - Forks: 54

Renovamen/Speech-and-Text
Speech to text (PocketSphinx, iFlytek API, Baidu API) and text to speech (pyttsx3)
Language: Python - Size: 63.5 KB - Last synced at: about 1 month ago - Pushed at: almost 6 years ago - Stars: 319 - Forks: 75

yohasebe/openai-chat-api-workflow
🎩 An Alfred 5 Workflow for using OpenAI Chat API to interact with GPT models 🤖💬 It also allows image generation/editing/understanding 🖼️, speech-to-text conversion 🎤, and text-to-speech synthesis 🔈
Size: 113 MB - Last synced at: 2 days ago - Pushed at: 16 days ago - Stars: 315 - Forks: 9

hirofumi0810/tensorflow_end2end_speech_recognition
End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)
Language: Python - Size: 4.17 MB - Last synced at: 6 months ago - Pushed at: over 7 years ago - Stars: 313 - Forks: 120

Adri6336/gpt-voice-conversation-chatbot
Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.
Language: Python - Size: 2.92 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 305 - Forks: 48

deepgram/deepgram-python-sdk
Official Python SDK for Deepgram.
Language: Python - Size: 16 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 304 - Forks: 82

Migushthe2nd/MsEdgeTTS
A simple Azure Speech Service module that uses the Microsoft Edge Read Aloud API
Language: TypeScript - Size: 265 KB - Last synced at: 29 days ago - Pushed at: 4 months ago - Stars: 302 - Forks: 44

NsLearning/LangHelper
Aims to be a full-featured language-learning application built on ChatGPT, TTS, STT, and other AI models; supports conversation, speaking assessment, memorizing words in context, listening tests, and so on.
Language: Rust - Size: 31.1 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 297 - Forks: 21

gtreshchev/RuntimeSpeechRecognizer 📦
Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on Whisper OpenAI technology, whisper.cpp.
Language: C++ - Size: 24.8 MB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 292 - Forks: 45

b4rtaz/voice-assistant
Voice assistant for Visual Studio Code.
Language: TypeScript - Size: 723 KB - Last synced at: 30 days ago - Pushed at: almost 4 years ago - Stars: 292 - Forks: 11

theblackcat102/edgedict
Working online speech recognition based on RNN Transducer. (Trained model available in releases.)
Language: Python - Size: 5.53 MB - Last synced at: 2 months ago - Pushed at: almost 4 years ago - Stars: 291 - Forks: 44

Kabanosk/whisper-website
A simple web application that converts audio to subtitles using OpenAI's Whisper model
Language: Python - Size: 48.8 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 288 - Forks: 71

nanihadesuka/NovelDokusha
Android web novel reader
Language: Kotlin - Size: 10.4 MB - Last synced at: 23 days ago - Pushed at: about 1 month ago - Stars: 284 - Forks: 19

Olney1/ChatGPT-OpenAI-Smart-Speaker
This AI Smart Speaker uses speech recognition, TTS (text-to-speech), and STT (speech-to-text) to enable voice and vision-driven conversations, with additional web search capabilities via OpenAI and Langchain agents.
Language: Python - Size: 145 MB - Last synced at: 4 days ago - Pushed at: 6 months ago - Stars: 281 - Forks: 31

Kaljurand/K6nele
An Android app that offers speech-to-text user interfaces to other apps
Language: Java - Size: 24.5 MB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 280 - Forks: 83

Thiagohgl/ai-pronunciation-trainer
This tool uses AI to evaluate your pronunciation.
Language: Python - Size: 2.04 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 276 - Forks: 77

haoheliu/voicefixer_main
General Speech Restoration
Language: Python - Size: 21.5 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 276 - Forks: 56

cyberofficial/Synthalingua
Synthalingua - Real Time Translation
Language: Python - Size: 1.41 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 274 - Forks: 19

NaomiProject/Naomi
The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!
Language: Python - Size: 5.27 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 274 - Forks: 60

SamirPaulb/real-time-voice-translator
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
Language: Tcl - Size: 248 MB - Last synced at: about 3 hours ago - Pushed at: over 1 year ago - Stars: 274 - Forks: 72

jim60105/docker-whisperX
Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker Diarization (Dockerfile, CI image build and test)
Language: Dockerfile - Size: 368 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 273 - Forks: 37

mmpneo/curses
Speech to Text and KB input captions for OBS, VRChat, Twitch chat and Discord
Language: TypeScript - Size: 2.32 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 261 - Forks: 23

algolia/voice-overlay-android
🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI
Language: Kotlin - Size: 38.5 MB - Last synced at: 18 days ago - Pushed at: about 3 years ago - Stars: 258 - Forks: 36

asticode/go-astibob
Golang framework to build an AI that can understand and speak back to you, and everything else you want
Language: Go - Size: 1.54 MB - Last synced at: 6 months ago - Pushed at: over 5 years ago - Stars: 244 - Forks: 20

robmsmt/KerasDeepSpeech
A Keras CTC implementation of Baidu's DeepSpeech for model experimentation
Language: Python - Size: 150 MB - Last synced at: 3 days ago - Pushed at: about 7 years ago - Stars: 242 - Forks: 76

pythonlessons/mltu
Machine Learning Training Utilities (for TensorFlow and PyTorch)
Language: Python - Size: 1.98 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 235 - Forks: 134

MikeyParton/react-speech-kit
React hooks for Speech Recognition and Speech Synthesis
Language: JavaScript - Size: 983 KB - Last synced at: 10 months ago - Pushed at: almost 2 years ago - Stars: 232 - Forks: 63

nikdanilov/whisper-obsidian-plugin
Speech-to-text in Obsidian using OpenAI Whisper
Language: TypeScript - Size: 236 KB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 228 - Forks: 32

shenasa-ai/speech2text
A Deep-Learning-Based Persian Speech Recognition System
Language: Jupyter Notebook - Size: 21.9 MB - Last synced at: about 6 hours ago - Pushed at: almost 2 years ago - Stars: 224 - Forks: 30

rolczynski/Automatic-Speech-Recognition 📦
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
Language: Python - Size: 3.6 MB - Last synced at: 24 days ago - Pushed at: almost 5 years ago - Stars: 224 - Forks: 63

tomchang25/whisper-auto-transcribe
Auto transcribe tool based on whisper
Language: Python - Size: 169 MB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 220 - Forks: 15

rakeshvar/rnn_ctc
Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.
Language: Python - Size: 604 KB - Last synced at: 6 months ago - Pushed at: almost 9 years ago - Stars: 220 - Forks: 80

Kaljurand/dictate.js
A small JavaScript library for browser-based real-time speech recognition, which uses Recorderjs for audio capture, and a WebSocket connection to the Kaldi GStreamer server for speech recognition.
Language: JavaScript - Size: 228 KB - Last synced at: 6 days ago - Pushed at: about 5 years ago - Stars: 217 - Forks: 62

Picovoice/web-voice-processor
A library for real-time voice processing in web browsers
Language: TypeScript - Size: 2.59 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 215 - Forks: 22

MainRo/deepspeech-server
A testing server for a speech to text service based on coqui.ai
Language: Python - Size: 80.1 KB - Last synced at: 3 days ago - Pushed at: almost 3 years ago - Stars: 215 - Forks: 71

pszemraj/vid2cleantxt
Python API & command-line tool to easily transcribe speech-based video files into clean text
Language: Jupyter Notebook - Size: 723 MB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 212 - Forks: 29

alphacep/awesome-russian-speech
Russian speech technology links
Size: 120 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 209 - Forks: 14

robmsmt/ASR-Audio-Data-Links
A list of publicly available audio data that anyone can download for ASR or other speech activities
Language: Shell - Size: 25.4 KB - Last synced at: 3 days ago - Pushed at: almost 4 years ago - Stars: 209 - Forks: 22

JosefAlbers/whisper-turbo-mlx
Blazing fast whisper turbo for ASR (speech-to-text) tasks
Language: Python - Size: 527 KB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 202 - Forks: 9

jmaczan/gdansk-ai
Full stack voice chatbot
Language: TypeScript - Size: 2.98 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 197 - Forks: 21

Kyubyong/expressive_tacotron
Tensorflow Implementation of Expressive Tacotron
Language: Python - Size: 5.8 MB - Last synced at: 19 days ago - Pushed at: over 6 years ago - Stars: 196 - Forks: 34

HenestrosaDev/audiotext
A desktop application that transcribes audio from files, microphone input or YouTube videos with the option to translate the content and create subtitles.
Language: Python - Size: 80.5 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 194 - Forks: 18

bricewalker/Hey-Jetson
Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.
Language: Jupyter Notebook - Size: 2.88 GB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 192 - Forks: 40

ssheng/BentoChain
A voice-enabled chatbot application built using 🦜️🔗 LangChain, text-to-speech, and speech-to-text models from 🤗 Hugging Face, and 🍱 BentoML.
Language: Python - Size: 4.63 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 192 - Forks: 24

deepgram/deepgram-js-sdk
Official JavaScript SDK for Deepgram.
Language: TypeScript - Size: 22.4 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 189 - Forks: 68

jofizcd/Soul-of-Waifu
Discover the world of artificial intelligence and interact with your favorite characters without needing to learn tons of information. Bring your Waifu to life with Soul of Waifu!
Language: Python - Size: 26.8 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 188 - Forks: 15

balavenkatesh3322/audio-pretrained-model
A collection of Audio and Speech pre-trained models.
Size: 134 KB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 188 - Forks: 26

JigsawStack/insanely-fast-whisper-api
An API to transcribe audio with OpenAI's Whisper Large v3!
Language: Python - Size: 250 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 185 - Forks: 27

felixbade/transcribe
Web UI for OpenAI Whisper API
Language: HTML - Size: 29.3 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 184 - Forks: 28

Kyubyong/speaker_adapted_tts
Making a TTS model with 1 minute of speech samples within 10 minutes
Size: 5.86 KB - Last synced at: 3 months ago - Pushed at: about 7 years ago - Stars: 184 - Forks: 17

deepgram-starters/nextjs-live-transcription
Get started using Deepgram's Live Transcription with this Next.js demo app
Language: TypeScript - Size: 265 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 182 - Forks: 218

alexruperez/SpeechRecognizerButton
UIButton subclass with push to talk recording, speech recognition and Siri-style waveform view.
Language: Swift - Size: 580 KB - Last synced at: 3 days ago - Pushed at: over 5 years ago - Stars: 182 - Forks: 35

maxent-ai/converse
Conversational text Analysis using various NLP techniques
Language: Jupyter Notebook - Size: 154 KB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 181 - Forks: 19

MacKey-255/GoodByeCatpcha Fork of mikeyy/nonoCAPTCHA
An asynchronous Python library to automate solving ReCAPTCHA v2 using audio and image recognition
Language: Python - Size: 117 MB - Last synced at: 28 days ago - Pushed at: almost 2 years ago - Stars: 180 - Forks: 56

asticode/go-astideepspeech
Golang bindings for Mozilla's DeepSpeech speech-to-text library
Language: Go - Size: 35.2 KB - Last synced at: 11 months ago - Pushed at: over 3 years ago - Stars: 175 - Forks: 23

misyaguziya/VRCT
VRCT(VRChat Chatbox Translator & Transcription)
Language: JavaScript - Size: 58.8 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 174 - Forks: 18

sonngdev/chatgpt-voice
Have a conversation with ChatGPT. Casually 🔈 🤖 ⚡️
Language: TypeScript - Size: 872 KB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 173 - Forks: 41

lucoiso/UEAzSpeech
This plugin integrates Azure Speech Cognitive Services in Unreal Engine.
Language: C++ - Size: 161 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 172 - Forks: 39

smoke-trees/Voice-synthesis
This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time. SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
Language: Python - Size: 3.12 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 170 - Forks: 46

smeetrs/deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
Language: Python - Size: 42 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 169 - Forks: 39

sovaai/sova-asr
SOVA ASR (Automatic Speech Recognition)
Language: Python - Size: 2.32 MB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 169 - Forks: 21

djmango/obsidian-transcription
Obsidian plugin to create high-quality transcriptions from markdown linked audio files
Language: TypeScript - Size: 13.5 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 167 - Forks: 20

Tadashi-Hikari/Sapphire
A free and open source replacement for Google Assistant on Android devices, meant to integrate with the Sapphire Framework. It contains both speech-to-text and text-to-speech services. It does not require Google services or network connectivity
Language: Kotlin - Size: 104 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 165 - Forks: 5

locaal-ai/obs-cleanstream
CleanStream is an OBS plugin that uses AI to remove unwanted words and utterances from live audio streams
Language: C++ - Size: 68.4 MB - Last synced at: about 22 hours ago - Pushed at: 5 months ago - Stars: 163 - Forks: 13

Kyubyong/tacotron_asr
Speech Recognition Using Tacotron
Language: Python - Size: 4.65 MB - Last synced at: 19 days ago - Pushed at: over 7 years ago - Stars: 163 - Forks: 39

lihanghang/CASR-DEMO
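A Flask-based web demo system for Chinese automatic speech recognition, including speech recognition, speech synthesis, and voiceprint-based speaker recognition.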
Language: CSS - Size: 97.2 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 161 - Forks: 28

AppDevGuy/OSSSpeechKit
OSSSpeechKit offers a native iOS Speech wrapper for AVFoundation and Apple's Speech.
Language: Swift - Size: 1.38 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 160 - Forks: 41

lee-b/kobold_assistant
Like ChatGPT's voice conversations with an AI, but entirely offline/private/trade-secret-friendly, using local AI models such as LLama 2 and Whisper
Language: Python - Size: 859 KB - Last synced at: 2 days ago - Pushed at: 9 months ago - Stars: 158 - Forks: 14

chenmingxiang110/Chinese-automatic-speech-recognition
Chinese speech recognition
Language: Jupyter Notebook - Size: 1.58 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 158 - Forks: 62

sekwiatkowski/awesome-ai-services
An overview of the AI-as-a-service landscape
Language: Java - Size: 415 KB - Last synced at: 1 day ago - Pushed at: almost 7 years ago - Stars: 157 - Forks: 22

Pikurrot/whisper-gui
A simple GUI to use Whisper.
Language: Python - Size: 9.85 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 155 - Forks: 13
