Topic: "speech-to-text"
louiskirsch/speechT
Open-source speech-to-text software written in TensorFlow
Language: Python - Size: 524 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 155 - Forks: 36

albirrkarim/react-speech-highlight-demo
React / Vanilla JS Text to Speech with highlighting the words and sentences that are being spoken using audio files, text to speech API, and web speech synthesis API
Language: JavaScript - Size: 129 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 143 - Forks: 10

MycroftAI/ZZZ-RETIRED__openstt 📦
RETIRED - OpenSTT is now retired. If you would like more information on Mycroft AI's open source STT projects, please visit:
Size: 26.4 KB - Last synced at: about 10 hours ago - Pushed at: about 9 years ago - Stars: 142 - Forks: 11

spokestack/spokestack-python 📦
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.
Language: Python - Size: 6.7 MB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 139 - Forks: 14

coqui-ai/STT-models
Open models for Coqui STT
Size: 315 KB - Last synced at: 5 days ago - Pushed at: about 2 years ago - Stars: 138 - Forks: 43

davidmartinrius/speech-dataset-generator
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.
Language: Python - Size: 5.01 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 135 - Forks: 14

paulovcmedeiros/pyRobBot
Chat with GPT LLMs over voice, UI & terminal, all with access to the internet. Powered by OpenAI.
Language: Python - Size: 1020 KB - Last synced at: 24 days ago - Pushed at: about 1 year ago - Stars: 134 - Forks: 78

tugstugi/mongolian-speech-recognition
Mongolian speech recognition with PyTorch
Language: Python - Size: 164 KB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 134 - Forks: 52

ChetanXpro/nodejs-whisper
NodeJS Bindings for Whisper - the CPU version of OpenAI's Whisper, as initially crafted in C++ by ggerganov.
Language: TypeScript - Size: 729 KB - Last synced at: 1 day ago - Pushed at: 3 days ago - Stars: 132 - Forks: 35

pannous/angle
⦠ Angle: new speakable syntax for python 💡
Language: Python - Size: 2.07 MB - Last synced at: about 12 hours ago - Pushed at: about 1 year ago - Stars: 131 - Forks: 5

philipperemy/tensorflow-ctc-speech-recognition
Application of Connectionist Temporal Classification (CTC) for Speech Recognition (Tensorflow 1.0 but compatible with 2.0).
Language: Python - Size: 634 KB - Last synced at: 13 days ago - Pushed at: about 4 years ago - Stars: 130 - Forks: 46

at16k/at16k
Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.
Language: Python - Size: 268 KB - Last synced at: 7 days ago - Pushed at: about 4 years ago - Stars: 129 - Forks: 18

khanld/ASR-Wav2vec-Finetune
:zap: Finetune Wav2vec 2.0 For Speech Recognition
Language: Python - Size: 5.1 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 127 - Forks: 28

smalltong02/keras-llm-robot
A web UI project for learning about large language models. This project includes features such as chat, quantization, fine-tuning, prompt engineering templates, and multimodality.
Language: Python - Size: 95.2 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 127 - Forks: 16

shakedzy/companion
Generative-AI-Powered Foreign-Language Private Tutor
Language: Python - Size: 9.86 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 125 - Forks: 28

cvqluu/simple_diarizer
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
Language: Python - Size: 1.27 MB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 123 - Forks: 26

jackaduma/LAS_Mandarin_PyTorch
Listen, Attend and Spell model and a pretrained Chinese Mandarin ASR model
Language: Python - Size: 448 KB - Last synced at: 18 days ago - Pushed at: about 2 years ago - Stars: 123 - Forks: 17

silversparro/wav2letter.pytorch
A fully convolutional network for speech-to-text, built on PyTorch.
Language: Python - Size: 105 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 123 - Forks: 23

moderato-app/talk 📦
Talk with ChatGPT using your VOICE
Language: Go - Size: 5.58 MB - Last synced at: 22 days ago - Pushed at: 8 months ago - Stars: 122 - Forks: 16

NICEElevateAI/ElevateAIJavaSDK
Java SDK for ElevateAI
Language: Java - Size: 67.4 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 121 - Forks: 0

gustavostz/whisper-clip
WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly to your clipboard. With just a click of a button, you can effortlessly convert spoken words into written text, ready to be pasted wherever you need it. This application harnesses the power of OpenAI’s Whisper for free.
Language: Python - Size: 2.53 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 119 - Forks: 11

rioharper/VocalForge
Your one-stop solution for voice dataset creation
Language: Python - Size: 45.8 MB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 119 - Forks: 20

snakers4/russian_stt_text_normalization 📦
Russian text normalization pipeline for speech-to-text and other applications based on tagging s2s networks
Language: Python - Size: 3.03 MB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 116 - Forks: 15

NICEElevateAI/ElevateAIDotNetSDK
.Net core 6 SDK for ElevateAI
Language: C# - Size: 934 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 115 - Forks: 0

bits-by-brandon/whisper-ui
A GUI for OpenAI Whisper based on Tauri and SvelteKit
Language: Svelte - Size: 24.9 MB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 113 - Forks: 10

NICEElevateAI/ElevateAIPythonSDK
ElevateAI - Speech-to-text API Python SDK
Language: Python - Size: 43.9 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 111 - Forks: 0

gpustack/vox-box
A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.
Language: Python - Size: 644 KB - Last synced at: about 21 hours ago - Pushed at: about 22 hours ago - Stars: 110 - Forks: 12

by2101/OpenASR
A PyTorch-based end-to-end speech recognition system.
Language: Python - Size: 2.21 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 110 - Forks: 23

embium/solverecaptchas
An async Python library to automate solving ReCAPTCHA v2 using Playwright.
Language: Python - Size: 145 MB - Last synced at: 20 days ago - Pushed at: about 3 years ago - Stars: 109 - Forks: 25

themanyone/voice_typing
State-of-the-art offline voice typing everywhere, including text terminals (Linux, or a WSL session on Windows), with a simple bash script. Usable with X. Does not require X.
Language: Shell - Size: 41 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 104 - Forks: 9

ferrinweb/voice-input-button2
New version of the voice input button: a Vue component built on the new iFlytek voice dictation API (the streaming version).
Language: JavaScript - Size: 3.89 MB - Last synced at: 4 days ago - Pushed at: 16 days ago - Stars: 104 - Forks: 16

mutablelogic/go-whisper
Speech-to-Text in golang
Language: Go - Size: 8.06 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 103 - Forks: 10

mmpneo/simple-obs-stt
Speech-to-text and keyboard input captions for OBS.
Language: TypeScript - Size: 23.6 MB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 103 - Forks: 6

dangvansam/viet-asr
VietASR - Vietnamese Automatic Speech Recognition
Language: Python - Size: 289 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 102 - Forks: 46

weespin/WillFromAfarDownloader 📦
acapellabox pwned.
Language: C# - Size: 4.77 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 102 - Forks: 18

daanzu/deepspeech-websocket-server
Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments
Language: Python - Size: 36.1 KB - Last synced at: 29 days ago - Pushed at: almost 5 years ago - Stars: 102 - Forks: 32

skshadan/TTS-RVC-API
Text to Speech using Coqui TTS + RVC
Language: Python - Size: 189 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 101 - Forks: 21

ancs21/awesome-openai-whisper
A curated list of awesome OpenAI Whisper resources
Size: 48.8 KB - Last synced at: about 14 hours ago - Pushed at: over 1 year ago - Stars: 101 - Forks: 4

aofdev/vue-pwa-speech
A Vue 2 progressive web app that performs synchronous speech recognition (speech-to-text) with Google Cloud Speech
Language: JavaScript - Size: 52.7 KB - Last synced at: 8 days ago - Pushed at: almost 7 years ago - Stars: 99 - Forks: 20

askrella/speech-rest-api
Transcription and TTS Rest API (OpenAI Whisper, Speechbrain)
Language: Python - Size: 45.9 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 98 - Forks: 35

efeslab/LiteASR
LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation
Language: Python - Size: 1.19 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 97 - Forks: 4

alphacep/vosk-asterisk
Speech Recognition in Asterisk with Vosk Server
Language: C - Size: 41 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 97 - Forks: 38

skit-ai/kaldi-serve
Server framework for Kaldi ASR Toolkit
Language: C++ - Size: 18.7 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 97 - Forks: 24

OpenBMB/UltraEval-Audio
An easy-to-use, fast, and easily integrable tool for evaluating audio LLM
Language: Python - Size: 8.03 MB - Last synced at: about 15 hours ago - Pushed at: about 16 hours ago - Stars: 95 - Forks: 3

teamsudocode/dexter
Let your talking do the code
Language: JavaScript - Size: 358 KB - Last synced at: 5 months ago - Pushed at: almost 7 years ago - Stars: 95 - Forks: 19

simalexan/s3-lambda-transcribe-audio-to-text-s3
Transcribe your audio to text with this serverless component
Language: JavaScript - Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 94 - Forks: 21

0ut0flin3/Talk2GPT
GPT-3 client for Windows and Unix with memories management that supports both text and speech in any language. Includes a free text2image
Language: Python - Size: 420 KB - Last synced at: about 15 hours ago - Pushed at: about 2 years ago - Stars: 93 - Forks: 7

pavelzbornik/whisperX-FastAPI
FastAPI service on top of WhisperX
Language: Python - Size: 39.5 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 92 - Forks: 25

botbahlul/crx-live-translate
Chrome/Edge BROWSER EXTENSION that can RECOGNIZE any live audio/video streaming then TRANSLATE it for FREE (using unofficial online Google Translate API) then display it as LIVE CAPTION / LIVE SUBTITLE!
Language: JavaScript - Size: 552 KB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 92 - Forks: 11

supershaneski/openai-whisper-talk
openai-whisper-talk is a sample voice conversation application powered by OpenAI technologies such as Whisper, Completions, Embeddings, and the latest Text-to-Speech. The application is built using Nuxt, a Javascript framework based on Vue.js.
Language: JavaScript - Size: 601 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 92 - Forks: 28

Azure-Samples/Cognitive-Services-Voice-Assistant
Welcome to the Microsoft Voice Assistant samples repository! Here you will find samples to help you get started building client applications for your bot or Custom Command service. You will also be able to easily deploy a working Custom Command based Voice Assistant to your own Azure subscription.
Language: C++ - Size: 76.3 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 92 - Forks: 99

EddyVerbruggen/nativescript-speech-recognition
:speech_balloon: Speech to text, using the awesome engines readily available on the device.
Language: TypeScript - Size: 2.17 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 92 - Forks: 24

the-ethan-hunt/B.E.N.J.I.
B.E.N.J.I.- The Impossible Missions Force's digital assistant
Language: Python - Size: 30.2 MB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 90 - Forks: 94

lambda-tech-club/bragging-detector
A device that detects bragging
Language: JavaScript - Size: 261 KB - Last synced at: 12 months ago - Pushed at: over 4 years ago - Stars: 90 - Forks: 16

fcakyon/pywhisper 📦
openai/whisper + extra features
Language: Python - Size: 2.18 MB - Last synced at: 25 days ago - Pushed at: over 2 years ago - Stars: 89 - Forks: 7

lihanghang/Deep-learning-And-Paper
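[For exchange and learning purposes only] Machine intelligence: related books and classic papers, including experiment code for AutoML, sentiment classification, speech recognition, voiceprint recognition, speech synthesis, and more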
Language: Jupyter Notebook - Size: 564 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 89 - Forks: 40

WindQAQ/listen-attend-and-spell 📦
Tensorflow implementation of "Listen, Attend and Spell" authored by William Chan. This project utilizes input pipeline and estimator API of Tensorflow, which makes the training and evaluation truly end-to-end.
Language: Python - Size: 135 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 89 - Forks: 32

kurianbenoy/Indic-Subtitler
Open source subtitling platform 💻 for transcribing and translating videos/audios in Indic languages.
Language: Jupyter Notebook - Size: 36.4 MB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 88 - Forks: 13

fdasilva59/Udacity-Natural-Language-Processing-Nanodegree
Tutorials and my solutions to the Udacity NLP Nanodegree
Language: HTML - Size: 61 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 88 - Forks: 64

smlum/scription
An editor for speech-to-text transcripts such as AWS Transcribe and Mozilla DeepSpeech
Language: JavaScript - Size: 2.74 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 87 - Forks: 27

bensonruan/Chrome-Web-Speech-API
Chrome Web Speech API
Language: JavaScript - Size: 223 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 87 - Forks: 30

smilexizheng/mobile-pc-control-server
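Remote control of a PC from a mobile phone's web browser. A Node.js server and mobile web application that lets a phone trigger keyboard shortcuts and operate the mouse on a computer, with a clean interface, powerful features, and easy operation.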
Language: TypeScript - Size: 32.5 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 86 - Forks: 12

patrickenfuego/Chapterize-Audiobooks
Split a single, monolithic mp3 audiobook file into chapters using Machine Learning and ffmpeg.
Language: Python - Size: 39 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 86 - Forks: 14

shhossain/BanglaSpeech2Text
BanglaSpeech2Text: An open-source offline speech-to-text package for Bangla language. Fine-tuned on the latest whisper speech to text model for optimal performance.
Language: Python - Size: 1.85 MB - Last synced at: 10 days ago - Pushed at: 2 months ago - Stars: 84 - Forks: 12

yum-food/TaSTT
A free self-hosted STT for VRChat
Language: Python - Size: 121 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 84 - Forks: 4

viddotech/videoalchemy
VideoAlchemy is a toolkit expanding video processing capabilities, emphasizing FFmpeg and broader video technology applications.
Language: Go - Size: 90.6 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 83 - Forks: 8

ai-adv-lab/deepspeech.mxnet
A MXNet implementation of Baidu's DeepSpeech architecture
Language: Python - Size: 271 KB - Last synced at: 3 days ago - Pushed at: almost 7 years ago - Stars: 83 - Forks: 33

LearnedVector/Wav2Letter
Speech recognition model based on the FAIR research paper, built using PyTorch.
Language: Python - Size: 186 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 81 - Forks: 24

ryanleary/patter
speech-to-text in pytorch
Language: Python - Size: 262 KB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 80 - Forks: 17

mozhou-tech/kim-voice-assistant
Kim, your personal voice kit for Home Intelligence.
Language: Python - Size: 9.31 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 79 - Forks: 17

jonaskahn/asktube
AskTube - An AI-powered YouTube video summarizer and QA assistant powered by Retrieval Augmented Generation (RAG) 🤖. Run it entirely on your local machine with Ollama, or cloud-based models like Claude, OpenAI, Gemini, Mistral, and more.
Language: Python - Size: 7.62 MB - Last synced at: 2 days ago - Pushed at: 6 months ago - Stars: 78 - Forks: 23

JonathanFly/faster-whisper-livestream-translator
faster-whisper livestream translation, OBS noise reduction, dual language subtitles
Language: Python - Size: 18.6 KB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 78 - Forks: 7

thevasudevgupta/gsoc-wav2vec2
GSoC'2021 | TensorFlow implementation of Wav2Vec2
Language: Jupyter Notebook - Size: 6.67 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 78 - Forks: 29

AlexxIT/FasterWhisper
Faster Whisper for Home Assistant - custom integration with a local Speech-to-Text engine
Language: Python - Size: 4.88 KB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 77 - Forks: 6

IBM/MAX-Speech-to-Text-Converter
Converts spoken words into text form.
Language: Python - Size: 1.71 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 76 - Forks: 32

scripty-bot/scripty
Speech to text bot for Discord
Language: Rust - Size: 2.77 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 75 - Forks: 7

QuantiusBenignus/BlahST
Input text from speech in any Linux window, the lean, fast and accurate way, using whisper.cpp OFFLINE. Speak with local LLMs.
Language: Shell - Size: 1.05 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 75 - Forks: 6

fewieden/MMM-voice
Offline Voice Recognition Module for MagicMirror²
Language: JavaScript - Size: 456 KB - Last synced at: 6 months ago - Pushed at: over 6 years ago - Stars: 74 - Forks: 27

mgonzs13/whisper_ros
Speech-to-Text based on SileroVAD + whisper.cpp (GGML Whisper) for ROS 2
Language: C++ - Size: 1.91 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 73 - Forks: 17

QuantiusBenignus/blurt
Gnome shell extension for accurate OFFLINE speech to text input in Linux using whisper.cpp. Input text from speech anywhere.
Language: JavaScript - Size: 1.26 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 73 - Forks: 7

j3soon/whisper-to-input
An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text. Supports English, Chinese, Japanese, etc., and even mixed languages.
Language: Kotlin - Size: 3.27 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 73 - Forks: 7

innovatorved/whisper-openai-gradio-implementation
Gradio web UI implementation of Whisper, an automatic speech recognition (ASR) system
Language: Python - Size: 134 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 73 - Forks: 14

aofdev/vue-speech-streaming
Vue 2 streaming speech recognition (speech-to-text) with Google Cloud Speech
Language: JavaScript - Size: 50.8 KB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 72 - Forks: 19

VidyasagarMSC/WatBot
An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.
Language: Java - Size: 4.82 MB - Last synced at: 22 days ago - Pushed at: over 6 years ago - Stars: 72 - Forks: 53

speechmatics/speechmatics-python
Python library and CLI for Speechmatics
Language: Python - Size: 2.93 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 71 - Forks: 20

keenresearch/keenasr-ios-poc
Proof of concept app that demonstrates use of KeenASR SDK in ObjC. WE ARE HIRING: https://keenresearch.com/careers.html
Language: Objective-C - Size: 199 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 71 - Forks: 36

GoogleCloudPlatform/dataflow-contact-center-speech-analysis
Speech Analysis Framework, a collection of components and code from Google Cloud that you can use to transcribe audio files to create analytics.
Language: Python - Size: 177 KB - Last synced at: 25 days ago - Pushed at: about 1 year ago - Stars: 71 - Forks: 40

nl8590687/ASRT_SDK_WinClient
A Windows client SDK and demo software for the ASRT speech recognition system.
Language: C# - Size: 85.9 KB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 71 - Forks: 28

inevolin/DiscordEarsBot
A speech-to-text framework and bot for Discord. Take control of your Discord server using speech and voice commands. Can also be useful for hearing impaired and deaf people.
Language: JavaScript - Size: 38.6 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 70 - Forks: 350

EtienneAb3d/karaok-AI
Karaoke Player / Editor with automatic clip creation from any song file using vocals and lyrics extraction (Speech-to-Text)
Language: Java - Size: 23.4 MB - Last synced at: 30 days ago - Pushed at: over 1 year ago - Stars: 70 - Forks: 1

nguyennpa412/vietnamese-speech-to-text-wavenet
Vietnamese speech recognition using Wavenet
Language: Python - Size: 52.9 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 69 - Forks: 36

yh1008/speech-to-text
Mixed-lingual speech recognition system; hybrid (GMM+NNet) model; Kaldi + Keras
Language: Jupyter Notebook - Size: 989 MB - Last synced at: 10 days ago - Pushed at: over 7 years ago - Stars: 69 - Forks: 19

compulim/web-speech-cognitive-services
Polyfill Web Speech API with Cognitive Services for both speech-to-text and text-to-speech service.
Language: JavaScript - Size: 58.7 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 68 - Forks: 19

MingLunHan/CIF-PyTorch
[ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition (A PyTorch implementation of Continuous Integrate-and-Fire mechanism).
Language: Python - Size: 106 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 68 - Forks: 6

aniemore/Aniemore
Emotion recognition from audio and text files (Russian language only)
Language: Python - Size: 2.1 MB - Last synced at: 10 days ago - Pushed at: 9 months ago - Stars: 68 - Forks: 8

jackwuwei/gptspeaker
The ChatGPT/DeepSeek Voice Assistant uses a Raspberry Pi (or desktop) to enable spoken conversation with OpenAI or DeepSeek large language models. This implementation listens to speech, processes the conversation through the OpenAI/DeepSeek service, and responds, much like Apple Siri, Amazon Alexa, Google Nest Home, Mi XiaoAi, etc.
Language: Python - Size: 10.8 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 67 - Forks: 9

aladinyo/ChatPlus
ChatPlus is a progressive web app developed with React, NodeJS, Firebase and other services
Language: JavaScript - Size: 15.8 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 67 - Forks: 6

alphacep/vosk-unity-asr
Automatic Speech Recognition in Unity using Vosk library
Language: C# - Size: 62.2 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 66 - Forks: 16

morioka/tiny-openai-whisper-api
OpenAI Whisper API-style local server, running on FastAPI
Language: Python - Size: 46.9 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 66 - Forks: 15

HeyHeyChicken/NOVA-NodeJS
NOVA is a customizable voice assistant made with Node.js.
Language: JavaScript - Size: 8.3 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 66 - Forks: 13
