An open API service providing repository metadata for many open source software ecosystems.

Topic: "speech-to-text"

louiskirsch/speechT

Open-source speech-to-text software written in TensorFlow

Language: Python - Size: 524 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 155 - Forks: 36

albirrkarim/react-speech-highlight-demo

React / Vanilla JS Text to Speech with highlighting the words and sentences that are being spoken using audio files, text to speech API, and web speech synthesis API

Language: JavaScript - Size: 129 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 143 - Forks: 10

MycroftAI/ZZZ-RETIRED__openstt 📦

RETIRED - OpenSTT is now retired. If you would like more information on Mycroft AI's open source STT projects, please visit:

Size: 26.4 KB - Last synced at: about 10 hours ago - Pushed at: about 9 years ago - Stars: 142 - Forks: 11

spokestack/spokestack-python 📦

Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.

Language: Python - Size: 6.7 MB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 139 - Forks: 14

coqui-ai/STT-models

Open models for Coqui STT

Size: 315 KB - Last synced at: 5 days ago - Pushed at: about 2 years ago - Stars: 138 - Forks: 43

davidmartinrius/speech-dataset-generator

🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.

Language: Python - Size: 5.01 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 135 - Forks: 14

paulovcmedeiros/pyRobBot

Chat with GPT LLMs over voice, UI & terminal, all with access to the internet. Powered by OpenAI.

Language: Python - Size: 1020 KB - Last synced at: 24 days ago - Pushed at: about 1 year ago - Stars: 134 - Forks: 78

tugstugi/mongolian-speech-recognition

Mongolian speech recognition with PyTorch

Language: Python - Size: 164 KB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 134 - Forks: 52

ChetanXpro/nodejs-whisper

NodeJS Bindings for Whisper - the CPU version of OpenAI's Whisper, as initially crafted in C++ by ggerganov.

Language: TypeScript - Size: 729 KB - Last synced at: 1 day ago - Pushed at: 3 days ago - Stars: 132 - Forks: 35

pannous/angle

⦠ Angle: new speakable syntax for python 💡

Language: Python - Size: 2.07 MB - Last synced at: about 12 hours ago - Pushed at: about 1 year ago - Stars: 131 - Forks: 5

philipperemy/tensorflow-ctc-speech-recognition

Application of Connectionist Temporal Classification (CTC) for Speech Recognition (Tensorflow 1.0 but compatible with 2.0).

Language: Python - Size: 634 KB - Last synced at: 13 days ago - Pushed at: about 4 years ago - Stars: 130 - Forks: 46

at16k/at16k

Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.

Language: Python - Size: 268 KB - Last synced at: 7 days ago - Pushed at: about 4 years ago - Stars: 129 - Forks: 18

khanld/ASR-Wav2vec-Finetune

:zap: Finetune Wav2vec 2.0 For Speech Recognition

Language: Python - Size: 5.1 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 127 - Forks: 28

smalltong02/keras-llm-robot

A web UI project for learning about large language models. This project includes features such as chat, quantization, fine-tuning, prompt engineering templates, and multimodality.

Language: Python - Size: 95.2 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 127 - Forks: 16

shakedzy/companion

Generative-AI-Powered Foreign-Language Private Tutor

Language: Python - Size: 9.86 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 125 - Forks: 28

cvqluu/simple_diarizer

Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code

Language: Python - Size: 1.27 MB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 123 - Forks: 26

jackaduma/LAS_Mandarin_PyTorch

Listen, Attend and Spell (LAS) model with a pretrained Mandarin Chinese ASR model

Language: Python - Size: 448 KB - Last synced at: 18 days ago - Pushed at: about 2 years ago - Stars: 123 - Forks: 17

silversparro/wav2letter.pytorch

A fully convolutional network for speech-to-text, built on PyTorch.

Language: Python - Size: 105 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 123 - Forks: 23

moderato-app/talk 📦

Talk with ChatGPT using your VOICE

Language: Go - Size: 5.58 MB - Last synced at: 22 days ago - Pushed at: 8 months ago - Stars: 122 - Forks: 16

NICEElevateAI/ElevateAIJavaSDK

Java SDK for ElevateAI

Language: Java - Size: 67.4 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 121 - Forks: 0

gustavostz/whisper-clip

WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly to your clipboard. With just a click of a button, you can effortlessly convert spoken words into written text, ready to be pasted wherever you need it. This application harnesses the power of OpenAI’s Whisper for free.

Language: Python - Size: 2.53 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 119 - Forks: 11

rioharper/VocalForge

Your one-stop solution for voice dataset creation

Language: Python - Size: 45.8 MB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 119 - Forks: 20

snakers4/russian_stt_text_normalization 📦

Russian text normalization pipeline for speech-to-text and other applications based on tagging s2s networks

Language: Python - Size: 3.03 MB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 116 - Forks: 15

NICEElevateAI/ElevateAIDotNetSDK

.Net core 6 SDK for ElevateAI

Language: C# - Size: 934 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 115 - Forks: 0

bits-by-brandon/whisper-ui

A GUI interface for Open AI Whisper based on Tauri and Sveltekit

Language: Svelte - Size: 24.9 MB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 113 - Forks: 10

NICEElevateAI/ElevateAIPythonSDK

ElevateAI - Speech-to-text API Python SDK

Language: Python - Size: 43.9 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 111 - Forks: 0

gpustack/vox-box

A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.

Language: Python - Size: 644 KB - Last synced at: about 21 hours ago - Pushed at: about 22 hours ago - Stars: 110 - Forks: 12

by2101/OpenASR

A pytorch based end2end speech recognition system.

Language: Python - Size: 2.21 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 110 - Forks: 23

embium/solverecaptchas

An async Python library to automate solving ReCAPTCHA v2 using Playwright.

Language: Python - Size: 145 MB - Last synced at: 20 days ago - Pushed at: about 3 years ago - Stars: 109 - Forks: 25

themanyone/voice_typing

State-of-the-art offline voice typing everywhere + txt terminals (Linux or WSL session on Windows) with a simple bash script. Usable with X. Does not require X.

Language: Shell - Size: 41 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 104 - Forks: 9

ferrinweb/voice-input-button2

New version of the voice input button, using the new iFLYTEK voice dictation interface (the streaming version). A Vue voice input button component based on the new iFLYTEK voice dictation (streaming version) API.

Language: JavaScript - Size: 3.89 MB - Last synced at: 4 days ago - Pushed at: 16 days ago - Stars: 104 - Forks: 16

mutablelogic/go-whisper

Speech-to-Text in golang

Language: Go - Size: 8.06 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 103 - Forks: 10

mmpneo/simple-obs-stt

Speech-to-text and keyboard input captions for OBS.

Language: TypeScript - Size: 23.6 MB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 103 - Forks: 6

dangvansam/viet-asr

VietASR - Vietnamese Automatic Speech Recognition

Language: Python - Size: 289 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 102 - Forks: 46

weespin/WillFromAfarDownloader 📦

acapellabox pwned.

Language: C# - Size: 4.77 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 102 - Forks: 18

daanzu/deepspeech-websocket-server

Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments

Language: Python - Size: 36.1 KB - Last synced at: 29 days ago - Pushed at: almost 5 years ago - Stars: 102 - Forks: 32

skshadan/TTS-RVC-API

Text to Speech using Coqui TTS + RVC

Language: Python - Size: 189 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 101 - Forks: 21

ancs21/awesome-openai-whisper

A curated list of awesome OpenAI Whisper resources

Size: 48.8 KB - Last synced at: about 14 hours ago - Pushed at: over 1 year ago - Stars: 101 - Forks: 4

aofdev/vue-pwa-speech

A Vue 2 progressive web app that performs synchronous speech recognition (speech-to-text) with Google Cloud Speech

Language: JavaScript - Size: 52.7 KB - Last synced at: 8 days ago - Pushed at: almost 7 years ago - Stars: 99 - Forks: 20

askrella/speech-rest-api

Transcription and TTS Rest API (OpenAI Whisper, Speechbrain)

Language: Python - Size: 45.9 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 98 - Forks: 35

efeslab/LiteASR

LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation

Language: Python - Size: 1.19 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 97 - Forks: 4

alphacep/vosk-asterisk

Speech Recognition in Asterisk with Vosk Server

Language: C - Size: 41 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 97 - Forks: 38

skit-ai/kaldi-serve

Server framework for Kaldi ASR Toolkit

Language: C++ - Size: 18.7 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 97 - Forks: 24

OpenBMB/UltraEval-Audio

An easy-to-use, fast, and easily integrable tool for evaluating audio LLM

Language: Python - Size: 8.03 MB - Last synced at: about 15 hours ago - Pushed at: about 16 hours ago - Stars: 95 - Forks: 3

teamsudocode/dexter

Let your talking do the code

Language: JavaScript - Size: 358 KB - Last synced at: 5 months ago - Pushed at: almost 7 years ago - Stars: 95 - Forks: 19

simalexan/s3-lambda-transcribe-audio-to-text-s3

Transcribe your audio to text with this serverless component

Language: JavaScript - Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 94 - Forks: 21

0ut0flin3/Talk2GPT

GPT-3 client for Windows and Unix with memory management that supports both text and speech in any language. Includes a free text2image

Language: Python - Size: 420 KB - Last synced at: about 15 hours ago - Pushed at: about 2 years ago - Stars: 93 - Forks: 7

pavelzbornik/whisperX-FastAPI

FastAPI service on top of WhisperX

Language: Python - Size: 39.5 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 92 - Forks: 25

botbahlul/crx-live-translate

Chrome/Edge BROWSER EXTENSION that can RECOGNIZE any live audio/video streaming then TRANSLATE it for FREE (using unofficial online Google Translate API) then display it as LIVE CAPTION / LIVE SUBTITLE!

Language: JavaScript - Size: 552 KB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 92 - Forks: 11

supershaneski/openai-whisper-talk

openai-whisper-talk is a sample voice conversation application powered by OpenAI technologies such as Whisper, Completions, Embeddings, and the latest Text-to-Speech. The application is built using Nuxt, a Javascript framework based on Vue.js.

Language: JavaScript - Size: 601 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 92 - Forks: 28

Azure-Samples/Cognitive-Services-Voice-Assistant

Welcome to the Microsoft Voice Assistant samples repository! Here you will find samples to help you get started building a client application for your bot or Custom Command service. You will also be able to easily deploy a working Custom Command based Voice Assistant to your own Azure subscription.

Language: C++ - Size: 76.3 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 92 - Forks: 99

EddyVerbruggen/nativescript-speech-recognition

:speech_balloon: Speech to text, using the awesome engines readily available on the device.

Language: TypeScript - Size: 2.17 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 92 - Forks: 24

the-ethan-hunt/B.E.N.J.I.

B.E.N.J.I.- The Impossible Missions Force's digital assistant

Language: Python - Size: 30.2 MB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 90 - Forks: 94

lambda-tech-club/bragging-detector

A device that detects bragging

Language: JavaScript - Size: 261 KB - Last synced at: 12 months ago - Pushed at: over 4 years ago - Stars: 90 - Forks: 16

fcakyon/pywhisper 📦

openai/whisper + extra features

Language: Python - Size: 2.18 MB - Last synced at: 25 days ago - Pushed at: over 2 years ago - Stars: 89 - Forks: 7

lihanghang/Deep-learning-And-Paper

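[For discussion and learning purposes only] Machine intelligence: related books and classic papers, including experiment code for AutoML, sentiment classification, speech recognition, voiceprint recognition, speech synthesis, and more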

Language: Jupyter Notebook - Size: 564 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 89 - Forks: 40

WindQAQ/listen-attend-and-spell 📦

TensorFlow implementation of "Listen, Attend and Spell" authored by William Chan. This project utilizes the input pipeline and Estimator API of TensorFlow, which makes the training and evaluation truly end-to-end.

Language: Python - Size: 135 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 89 - Forks: 32

kurianbenoy/Indic-Subtitler

Open source subtitling platform 💻 for transcribing and translating videos/audios in Indic languages.

Language: Jupyter Notebook - Size: 36.4 MB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 88 - Forks: 13

fdasilva59/Udacity-Natural-Language-Processing-Nanodegree

Tutorials and my solutions to the Udacity NLP Nanodegree

Language: HTML - Size: 61 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 88 - Forks: 64

smlum/scription

An editor for speech-to-text transcripts such as AWS Transcribe and Mozilla DeepSpeech

Language: JavaScript - Size: 2.74 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 87 - Forks: 27

bensonruan/Chrome-Web-Speech-API

Chrome Web Speech API

Language: JavaScript - Size: 223 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 87 - Forks: 30

smilexizheng/mobile-pc-control-server

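Remote control of a PC from a mobile phone's web browser, with a Node.js backend and web application. A Node.js-based server and mobile web app that lets a phone trigger the computer's keyboard shortcuts and operate its mouse, with a clean interface, powerful features, and convenient operation.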

Language: TypeScript - Size: 32.5 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 86 - Forks: 12

patrickenfuego/Chapterize-Audiobooks

Split a single, monolithic mp3 audiobook file into chapters using Machine Learning and ffmpeg.

Language: Python - Size: 39 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 86 - Forks: 14

shhossain/BanglaSpeech2Text

BanglaSpeech2Text: An open-source offline speech-to-text package for Bangla language. Fine-tuned on the latest whisper speech to text model for optimal performance.

Language: Python - Size: 1.85 MB - Last synced at: 10 days ago - Pushed at: 2 months ago - Stars: 84 - Forks: 12

yum-food/TaSTT

A free self-hosted STT for VRChat

Language: Python - Size: 121 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 84 - Forks: 4

viddotech/videoalchemy

VideoAlchemy is a toolkit expanding video processing capabilities, emphasizing FFmpeg and broader video technology applications.

Language: Go - Size: 90.6 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 83 - Forks: 8

ai-adv-lab/deepspeech.mxnet

A MXNet implementation of Baidu's DeepSpeech architecture

Language: Python - Size: 271 KB - Last synced at: 3 days ago - Pushed at: almost 7 years ago - Stars: 83 - Forks: 33

LearnedVector/Wav2Letter

Speech recognition model based on the FAIR research paper, built using PyTorch.

Language: Python - Size: 186 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 81 - Forks: 24

ryanleary/patter

speech-to-text in pytorch

Language: Python - Size: 262 KB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 80 - Forks: 17

mozhou-tech/kim-voice-assistant

Kim, your personal voice kit for Home Intelligence.

Language: Python - Size: 9.31 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 79 - Forks: 17

jonaskahn/asktube

AskTube - An AI-powered YouTube video summarizer and QA assistant powered by Retrieval Augmented Generation (RAG) 🤖. Run it entirely on your local machine with Ollama, or cloud-based models like Claude, OpenAI, Gemini, Mistral, and more.

Language: Python - Size: 7.62 MB - Last synced at: 2 days ago - Pushed at: 6 months ago - Stars: 78 - Forks: 23

JonathanFly/faster-whisper-livestream-translator

faster-whisper livestream translation, OBS noise reduction, dual language subtitles

Language: Python - Size: 18.6 KB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 78 - Forks: 7

thevasudevgupta/gsoc-wav2vec2

GSoC'2021 | TensorFlow implementation of Wav2Vec2

Language: Jupyter Notebook - Size: 6.67 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 78 - Forks: 29

AlexxIT/FasterWhisper

Faster Whisper for Home Assistant - custom integration with a local Speech-to-Text engine

Language: Python - Size: 4.88 KB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 77 - Forks: 6

IBM/MAX-Speech-to-Text-Converter

Converts spoken words into text form.

Language: Python - Size: 1.71 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 76 - Forks: 32

scripty-bot/scripty

Speech to text bot for Discord

Language: Rust - Size: 2.77 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 75 - Forks: 7

QuantiusBenignus/BlahST

Input text from speech in any Linux window, the lean, fast and accurate way, using whisper.cpp OFFLINE. Speak with local LLMs.

Language: Shell - Size: 1.05 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 75 - Forks: 6

fewieden/MMM-voice

Offline Voice Recognition Module for MagicMirror²

Language: JavaScript - Size: 456 KB - Last synced at: 6 months ago - Pushed at: over 6 years ago - Stars: 74 - Forks: 27

mgonzs13/whisper_ros

Speech-to-Text based on SileroVAD + whisper.cpp (GGML Whisper) for ROS 2

Language: C++ - Size: 1.91 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 73 - Forks: 17

QuantiusBenignus/blurt

Gnome shell extension for accurate OFFLINE speech to text input in Linux using whisper.cpp. Input text from speech anywhere.

Language: JavaScript - Size: 1.26 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 73 - Forks: 7

j3soon/whisper-to-input

An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and input the recognized text; Supports English, Chinese, Japanese, etc. and even mixed languages.

Language: Kotlin - Size: 3.27 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 73 - Forks: 7

innovatorved/whisper-openai-gradio-implementation

Gradio web UI implementation of Whisper, an automatic speech recognition (ASR) system

Language: Python - Size: 134 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 73 - Forks: 14

aofdev/vue-speech-streaming

A Vue 2 streaming speech recognition (speech-to-text) app with Google Cloud Speech

Language: JavaScript - Size: 50.8 KB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 72 - Forks: 19

VidyasagarMSC/WatBot

An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.

Language: Java - Size: 4.82 MB - Last synced at: 22 days ago - Pushed at: over 6 years ago - Stars: 72 - Forks: 53

speechmatics/speechmatics-python

Python library and CLI for Speechmatics

Language: Python - Size: 2.93 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 71 - Forks: 20

keenresearch/keenasr-ios-poc

Proof of concept app that demonstrates use of KeenASR SDK in ObjC. WE ARE HIRING: https://keenresearch.com/careers.html

Language: Objective-C - Size: 199 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 71 - Forks: 36

GoogleCloudPlatform/dataflow-contact-center-speech-analysis

Speech Analysis Framework, a collection of components and code from Google Cloud that you can use to transcribe audio files to create analytics.

Language: Python - Size: 177 KB - Last synced at: 25 days ago - Pushed at: about 1 year ago - Stars: 71 - Forks: 40

nl8590687/ASRT_SDK_WinClient

A Windows client SDK and demo software for the ASRT speech recognition system.

Language: C# - Size: 85.9 KB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 71 - Forks: 28

inevolin/DiscordEarsBot

A speech-to-text framework and bot for Discord. Take control of your Discord server using speech and voice commands. Can also be useful for hearing impaired and deaf people.

Language: JavaScript - Size: 38.6 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 70 - Forks: 350

EtienneAb3d/karaok-AI

Karaoke Player / Editor with automatic clip creation from any song file using vocals and lyrics extraction (Speech-to-Text)

Language: Java - Size: 23.4 MB - Last synced at: 30 days ago - Pushed at: over 1 year ago - Stars: 70 - Forks: 1

nguyennpa412/vietnamese-speech-to-text-wavenet

Vietnamese speech recognition using Wavenet

Language: Python - Size: 52.9 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 69 - Forks: 36

yh1008/speech-to-text

Mixed-lingual speech recognition system; hybrid (GMM+NNet) model; Kaldi + Keras

Language: Jupyter Notebook - Size: 989 MB - Last synced at: 10 days ago - Pushed at: over 7 years ago - Stars: 69 - Forks: 19

compulim/web-speech-cognitive-services

Polyfill Web Speech API with Cognitive Services for both speech-to-text and text-to-speech service.

Language: JavaScript - Size: 58.7 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 68 - Forks: 19

MingLunHan/CIF-PyTorch

[ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition (A PyTorch implementation of Continuous Integrate-and-Fire mechanism).

Language: Python - Size: 106 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 68 - Forks: 6

aniemore/Aniemore

Emotion recognition from audio and text files (Russian language only)

Language: Python - Size: 2.1 MB - Last synced at: 10 days ago - Pushed at: 9 months ago - Stars: 68 - Forks: 8

jackwuwei/gptspeaker

The ChatGPT/DeepSeek Voice Assistant uses a Raspberry Pi (or desktop) to enable spoken conversation with OpenAI or DeepSeek large language models. This implementation listens to speech, processes the conversation through the OpenAI/DeepSeek service, and responds back, similar to Apple Siri, Amazon Alexa, Google Nest Home, Mi XiaoAi, etc.

Language: Python - Size: 10.8 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 67 - Forks: 9

aladinyo/ChatPlus

ChatPlus is a progressive web app developed with React, NodeJS, Firebase and other services

Language: JavaScript - Size: 15.8 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 67 - Forks: 6

alphacep/vosk-unity-asr

Automatic Speech Recognition in Unity using Vosk library

Language: C# - Size: 62.2 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 66 - Forks: 16

morioka/tiny-openai-whisper-api

OpenAI Whisper API-style local server, running on FastAPI

Language: Python - Size: 46.9 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 66 - Forks: 15

HeyHeyChicken/NOVA-NodeJS

NOVA is a customizable voice assistant made with Node.js.

Language: JavaScript - Size: 8.3 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 66 - Forks: 13