Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: automatic-speech-recognition
chimechallenge/chime-utils
Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.
Language: Python - Size: 2.51 MB - Last synced: about 17 hours ago - Pushed: about 20 hours ago - Stars: 13 - Forks: 2
EricApgar/live-speech-to-text
Live speech to text transcription.
Language: Python - Size: 214 KB - Last synced: 1 day ago - Pushed: 2 days ago - Stars: 0 - Forks: 0
EmulationAI/awesome-large-audio-models
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
Size: 6.54 MB - Last synced: 1 day ago - Pushed: 2 days ago - Stars: 397 - Forks: 26
winstxnhdw/CapGen
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.
Language: Python - Size: 546 KB - Last synced: about 21 hours ago - Pushed: 1 day ago - Stars: 1 - Forks: 0
zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
Size: 197 KB - Last synced: 1 day ago - Pushed: 7 months ago - Stars: 2,881 - Forks: 506
leduckhai/MultiMed
Multilingual Multitask Multipurpose Medical Speech Recognition
Language: Python - Size: 6.26 MB - Last synced: 1 day ago - Pushed: 2 days ago - Stars: 8 - Forks: 7
TensorSpeech/TensorFlowASR
:zap: TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2. Supports languages that can use characters or subwords
Language: Python - Size: 89.3 MB - Last synced: about 19 hours ago - Pushed: 2 days ago - Stars: 903 - Forks: 244
bricewalker/Hey-Jetson
Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.
Language: Jupyter Notebook - Size: 2.88 GB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 192 - Forks: 40
Srijith-rkr/KAUST-Whisper-Adapter
INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters!
Language: Python - Size: 5.26 MB - Last synced: 3 days ago - Pushed: 8 months ago - Stars: 28 - Forks: 2
shirayu/whispering 📦
Streaming transcriber with whisper
Language: Python - Size: 288 KB - Last synced: 3 days ago - Pushed: about 1 year ago - Stars: 679 - Forks: 57
NavodPeiris/speechlib
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
Language: Python - Size: 31.3 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 90 - Forks: 5
MooersLab/bash-whisper-transcription
Bash function to ease the transcription of audio files with OpenAI's whisper.
Language: Python - Size: 70.3 KB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 1 - Forks: 1
jitsi/jiwer
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
Language: Python - Size: 762 KB - Last synced: 3 days ago - Pushed: 10 days ago - Stars: 543 - Forks: 89
matiuste/DistriBlock
[UAI 2024] DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution.
Size: 275 KB - Last synced: 6 days ago - Pushed: 7 days ago - Stars: 0 - Forks: 0
QubitPi/cmusphinx.github.io Fork of cmusphinx/cmusphinx.github.io
CMUSphinx Website
Language: HTML - Size: 17.7 MB - Last synced: 7 days ago - Pushed: 8 days ago - Stars: 0 - Forks: 0
George0828Zhang/torch_cif
A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/abs/1905.11235.
Language: Python - Size: 167 KB - Last synced: 6 days ago - Pushed: 3 months ago - Stars: 29 - Forks: 3
kakaobrain/pororo 📦
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
Language: Python - Size: 12.8 MB - Last synced: 5 days ago - Pushed: about 2 years ago - Stars: 1,257 - Forks: 224
YoavRamon/awesome-kaldi
This is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )
Size: 18.6 KB - Last synced: 3 days ago - Pushed: over 2 years ago - Stars: 531 - Forks: 85
th-schmidt/whisply
Transcribe, diarize, annotate and subtitle audio and video with Whisper ... fast!
Language: Python - Size: 116 KB - Last synced: 8 days ago - Pushed: 9 days ago - Stars: 2 - Forks: 1
LD239/WebTranscript
Interactive web tool for automatically transcribing and subtitling videos from URL or file uploads in your chosen language. The transcript appears alongside the video player, complete with embedded subtitles.
Language: JavaScript - Size: 3.72 MB - Last synced: 8 days ago - Pushed: 9 days ago - Stars: 1 - Forks: 0
JarbasAl/pocketsphinx-models-mirror
pocketsphinx models for languages originating from the iberian peninsula
Size: 337 MB - Last synced: 7 days ago - Pushed: over 3 years ago - Stars: 8 - Forks: 4
sungnyun/ARMHuBERT
(Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT
Language: Python - Size: 4.51 MB - Last synced: 3 days ago - Pushed: 21 days ago - Stars: 31 - Forks: 4
archiki/ASR-Accent-Analysis
Analyzing and investigating the confounding effect of accents in end-to-end Automatic Speech Recognition models.
Language: Jupyter Notebook - Size: 9.84 MB - Last synced: 13 days ago - Pushed: almost 4 years ago - Stars: 14 - Forks: 5
coqui-ai/STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
Language: C++ - Size: 53.4 MB - Last synced: 14 days ago - Pushed: 2 months ago - Stars: 2,144 - Forks: 258
Picovoice/leopard
On-device speech-to-text engine powered by deep learning
Language: Python - Size: 418 MB - Last synced: 14 days ago - Pushed: 14 days ago - Stars: 408 - Forks: 23
prateekralhan/Automatic-End-to-End-Speech-Recognition-using-pytorch
ASR using Pytorch and huggingface transformers
Language: Python - Size: 886 KB - Last synced: 15 days ago - Pushed: almost 2 years ago - Stars: 3 - Forks: 1
ahmetoner/whisper-asr-webservice
OpenAI Whisper ASR Webservice API
Language: Python - Size: 1.28 MB - Last synced: 15 days ago - Pushed: 20 days ago - Stars: 1,652 - Forks: 301
noco-ai/spellbook-docker
AI stack for interacting with LLMs, Stable Diffusion, Whisper, xTTS and many other AI models
Language: Shell - Size: 2.39 MB - Last synced: 14 days ago - Pushed: 15 days ago - Stars: 105 - Forks: 5
jonatasgrosman/huggingsound
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
Language: Python - Size: 598 KB - Last synced: 15 days ago - Pushed: 8 months ago - Stars: 415 - Forks: 42
DevTae/SpeechFeedback
Korean pronunciation correction system based on Docker, speech recognition AI, and FastAPI
Language: HTML - Size: 5.68 MB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 7 - Forks: 1
Picovoice/cheetah
On-device streaming speech-to-text engine powered by deep learning
Language: Python - Size: 79.7 MB - Last synced: 14 days ago - Pushed: 14 days ago - Stars: 555 - Forks: 66
PyThaiNLP/pythaiasr
Python Thai Automatic Speech Recognition
Language: Python - Size: 178 KB - Last synced: 14 days ago - Pushed: about 1 year ago - Stars: 51 - Forks: 13
DeepTranscript/deeptranscript-demo
API integration examples
Language: JavaScript - Size: 2.03 MB - Last synced: 19 days ago - Pushed: over 1 year ago - Stars: 2 - Forks: 0
hirofumi0810/tensorflow_end2end_speech_recognition
End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)
Language: Python - Size: 4.17 MB - Last synced: 18 days ago - Pushed: over 6 years ago - Stars: 312 - Forks: 123
ieasybooks/tafrigh
Transcribe visual or audio materials into text
Language: Python - Size: 133 KB - Last synced: 14 days ago - Pushed: 4 months ago - Stars: 83 - Forks: 9
fabio-sim/Fast-SeamlessM4T-ONNX 📦
ONNX-compatible Fast SeamlessM4T: Massively Multilingual & Multimodal Machine Translation
Language: Python - Size: 371 KB - Last synced: 19 days ago - Pushed: 9 months ago - Stars: 37 - Forks: 0
PeterGilles/Speech-Recognition-Lecture---Data-Science-in-Humanities
Material for my lecture on Automatic Speech Recognition
Language: Jupyter Notebook - Size: 26.4 MB - Last synced: 21 days ago - Pushed: 22 days ago - Stars: 0 - Forks: 0
lexust1/av2txtsum
Automatic speech recognition (ASR)
Language: HTML - Size: 694 KB - Last synced: 23 days ago - Pushed: 24 days ago - Stars: 0 - Forks: 0
ogunlao/asr_stat_significance
Performs statistical significance test between two ASR models using bootstrap or blockwise bootstrap sampling.
Language: Python - Size: 22.5 KB - Last synced: 3 days ago - Pushed: 7 months ago - Stars: 3 - Forks: 0
at16k/at16k
Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.
Language: Python - Size: 268 KB - Last synced: 26 days ago - Pushed: about 3 years ago - Stars: 130 - Forks: 19
Nexdata-AI/1796-Hours-German-Speech-Data-by-Mobile-Phone
German Speech Dataset
Size: 457 KB - Last synced: 28 days ago - Pushed: 28 days ago - Stars: 1 - Forks: 0
Nexdata-AI/201-Hours-North-American-English-Speech-Data-by-Mobile-Phone-and-PC
North American English Speech Dataset
Size: 3.91 KB - Last synced: 28 days ago - Pushed: 28 days ago - Stars: 1 - Forks: 0
Nexdata-AI/20.1-Hours-Chinese-Mandarin-Synthesis-Corpus-Male-Customer-Service
Chinese Mandarin Synthesis Corpus-Male
Size: 3.91 KB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 1 - Forks: 0
Nexdata-AI/26.1-Hours-Chinese-Mandarin-Synthesis-Corpus-Female-Customer-Service
Chinese Mandarin Synthesis Corpus-Customer Service
Size: 3.91 KB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 1 - Forks: 0
archiki/Robust-E2E-ASR
This repository contains the code for our upcoming paper An Investigation of End-to-End Models for Robust Speech Recognition at ICASSP 2021.
Language: Python - Size: 136 KB - Last synced: 13 days ago - Pushed: about 3 years ago - Stars: 44 - Forks: 10
undertheseanlp/automatic_speech_recognition
Vietnamese Automatic Speech Recognition
Language: Python - Size: 131 MB - Last synced: 30 days ago - Pushed: over 5 years ago - Stars: 61 - Forks: 37
mathusanm6/Amaze-Voice-Lab
The goal of this research project is to be able to control the movements of characters in a Maze game using real-time voice commands such as saying out loud Up, Down, Left or Right.
Language: Java - Size: 65.8 MB - Last synced: about 1 month ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
abus-aikorea/studio-free
youtube download, vocal remover, vocal extraction, karaoke video production, STT, automatic speech recognition, transcription, automatic subtitle, AI, yt-dlp, demucs, whisper, webui, gradio, windows
Language: Python - Size: 8.98 MB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 8 - Forks: 0
sovaai/sova-asr
SOVA ASR (Automatic Speech Recognition)
Language: Python - Size: 2.32 MB - Last synced: 19 days ago - Pushed: about 1 year ago - Stars: 167 - Forks: 19
lucasnewman/best-rq-pytorch
Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch.
Language: Python - Size: 365 KB - Last synced: 13 days ago - Pushed: 8 months ago - Stars: 63 - Forks: 6
CoEDL/elpis
Software for creating speech recognition models.
Language: Python - Size: 82.5 MB - Last synced: 28 days ago - Pushed: 8 months ago - Stars: 150 - Forks: 30
ArthurFDLR/whisper-youtube
Youtube Videos Transcription with OpenAI's Whisper
Language: Jupyter Notebook - Size: 124 KB - Last synced: about 1 month ago - Pushed: 4 months ago - Stars: 312 - Forks: 101
snakers4/open_stt 📦
Open STT
Language: Python - Size: 87.9 KB - Last synced: about 1 month ago - Pushed: about 2 years ago - Stars: 763 - Forks: 80
ECNU-Cross-Innovation-Lab/ENT
[ICASSP 2024] Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition
Language: Python - Size: 638 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 7 - Forks: 0
jmaczan/asr-dysarthria
Research on Automatic Speech Recognition for dysarthric speech
Language: Jupyter Notebook - Size: 725 KB - Last synced: 23 days ago - Pushed: about 1 month ago - Stars: 2 - Forks: 0
inferless/Distil-whisper-large-v2
Distil-Whisper is a distilled version of the Whisper model that is 6 times faster, 49% smaller, and performs within 1% WER on out-of-distribution evaluation sets. This is the repository for distil-large-v2, a distilled variant of Whisper large-v2.
Language: Python - Size: 10.7 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0
kmario23/KenLM-training
Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2
Size: 5.86 KB - Last synced: 24 days ago - Pushed: almost 5 years ago - Stars: 110 - Forks: 21
koudounasalkis/Divergences-in-Apollo-Missions
This repo contains the code for "Houston we have a Divergence: A Subgroup Performance Analysis of ASR Models"
Language: Jupyter Notebook - Size: 602 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0
analyticsinmotion/werpy
Rapidly calculate and analyze the Word Error Rate (WER) with this powerful yet lightweight Python package.
Language: Python - Size: 415 KB - Last synced: 24 days ago - Pushed: 24 days ago - Stars: 9 - Forks: 2
MLLP-Research-Group/Europarl-ASR
A 1300-hour English speech and text corpus of parliamentary debates for streaming ASR training and benchmarking, speech data filtering and speech data verbatimization.
Size: 41 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 2 - Forks: 0
roboticslab-uc3m/speech
Text To Speech (TTS) and Automatic Speech Recognition (ASR).
Language: Python - Size: 71.8 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 3 - Forks: 4
Darveivoldavara/whisper_model_evaluator Fork of format37/vosk_model_evaluator
Comparator of WER, MER, and WIL for Whisper, Vosk, and Google transcribers
Size: 1.38 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0
inferless/Distil-whisper-large-v3
Distil-Whisper: distil-large-v3 is the third and final installment of the Distil-Whisper English series. It is the knowledge-distilled version of OpenAI's Whisper large-v3, the latest and most performant Whisper model to date.
Language: Python - Size: 12.7 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0
PatrickTourniaire/ASR-Exam-Revision
ASR course past paper revision work for the University of Edinburgh
Language: TeX - Size: 6.99 MB - Last synced: about 2 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0
zhaoyu611/Automatic_Speech_Recognition_with_Multi_Models
A simple Automatic Speech Recognition (ASR) model in Tensorflow that lets you focus only on the deep neural network. It's easy to test popular cells (mostly LSTM and its variants) and models (unidirectional RNN, bidirectional RNN, ResNet and so on). Moreover, you are welcome to play with self-defined cells or models.
Language: Python - Size: 981 MB - Last synced: about 2 months ago - Pushed: over 6 years ago - Stars: 18 - Forks: 8
biodatlab/thonburian-whisper
Thonburian Whisper: Open models for fine-tuned Whisper in Thai. Try our demo on Huggingface space:
Language: Jupyter Notebook - Size: 333 KB - Last synced: about 2 months ago - Pushed: 3 months ago - Stars: 63 - Forks: 4
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Language: Python - Size: 22.7 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 3,606 - Forks: 993
zzw922cn/Automatic_Speech_Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in Tensorflow
Language: Python - Size: 5.53 MB - Last synced: 2 months ago - Pushed: about 1 year ago - Stars: 2,835 - Forks: 539
Anwarvic/RasaChatbot-with-ASR-and-TTS
This repository contains an attempt to incorporate Rasa Chatbot with state-of-the-art ASR (Automatic Speech Recognition) and TTS (Text-to-Speech) models directly without the need of running additional servers or socket connections.
Language: JavaScript - Size: 6.45 MB - Last synced: 15 days ago - Pushed: over 4 years ago - Stars: 20 - Forks: 8
goodmike31/pl-asr-speech-data-survey
Survey of available speech datasets for Polish ASR development
Language: Python - Size: 263 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 4 - Forks: 0
zahraDehghanian97/ASR_Journey
In this repo, we will use and test different models to automatically transcribe Persian speech to text. This task is also called Automatic Speech Recognition (ASR) for the Persian language.
Language: Jupyter Notebook - Size: 1.86 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 1 - Forks: 1
Darveivoldavara/whisper-timestamped
Timestamped ASR microservice
Language: Jupyter Notebook - Size: 3.6 MB - Last synced: 25 days ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0
sinaahmadi/CORDI
Language and Speech Technology for Central Kurdish Varieties (LREC-COLING 2024)
Language: Python - Size: 25.9 MB - Last synced: 15 days ago - Pushed: about 2 months ago - Stars: 8 - Forks: 1
mydroidandi/commbase-stt-whisper-reactive-p
A reactive version of STT engine with remote for Commbase
Language: Python - Size: 370 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 2 - Forks: 0
Anwarvic/Arabic-Speech-Recognition
This repository contains my attempt to use two famous speech recognition frameworks (Kaldi, CMU Sphinx4) for Arabic Language using the publicly-available dataset "Arabic Corpus of Isolated Words"
Language: Shell - Size: 3.24 MB - Last synced: 15 days ago - Pushed: over 4 years ago - Stars: 27 - Forks: 10
double22a/speech_dataset
The dataset of Speech Recognition
Size: 62.5 KB - Last synced: 2 months ago - Pushed: about 1 year ago - Stars: 333 - Forks: 66
mydroidandi/commbase-stt-whisper-proactive-p
A proactive version of STT engine for Commbase
Language: Python - Size: 373 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 1 - Forks: 0
ksquarekumar/whisper-stream
Whisper Transcription Service
Language: Jupyter Notebook - Size: 6.21 MB - Last synced: 2 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0
bbc/bbc-speech-segmenter
A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.
Language: Shell - Size: 62.6 MB - Last synced: about 1 month ago - Pushed: almost 2 years ago - Stars: 22 - Forks: 2
ksm26/Serverless-LLM-apps-with-Amazon-Bedrock
The course equips you with the skills to deploy Large Language Model (LLM)-based applications into production using serverless technology with Amazon Bedrock.
Language: Jupyter Notebook - Size: 1.66 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 2
sooftware/jasper
PyTorch implementation of "Jasper: An End-to-End Convolutional Neural Acoustic Model" (INTERSPEECH 2019)
Language: Python - Size: 38.1 KB - Last synced: 15 days ago - Pushed: about 3 years ago - Stars: 30 - Forks: 2
hirofumi0810/neural_sp
End-to-end ASR/LM implementation with PyTorch
Language: Python - Size: 8.66 MB - Last synced: 2 months ago - Pushed: over 2 years ago - Stars: 582 - Forks: 138
prateekralhan/OpenAI_Whisper_ASR
A minimalistic automatic speech recognition streamlit based webapp powered by OpenAI's Whisper "State of the Art" models
Language: Python - Size: 10.7 MB - Last synced: 15 days ago - Pushed: over 1 year ago - Stars: 60 - Forks: 15
csikasote/BembaSpeech
This is an ASR corpus for the Bemba language. It contains read speech from diverse publicly available Bemba sources: literature books, radio/TV show transcripts, YouTube video transcripts, and online sources. The corpus has 14,438 utterances, amounting to over 24 hours of speech.
Size: 2.41 GB - Last synced: 3 months ago - Pushed: about 1 year ago - Stars: 27 - Forks: 2
egorsmkv/asr-corpus-creator 📦
This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.
Language: Python - Size: 2.47 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 27 - Forks: 3
smeetrs/deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
Language: Python - Size: 42 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 169 - Forks: 39
googlecreativelab/obvi
A Polymer 3+ webcomponent / button for doing speech recognition
Language: JavaScript - Size: 6.6 MB - Last synced: 17 days ago - Pushed: 4 months ago - Stars: 57 - Forks: 16
OVOSHatchery/ovos-stt-plugin-pocketsphinx
pocketsphinx STT plugin for mycroft
Language: Python - Size: 25.4 KB - Last synced: 3 days ago - Pushed: 3 months ago - Stars: 1 - Forks: 1
krylm/whisper-event-tuning
Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model.
Language: Python - Size: 9.77 KB - Last synced: 18 days ago - Pushed: over 1 year ago - Stars: 12 - Forks: 2
j3soon/whisper-to-input
An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text. Supports English, Chinese, Japanese, etc., and even mixed languages.
Language: Kotlin - Size: 3.05 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 4 - Forks: 0
victor369basu/End2EndAutomaticSpeechRecognition
In this repository, I have developed an end to end Automatic speech recognition project. I have developed the neural network model for automatic speech recognition with PyTorch and used MLflow to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
Language: Python - Size: 4.13 MB - Last synced: 15 days ago - Pushed: almost 3 years ago - Stars: 28 - Forks: 11
gopiashokan/Voice-AI-Automatic-Speech-Recognition
Developed a Marathi speech-to-text application using the Hugging Face whisper ASR models. Trained the model with a custom audio dataset and fine-tuned it for optimized performance. Deployed the model on the Hugging Face Model Hub, achieving a WER of 0.74 for the base model.
Language: Jupyter Notebook - Size: 2.02 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0
linto-ai/linto-agent
LinTO platform services stack deployment tool for Docker Swarm cluster
Language: JavaScript - Size: 1.01 MB - Last synced: about 1 month ago - Pushed: 3 months ago - Stars: 14 - Forks: 1
rolczynski/Automatic-Speech-Recognition 📦
Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
Language: Python - Size: 3.6 MB - Last synced: 29 days ago - Pushed: almost 4 years ago - Stars: 222 - Forks: 66
mydroidandi/commbase-stt-vosk-p 📦
An ASR (Automatic Speech Recognition) engine. It is capable of converting spoken language into written text without requiring an internet connection, making it a reliable and secure solution for any application that needs speech-to-text functionality
Language: Python - Size: 3.85 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 1 - Forks: 0
anondo1969/deep-speech-vis
(Master's Thesis) Alam, Mahbub Ul, From Speech to Image: A Novel Approach to Understand the Hidden Layer Mechanisms of Deep Neural Networks in Automatic Speech Recognition, Masterarbeit, Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart, 2017. (https://www.ims.uni-stuttgart.de/en/research/publications/theses/)
Language: Python - Size: 63.5 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0
matusstas/openai-whisper-microservice
This is an OpenAI Whisper automatic speech recognition microservice
Language: Python - Size: 786 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 13 - Forks: 1
bytewife/automatic-speech-recognition-guide
Information on the ASR process, but the model needs your help!
Language: Jupyter Notebook - Size: 1.42 MB - Last synced: 4 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0
ckaytev/tgisper
Telegram bot with ASR
Language: Python - Size: 71.3 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 18 - Forks: 1
anton-jeran/FAST-RIR
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Language: Python - Size: 4.47 MB - Last synced: 4 months ago - Pushed: 6 months ago - Stars: 128 - Forks: 21