Topic: "automatic-speech-recognition"
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Language: Python - Size: 24.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 4,780 - Forks: 1,155

zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
Size: 197 KB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 3,064 - Forks: 514

zzw922cn/Automatic_Speech_Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in Tensorflow
Language: Python - Size: 5.53 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 2,842 - Forks: 533

ahmetoner/whisper-asr-webservice
OpenAI Whisper ASR Webservice API
Language: Python - Size: 1.76 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2,693 - Forks: 480

coqui-ai/STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
Language: C++ - Size: 53.4 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 2,503 - Forks: 298

TEN-framework/ten-vad
Voice Activity Detector (VAD) from TEN: low-latency, high-performance and lightweight
Language: C - Size: 9.6 MB - Last synced at: 12 days ago - Pushed at: 28 days ago - Stars: 1,350 - Forks: 114

kakaobrain/pororo 📦
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
Language: Python - Size: 12.8 MB - Last synced at: 19 days ago - Pushed at: over 3 years ago - Stars: 1,302 - Forks: 223

TensorSpeech/TensorFlowASR
⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2. Supports languages that can use characters or subwords
Language: Python - Size: 90.3 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 985 - Forks: 240

FireRedTeam/FireRedASR
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics recognition capability.
Language: Python - Size: 658 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 873 - Forks: 63

snakers4/open_stt 📦
Open STT
Language: Python - Size: 87.9 KB - Last synced at: 10 days ago - Pushed at: over 3 years ago - Stars: 801 - Forks: 84

jitsi/jiwer
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
Language: Python - Size: 1.68 MB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 781 - Forks: 105

EmulationAI/awesome-large-audio-models
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
Size: 6.56 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 691 - Forks: 43

shirayu/whispering 📦
Streaming transcriber with whisper
Language: Python - Size: 288 KB - Last synced at: 8 months ago - Pushed at: over 2 years ago - Stars: 686 - Forks: 53

Picovoice/cheetah
On-device streaming speech-to-text engine powered by deep learning
Language: Python - Size: 502 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 635 - Forks: 72

FluidInference/FluidAudio
Native Swift and CoreML SDK for local speaker diarization, VAD, and speech-to-text for real-time workloads. Works on iOS and macOS.
Language: Swift - Size: 13.6 MB - Last synced at: about 18 hours ago - Pushed at: about 18 hours ago - Stars: 604 - Forks: 70

hirofumi0810/neural_sp
End-to-end ASR/LM implementation with PyTorch
Language: Python - Size: 8.66 MB - Last synced at: 4 months ago - Pushed at: about 4 years ago - Stars: 596 - Forks: 139

YoavRamon/awesome-kaldi
This is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )
Size: 18.6 KB - Last synced at: 29 days ago - Pushed at: over 3 years ago - Stars: 537 - Forks: 83

jonatasgrosman/huggingsound
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
Language: Python - Size: 598 KB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 462 - Forks: 45

Picovoice/leopard
On-device speech-to-text engine powered by deep learning
Language: Python - Size: 419 MB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 459 - Forks: 29

Z-yq/TensorflowASR
A project dedicated to making CPU and on-device models approach the performance of GPU models; the real-time factor (RTF) on CPU is below 0.1
Language: Python - Size: 266 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 444 - Forks: 107

double22a/speech_dataset
The dataset of Speech Recognition
Size: 74.2 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 413 - Forks: 77

ArthurFDLR/whisper-youtube
🔉 Youtube Videos Transcription with OpenAI's Whisper
Language: Jupyter Notebook - Size: 124 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 396 - Forks: 115

leduckhai/MultiMed
[LREC-COLING 2024 (Oral), Interspeech 2024 (Oral), NAACL 2025, ACL 2025] A Series of Multilingual Multitask Medical Speech Processing
Language: Python - Size: 22.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 350 - Forks: 36

hirofumi0810/tensorflow_end2end_speech_recognition
End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)
Language: Python - Size: 4.17 MB - Last synced at: 9 months ago - Pushed at: over 7 years ago - Stars: 313 - Forks: 120

smeetrs/deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
Language: Python - Size: 45.9 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 232 - Forks: 41

rolczynski/Automatic-Speech-Recognition 📦
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
Language: Python - Size: 3.6 MB - Last synced at: 11 days ago - Pushed at: about 5 years ago - Stars: 225 - Forks: 63

NavodPeiris/speechlib
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
Language: Python - Size: 33.9 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 224 - Forks: 22

m3hrdadfi/soxan
Wav2Vec for speech recognition, classification, and audio classification
Language: Jupyter Notebook - Size: 3.57 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 197 - Forks: 28

bricewalker/Hey-Jetson
Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.
Language: Jupyter Notebook - Size: 2.88 GB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 192 - Forks: 40

sovaai/sova-asr
SOVA ASR (Automatic Speech Recognition)
Language: Python - Size: 2.32 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 172 - Forks: 22

vilassn/whisper_android
Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android
Language: C++ - Size: 187 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 164 - Forks: 22

noco-ai/spellbook-docker
AI stack for interacting with LLMs, Stable Diffusion, Whisper, xTTS and many other AI models
Language: Shell - Size: 2.39 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 163 - Forks: 13

CoEDL/elpis
🙊 software for creating speech recognition models.
Language: Python - Size: 82.5 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 159 - Forks: 33

biodatlab/thonburian-whisper
Thonburian Whisper: Open models for fine-tuned Whisper in Thai. Try our demo on Huggingface space:
Language: Jupyter Notebook - Size: 786 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 144 - Forks: 16

anton-jeran/FAST-RIR
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Language: Python - Size: 4.47 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 143 - Forks: 26

ieasybooks/tafrigh
Transcribe audio to text and generate SRT and VTT files using Whisper models and wit.ai technology.
Language: Python - Size: 631 KB - Last synced at: 32 minutes ago - Pushed at: 5 months ago - Stars: 135 - Forks: 18

tugstugi/mongolian-speech-recognition
Mongolian speech recognition with PyTorch
Language: Python - Size: 164 KB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 134 - Forks: 52

dangvansam/viet-asr
VietASR - Vietnamese Automatic Speech Recognition
Language: Python - Size: 289 MB - Last synced at: 4 months ago - Pushed at: 10 months ago - Stars: 130 - Forks: 54

at16k/at16k
Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.
Language: Python - Size: 268 KB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 129 - Forks: 18

lucasnewman/best-rq-pytorch
Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch.
Language: Python - Size: 365 KB - Last synced at: 18 days ago - Pushed at: almost 2 years ago - Stars: 123 - Forks: 12

kmario23/KenLM-training
Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2
Size: 5.86 KB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 114 - Forks: 21

andi611/ZeroSpeech-TTS-without-T
A Pytorch implementation for the ZeroSpeech 2019 challenge.
Language: Python - Size: 99.2 MB - Last synced at: 5 months ago - Pushed at: almost 6 years ago - Stars: 112 - Forks: 12

BatuhanYilmaz26/Auto-Subtitled-Video-Generator
Input a YouTube video link or upload a video file and get a video with subtitles.
Language: Python - Size: 122 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 87 - Forks: 36

j3soon/whisper-to-input
An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text; supports English, Chinese, Japanese, etc., and even mixed languages.
Language: Kotlin - Size: 3.31 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 86 - Forks: 13

mgonzs13/whisper_ros
Speech-to-Text based on SileroVAD + whisper.cpp (GGML Whisper) for ROS 2
Language: C++ - Size: 1.94 MB - Last synced at: 9 days ago - Pushed at: 2 months ago - Stars: 81 - Forks: 19

LearnedVector/Wav2Letter
Speech recognition model based on a FAIR research paper, built using PyTorch.
Language: Python - Size: 186 KB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 81 - Forks: 24

lkmeta/txtify
Web application that converts audio and video to text using AI, supporting various formats and self-hosting.
Language: Python - Size: 6.77 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 80 - Forks: 7

undertheseanlp/automatic_speech_recognition
Vietnamese Automatic Speech Recognition
Language: Python - Size: 131 MB - Last synced at: 2 months ago - Pushed at: over 6 years ago - Stars: 69 - Forks: 38

MingLunHan/CIF-PyTorch
[ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition (A PyTorch implementation of Continuous Integrate-and-Fire mechanism).
Language: Python - Size: 106 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 68 - Forks: 6

khakers/go-subgen
Automatically generate subtitles for your media using whisper.cpp via webhooks with support for Radarr & Sonarr
Language: Go - Size: 7.61 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 66 - Forks: 1

PyThaiNLP/pythaiasr
Python Thai Automatic Speech Recognition
Language: Python - Size: 178 KB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 66 - Forks: 13

hirofumi0810/asr_preprocessing
Python implementation of pre-processing for End-to-End speech recognition
Language: Python - Size: 1.67 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 66 - Forks: 22

zmeet-ai/asr_demo
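Speech recognition API with real-time recognition and offline recognition of uploaded long audio; supports real-time transcription and simultaneous interpretation for the languages of up to 100 countries, including Chinese and English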
Language: Java - Size: 23.1 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 63 - Forks: 6

01Zhangbw/Speech-and-audio-papers-Top-Conference
Papers in the speech and audio field. Currently covers: ICLR2025-2023, ICML2025-2023, NeurIPS2024-2023, ACMMM2024, AAAI2025-2024, ACL2025-2024, EMNLP2024, NAACL2025, IJCAI2024, ECCV2024
Size: 290 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 61 - Forks: 1

prateekralhan/OpenAI_Whisper_ASR
A minimalistic automatic speech recognition streamlit based webapp powered by OpenAI's Whisper "State of the Art" models
Language: Python - Size: 10.7 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 60 - Forks: 15

tsmdt/whisply
💬 Transcribe, translate, diarize, annotate and subtitle video (and audio) with Whisper on Win, Linux and Mac ... fast!
Language: Python - Size: 4.07 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 59 - Forks: 13

googlecreativelab/obvi 📦
A Polymer 3+ webcomponent / button for doing speech recognition
Language: JavaScript - Size: 6.69 MB - Last synced at: 9 days ago - Pushed at: 20 days ago - Stars: 59 - Forks: 16

jonatasgrosman/asrecognition
ASRecognition: just an easy-to-use library for Automatic Speech Recognition.
Language: Python - Size: 106 KB - Last synced at: 13 days ago - Pushed at: over 2 years ago - Stars: 51 - Forks: 5

archiki/Robust-E2E-ASR
This repository contains the code for our upcoming paper An Investigation of End-to-End Models for Robust Speech Recognition at ICASSP 2021.
Language: Python - Size: 141 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 46 - Forks: 10

brianlan/automatic-speech-recognition
Automatic Speech Recognition using Tensorflow
Language: Python - Size: 114 KB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 46 - Forks: 16

loretoparisi/hf-experiments
Experiments with Hugging Face 🔬 🤗
Language: Python - Size: 20.5 MB - Last synced at: 28 days ago - Pushed at: about 1 year ago - Stars: 44 - Forks: 5

double22a/asr_nlp_paper_code
Papers of ASR, Tools of ASR
Size: 655 MB - Last synced at: 4 months ago - Pushed at: 7 months ago - Stars: 40 - Forks: 9

sungnyun/ARMHuBERT
(Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT
Language: Python - Size: 4.52 MB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 39 - Forks: 6

saurabhchalke/whisper-meta-quest
Running speech-to-text in a Meta Quest headset using OpenAI's Whisper tiny model
Language: C# - Size: 98.1 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 39 - Forks: 3

30stomercury/Automatic-Speech-Recognition
End-to-End Speech Recognition Using Tensorflow
Language: Python - Size: 1.93 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 39 - Forks: 8

ttop32/wav2vec2-live-japanese-translator
Real-time Japanese speech recognition translator using wav2vec2
Language: Jupyter Notebook - Size: 926 KB - Last synced at: 4 days ago - Pushed at: about 3 years ago - Stars: 39 - Forks: 3

fabio-sim/Fast-SeamlessM4T-ONNX 📦
ONNX-compatible Fast SeamlessM4T—Massively Multilingual & Multimodal Machine Translation
Language: Python - Size: 371 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 37 - Forks: 0

pyyush/SpecAugment
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
Language: Python - Size: 3.02 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 37 - Forks: 8

mozilla-ai/speech-to-text-finetune
Blueprint by Mozilla.ai for finetuning a Speech-To-Text model in your own language
Language: Python - Size: 5.24 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 36 - Forks: 4

soheil-mp/Speech-Recognition
End-to-End Speech Recognition using Neural Networks.
Language: Jupyter Notebook - Size: 15.5 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 35 - Forks: 21

George0828Zhang/torch_cif
A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/abs/1905.11235.
Language: Python - Size: 167 KB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 33 - Forks: 3

lucasgris/wav2vec4bp
Wav2vec resources and models for Brazilian Portuguese
Language: Jupyter Notebook - Size: 1.65 MB - Last synced at: 4 months ago - Pushed at: about 3 years ago - Stars: 33 - Forks: 2

loretoparisi/wave2vec-recognize-docker
Wave2vec 2.0 Recognize pipeline
Language: Python - Size: 33.2 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 33 - Forks: 10

j3soon/speech-to-windows-input
Perform speech-to-text (STT/ASR) with Azure speech service and simulate keyboard input of the recognized text; supports English, Chinese, Japanese, and more.
Language: C# - Size: 2.4 MB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 32 - Forks: 3

sooftware/jasper
PyTorch implementation of "Jasper: An End-to-End Convolutional Neural Acoustic Model" (INTERSPEECH 2019)
Language: Python - Size: 38.1 KB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 32 - Forks: 2

drumpt/SGEM
Official PyTorch implementation of SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy Minimization (INTERSPEECH 2023 Oral Presentation)
Language: Python - Size: 24.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 31 - Forks: 3

kssteven418/Q-ASR
[ICASSP'22] Integer-only Zero-shot Quantization for Efficient Speech Recognition
Language: Jupyter Notebook - Size: 41.9 MB - Last synced at: 5 months ago - Pushed at: almost 4 years ago - Stars: 31 - Forks: 2

GAMMA-UMD/TS-RIR
Translating Synthetic RIRs to Real RIRs
Language: Python - Size: 2.24 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 29 - Forks: 7

GAMMA-UMD/IR-GAN
Augmenting Room Impulse Response
Language: MATLAB - Size: 7.36 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 29 - Forks: 12

Srijith-rkr/KAUST-Whisper-Adapter
INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters!
Language: Python - Size: 5.26 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 28 - Forks: 2

victor369basu/End2EndAutomaticSpeechRecognition
In this repository, I have developed an end to end Automatic speech recognition project. I have developed the neural network model for automatic speech recognition with PyTorch and used MLflow to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
Language: Python - Size: 4.13 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 28 - Forks: 11

egorsmkv/asr-corpus-creator 📦
This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.
Language: Python - Size: 2.47 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 3

csikasote/BembaSpeech
This is an ASR corpus for the Bemba language. It contains read speech from diverse publicly available Bemba sources: literature books, radio/TV show transcripts, YouTube video transcripts, and online sources. The corpus has 14,438 utterances totaling over 24 hours of speech.
Size: 2.41 GB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 27 - Forks: 2

Anwarvic/Arabic-Speech-Recognition
This repository contains my attempt to use two famous speech recognition frameworks (Kaldi, CMU Sphinx4) for Arabic Language using the publicly-available dataset "Arabic Corpus of Isolated Words"
Language: Shell - Size: 3.24 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 27 - Forks: 10

oleges1/quartznet-pytorch
Quartznet implementation on pytorch [https://arxiv.org/abs/1910.10261]
Language: Jupyter Notebook - Size: 116 KB - Last synced at: 9 months ago - Pushed at: about 4 years ago - Stars: 26 - Forks: 7

exemplaryai/ai-engine
Easy to use Multi-Provider ASR/Speech To Text and NLP engine
Size: 5.15 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 25 - Forks: 0

Livyatan-melvillei/ai-clips-maker
AI-powered tool to turn long videos into short, viral-ready clips. Combines transcription, speaker diarization, scene detection & 9:16 resizing — perfect for creators & smart automation.
Language: Python - Size: 69.3 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 24 - Forks: 3

pariajm/sharif-emotional-speech-dataset
A large-scale validated database for Persian speech emotion detection.
Size: 13.4 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 24 - Forks: 9

gary083/GAN_Harmonized_with_HMMs
Code: Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models
Language: Shell - Size: 8.53 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 24 - Forks: 5

The-Data-Dilemma/MediBeng-Whisper-Tiny
MediBeng Whisper Tiny improves doctor-patient transcription by training the Whisper Tiny model to translate mixed Bengali-English speech into English, making it easier for analysis, record-keeping, and using AI in healthcare.
Language: Python - Size: 2.24 MB - Last synced at: 18 days ago - Pushed at: about 2 months ago - Stars: 23 - Forks: 2

srinivr/kaldi-long-audio-alignment
Long audio alignment using Kaldi
Language: Shell - Size: 26.4 KB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 23 - Forks: 10

ckaytev/tgisper
Telegram bot with ASR
Language: Python - Size: 125 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 22 - Forks: 3

matusstas/openai-whisper-microservice
This is an OpenAI Whisper automatic speech recognition microservice
Language: Python - Size: 791 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 22 - Forks: 2

bbc/bbc-speech-segmenter
A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.
Language: Shell - Size: 62.6 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 22 - Forks: 2

popcornell/MicRank
MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.
Language: Python - Size: 76.2 KB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 22 - Forks: 4

chimechallenge/chime-utils
Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.
Language: Python - Size: 2.63 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 21 - Forks: 3

Anwarvic/RasaChatbot-with-ASR-and-TTS
This repository contains an attempt to incorporate Rasa Chatbot with state-of-the-art ASR (Automatic Speech Recognition) and TTS (Text-to-Speech) models directly without the need of running additional servers or socket connections.
Language: JavaScript - Size: 6.45 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 20 - Forks: 8

stefanpantic/asr
Automatic speech recognition using neural networks
Language: Python - Size: 143 MB - Last synced at: 7 months ago - Pushed at: almost 5 years ago - Stars: 19 - Forks: 1

egorsmkv/whisper-ukrainian 📦
Trainer and Evaluation scripts for fine-tuning Whisper models for the Ukrainian language
Language: Python - Size: 69.3 KB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 0

gheyret/uyghur-asr-ctc
Speech Recognition for Uyghur using deep learning
Language: Python - Size: 6.6 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 18 - Forks: 3
