Topic: "rnnt"
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.
Language: Python - Size: 100 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 10,292 - Forks: 1,036

stevenhillis/awesome-asr-contextualization
A curated list of awesome papers on contextualizing E2E ASR outputs
Size: 59.6 KB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 77 - Forks: 9

upskyy/Transformer-Transducer
PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASSP 2020)
Language: Python - Size: 85.9 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 74 - Forks: 19

iamjanvijay/rnnt_decoder_cuda
An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA.
Language: Cuda - Size: 187 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 61 - Forks: 9

iamjanvijay/rnnt
An implementation of RNN-Transducer loss in TF-2.0.
Language: Python - Size: 140 MB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 45 - Forks: 9

manhph2211/ViSTT
I'm building an end-to-end Vietnamese Speech Recognition System. I'll deploy it into production with the help of Flask, uWSGI, Nginx, and AWS ...
Language: Python - Size: 2.92 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 1

tuanio/conformer-rnnt
Conformer RNN-Transducer
Language: Python - Size: 51.8 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 0

aidayang/FunASR-OneClick
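FunASR real-time speech recognition edition: recognizes audio from the microphone and sound played on the computer; voice typing software for PC.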
Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

George0828Zhang/ssnt_loss
Pure PyTorch implementation of the loss described in "Online Segment to Segment Neural Transduction" https://arxiv.org/abs/1609.08194
Language: Python - Size: 10.7 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

Andersonjesusvital/Speech-Recognition-RNN
Deep learning-based subtitle generation model that processes audio datasets to generate accurate text transcriptions. Includes audio feature extraction, encoder-decoder architecture, training pipelines, and evaluation metrics for subtitle alignment.
Size: 1.95 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0
