GitHub topics: asr-model

Repositories

sine2pi/asr_model

NLP model with acoustic positional encoding.

Language: Python - Size: 638 KB - Last synced at: about 10 hours ago - Pushed at: about 10 hours ago - Stars: 1 - Forks: 0

sine2pi/asr-rotary

Maps pitch / f0 of audio samples to rotary theta. Variable pitch radius.

Language: Python - Size: 85 KB - Last synced at: about 18 hours ago - Pushed at: about 18 hours ago - Stars: 1 - Forks: 0

mende237/Nda-Nda-Force-Aligner

Forced alignment of Nda‘ Nda’, a Cameroonian language

Language: Shell - Size: 727 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

haizero55/sussu

sussu is a command-line tool that uses OpenAI's Whisper to transcribe audio and video easily. 🌟 Get started by installing sussu and access the `sussu` and `whisper` commands in your terminal. 🐙

Language: Python - Size: 55.7 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

chen-ze-yuan/SCCM

SCCM

Language: Python - Size: 8.04 MB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

yandex-cloud-examples/yc-speechkit-async-recognizer

SpeechKit Asynchronous Batch Recognizer.

Language: Python - Size: 198 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 1 - Forks: 0

yandex-cloud-examples/yc-speechkit-streams-recognizer

SpeechKit Streaming Recognizer.

Language: Python - Size: 643 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 2 - Forks: 0

djelia-org/djelia-python-sdk

This repo contains packages that allow easy interaction with the Djelia API.

Language: Python - Size: 82 KB - Last synced at: 13 days ago - Pushed at: 21 days ago - Stars: 2 - Forks: 1

yandex-cloud-examples/yc-speechkit-stt-java

Example of using SpeechKit speech recognition in Java.

Language: Java - Size: 288 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

OmeshThokchom/N7speech

Manipuri ASR – A state-of-the-art, low-latency speech-to-text library with advanced voice activity detection and real-time transcription, purpose-built for the Manipuri language.

Language: Python - Size: 269 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

revdotcom/reverb

Open source inference code for Rev's model

Language: Python - Size: 507 KB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 401 - Forks: 26

fullscreen-triangle/vibrio

High-Precision Human Velocity Analysis Framework

Language: Python - Size: 33.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

fernicar/Parakeet_GUI_TINS_Edition

A desktop application built using the TINS paradigm for transcribing audio files into timed text and previsualization.

Language: Python - Size: 1.94 MB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

hwk06023/SONATA

SONATA (SOund and Narrative Advanced Transcription Assistant): An advanced ASR system that captures human expressions including emotive sounds and non-verbal cues.

Language: Python - Size: 632 KB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

maximkm/DLA_ASR_HW

ASR pytorch project

Language: Python - Size: 815 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

wh0isdsmith/MediBeng-Whisper-Tiny

MediBeng Whisper Tiny improves doctor-patient transcription by training the Whisper Tiny model to translate mixed Bengali-English speech into English, making it easier for analysis, record-keeping, and using AI in healthcare.

Language: Python - Size: 646 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

kr4ckhe4d/NLP-Examples

A collection of Python scripts demonstrating how to run various AI tasks locally using models from the Hugging Face Hub and the transformers library (along with related libraries like datasets, sentence-transformers, etc.). These examples cover a range of modalities including text, vision, and audio, showcasing different models and pipelines.

Language: Python - Size: 8.19 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

vietai/ASR

End-to-End Vietnamese Speech Recognition using wav2vec 2.0

Size: 10.7 KB - Last synced at: 3 months ago - Pushed at: almost 4 years ago - Stars: 98 - Forks: 9

AV55CS/Speech-to-Text-Fine-tuning-Whisper-Tiny-for-Swahili-ASR

ASR for a low-resource language: Swahili

Language: Jupyter Notebook - Size: 1.33 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Shuyib/zindi_mcv_swahilli

How I used SeamlessM4T large to reach the top 5 of the Mozilla Common Voice competition hosted on Zindi

Language: Python - Size: 14.6 KB - Last synced at: about 13 hours ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

robmsmt/SpeechLoop

Many ASRs under one roof, with benchmarking, answering the question: what is the best ASR for my dataset?

Language: Python - Size: 2.41 MB - Last synced at: 7 days ago - Pushed at: over 2 years ago - Stars: 19 - Forks: 0

LuluW8071/Automatic-Speech-Recognition-with-PyTorch

Real-Time ASR with CNN-BiLSTM: End-to-End Live Streaming Using PyTorch Lightning⚡

Language: Python - Size: 4.16 MB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 9 - Forks: 2

marwan2232004/Esma3nyAPI

This project focuses on converting spoken Egyptian Arabic into written text and translating English text into Arabic. The architecture is inspired by OpenAI's Whisper model and utilizes a custom Transformer-based implementation.

Language: Python - Size: 89.8 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

MegEngine/End-to-end-ASR-Transformer

An end-to-end ASR Transformer model training repo

Language: Python - Size: 134 KB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 13 - Forks: 3

at16k/at16k

Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.

Language: Python - Size: 268 KB - Last synced at: 11 days ago - Pushed at: about 4 years ago - Stars: 129 - Forks: 18

MohammadShabazuddin/Audio-Transcript-Translation-with-Whishper

Developed an audio transcription and translation system using OpenAI’s Whisper to convert speech into translated text accurately.

Language: Python - Size: 262 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

brettdavies/whisper_chunk_transcribe

The ‘whisper_chunk_transcribe’ repository offers a Python script that utilizes OpenAI’s Whisper model to transcribe audio files in segments, enhancing accuracy and efficiency for lengthy recordings. It supports various audio formats and allows customization of chunk duration and overlap settings to optimize performance.

Language: Python - Size: 1.35 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

sugarcane-mk/finetuning_wav2vec2

This repo provides a step-by-step process, from scratch, for fine-tuning Facebook's wav2vec2-large model using transformers

Language: Jupyter Notebook - Size: 42 KB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

Kirili4ik/QuartzNet-ASR-pytorch

Automatic Speech Recognition (ASR) model QuartzNet trained on English CommonVoice. Implemented in PyTorch with CTC loss and beam search.

Language: Jupyter Notebook - Size: 1.2 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 16 - Forks: 3

sovaai/sova-asr

SOVA ASR (Automatic Speech Recognition)

Language: Python - Size: 2.32 MB - Last synced at: 7 months ago - Pushed at: about 2 years ago - Stars: 169 - Forks: 21

artyomboyko/Whisper_Train

Notebook for fine-tuning Whisper on the Mozilla Common Voice dataset.

Language: Jupyter Notebook - Size: 177 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

kingabzpro/WOLOF-ASR-Wav2Vec2

Audio preprocessing and fine-tuning of the wav2vec2-large-xlsr model on the AI4D Baamtu Datamation - Automatic Speech Recognition in WOLOF data.

Language: Jupyter Notebook - Size: 3.34 MB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 17 - Forks: 8

LuisHBeck/py-stt-tts

ASR, STT and TTS tests with some open source models

Language: Python - Size: 3.68 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

ccoreilly/catalan-speech-recognition-benchmark

A benchmark of speech recognition solutions for the Catalan language

Size: 4.38 MB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 8 - Forks: 0

aitor-alvarez/large-speech-models

Fine-tuning Multilingual Large Speech Recognition Models: Wav2vec and Whisper

Language: Python - Size: 84 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

oleges1/quartznet-pytorch

QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261]

Language: Jupyter Notebook - Size: 116 KB - Last synced at: 7 months ago - Pushed at: almost 4 years ago - Stars: 26 - Forks: 7

Nexdata-AI/Conversational_Speech_Dataset

Mega Conversational Speech Datasets for Speech Recognition

Size: 196 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 3 - Forks: 0

Nexdata-AI/500-Hours-Minnan-Dialect-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Minnan dialect conversational speech

Size: 6.84 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1351-Hours-Mandarin-Conversational-Speech-Data-by-Mobile-Phone-and-Voice-Recorder

Mandarin Conversational Speech Dataset

Size: 366 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1000-Hours-American-English-Conversational-Speech-Data-by-Mobile-Phone

American English Conversational Speech Dataset

Size: 622 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

Nexdata-AI/760-Hours-Vietnamese-Speech-Data-by-Mobile-Phone

Vietnamese Speech Dataset

Size: 563 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/117-Hours-Latin-American-Speaking-English-Speech-Data-by-Mobile-Phone

Latin American English Speech Dataset

Size: 2.31 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/800-Hours-Sichuan-Dialect-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Sichuan dialect conversational speech

Size: 614 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 3 - Forks: 0

Nexdata-AI/303-Hours-Mixed-Speech-with-Chinese-and-English-Data-by-Mobile-Phone

Mixed Speech with Chinese and English Dataset

Size: 476 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/2028-Hours-Mandarin-Speech-Data-by-Mobile-Phone

Mandarin Speech Dataset

Size: 381 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/20-People-Infant-Laugh-Speech-Data-by-Mobile-Phone

Infant Laugh Speech Dataset

Size: 1.06 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/20.1-Hours-Chinese-Mandarin-Synthesis-Corpus-Male-Customer-Service

Chinese Mandarin Synthesis Corpus-Male

Size: 2.35 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/150-People-Chinese-Mandarin-Average-Tone-Speech-Synthesis-Corpus-Customer-Service

Chinese Mandarin Average Tone Speech Synthesis Corpus

Size: 1.01 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/500-Hours-Korean-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Korean conversational speech

Size: 559 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/300-Hours-Mixed-Speech-with-Korean-and-English-Data-by-Mobile-Phone

Mixed Speech with Korean and English Dataset

Size: 444 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

Nexdata-AI/50-People-Chinese-English-Mixed-Average-Tone-Speech-Synthesis-Corpus-Customer-Service

Chinese English Mixed Average Tone Speech Synthesis Corpus

Size: 1.04 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/231-Hours-French-Speech-Data-by-Mobile-Phone_Reading

French Speech Dataset

Size: 1.26 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/500-Hours-Spanish-Conversational-Speech-Data-by-Mobile-Phone

Spanish Conversational Speech Data

Size: 654 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/500-Hours-Japanese-Conversational-Speech-by-Mobile-Phone

Japanese Conversational Speech Dataset

Size: 536 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/Interspeech2020-Accented-English-Speech-Recognition-Competition-Data

Interspeech2020 Accented English Speech Recognition Competition Data

Size: 566 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/240-Hours-Hindi-Speech-Data-by-Mobile-Phone_Reading

Hindi Speech Dataset

Size: 1.4 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

Nexdata-AI/2003-Hours-Mandarin-Speech-Data-by-Mobile-Phone-Financial-Sector

Mandarin Speech Dataset

Size: 347 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/178-Hours-Chinese-Children-Speech-Data-by-Microphone

Chinese Children Speech Data

Size: 1.59 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1002-Hours-Kunming-Dialect-Speech-Data-by-Mobile-Phone

Kunming Dialect Speech Dataset

Size: 929 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 1

Nexdata-AI/200-People-Chinese-Wake-up-Words-Speech-Data-by-Mobile-Phone

Chinese Wake-up Words Speech Dataset

Size: 1.77 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

Nexdata-AI/205-People-Mandarin-Speech-Data-in-Noisy-Environment-by-Mobile-Phone_Guiding

Mandarin Speech Dataset in Noisy Environment

Size: 2.18 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/998-People-Mic-Array-Speech-Data-in-Home-Environment

Mic-Array Speech Dataset in Home Environment

Size: 6.2 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/347-Hours-Italian-Speech-Data-Collected-by-Mobile-Phone

Italian Speech Dataset

Size: 1.02 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/592-People-Number-Speech-Data-in-Mandarin-and-Dialects-by-Mobile-Phone

Number Speech Dataset in Mandarin and Dialects

Size: 8.79 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/176-Hours-Suzhou-Dialect-Speech-Data-by-Mobile-Phone

Suzhou Dialect Speech Dataset

Size: 1.13 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/248-Hours-Hangzhou-Dialect-Speech-Data-by-Mobile-Phone

Hangzhou Dialect Speech Dataset

Size: 1.34 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/738-Hours-Uyghur-Speech-Data-by-Mobile-Phone

Uyghur Speech Dataset

Size: 725 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/211-people-Korean-Speech-Data-by-Mobile-Phone_Guiding

Korean Speech Dataset

Size: 2.35 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/55-Hours-British-Children-Speech-Data-by-Microphone

British English Speech Dataset

Size: 1.25 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/351-People-German-Speech-Data-by-Mobile-Phone_Guiding

German Speech Dataset

Size: 566 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/292-Hours-Thai-Speech-Data-by-Mobile-Phone_Reading

Thai Speech Dataset

Size: 1.29 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

Nexdata-AI/359-Hours-Indonesian-Speech-Data-by-Mobile-Phone_Reading

Indonesian Speech Dataset

Size: 948 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 1

Nexdata-AI/41-Hours-Chinese-Young-Children-Speech-Data-by-Mobile-Phone-and-Microphone

Chinese Young Children Speech Dataset

Size: 1.4 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1420-Hours-Mandarin-Spontaneous-Speech-Data-by-Mobile-Phone

Mandarin Spontaneous Speech Dataset

Size: 484 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/199-Hours-British-English-Speech-Data-by-Mobile-Phone_Reading

British English Speech Dataset

Size: 1.54 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/261-Hours-Japanese-Speech-Data-by-Mobile-Phone

Japanese Speech Dataset

Size: 433 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1012-Hours-Indian-English-Speech-Data-by-Mobile-Phone

Indian English Speech Dataset

Size: 427 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/997-Hours-Wuhan-Dialect-Speech-Data-by-Mobile-Phone

Wuhan Dialect Speech Dataset

Size: 797 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/986-Hours-European-Portuguese-Speech-Data-by-Mobile-Phone

European Portuguese Speech Dataset

Size: 529 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

Nexdata-AI/762-Hours-Non-Hispanic-Spanish-Speech-Data-by-Mobile-Phone

Non Hispanic Spanish Speech Dataset

Size: 426 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

daisyyedda/whisper-large-v2-atcosim_corpus

A fine-tuned Whisper model (whisper-large-v2) for aviation audio transcription. WER < 5%.

Language: Jupyter Notebook - Size: 90.8 KB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

samkamau81/Baini

This program converts English speech to Kikuyu text (Kikuyu is a low-resource language in Kenya) using an AI speech-to-text service

Language: Jupyter Notebook - Size: 981 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

isadrtdinov/quartznet

QuartzNet implementation for Automatic Speech Recognition task

Language: Python - Size: 69.3 KB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 4 - Forks: 2

LaurentVeyssier/Automatic-Speech-Recognizer

Build an end-to-end deep neural network to translate speech to text (ASR model)

Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

antouanbg/Bulgarian_Linguistic

Collection of resources for Bulgarian corpora, datasets, and models used in ASR, TTS, or NLP tasks, together with links to the corresponding tools/apps.

Language: Java - Size: 63.5 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 22 - Forks: 2

juan-csv/GPT3-text-summarization

Summarization, topic generation using GPT3

Language: Jupyter Notebook - Size: 17.7 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 31 - Forks: 12

fquirin/kaldi-adapt-lm (fork of gooofy/kaldi-adapt-lm)

Create and adapt n-gram and JSGF language models, e.g. for Kaldi-ASR nnet3 chain models from Zamia-Speech

Language: Python - Size: 98.6 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 6 - Forks: 2

Related Keywords

asr-model (109), asr (67), speech-recognition (64), speech-to-text (58), audio (46), deep-learning (44), dataset (43), speech (37), automatic-speech-recognition (19), machine-learning (16), tts (14), deep-neural-networks (13), speech-synthesis (11), nlp (9), speech-processing (8), whisper (7), voice-recognition (7), python3 (6), stt (6), wav (6), python (6), wav2vec2 (6), pytorch (6), openai (4), huggingface (4), ctc-loss (4), nlp-machine-learning (4), whisper-ai (3), speechkit (3), yandex-cloud (3), transformers (3), yandex-speechkit-api (3), yandexcloud (3), facebook (3), open-source (2), tts-engines (2), speech-analysis (2), artificial-intelligence (2), low-resource-languages (2), seamlessm4t (2), timit (2), kaldi-asr (2), swahili (2), fine-tuning (2), librispeech (2), huggingface-transformers (2), speaker-diarization (2), speechrecognition (2), recognition (2), translation (2), pytorch-implementation (2), quartznet (2), openai-whisper (2), quartznet-pytorch (2), beam-search (2), kaldi (2), language-model (2), audio-processing (2), speech-recognition-model (2), srt (2), machine-translation (2), speech-api (2), deepspeech (2), catalan-language (2), kenlm (2), transcription (2), catalan (2), code-switching (1), kikuyu (1), text-to-speech (1), kagglexbipoc (1), smarthome (1), datasets (1), ljspeech (1), gru (1), keras (1), interspeech (1), manipur (1), torch (1), wav2vec2-large-960h (1), wav2letter (1), africa (1), wolof (1), meta (1), seamless (1), catala (1), vosk (1), arabic-speech-recognition (1), finetuning-wav2vec (1), finetuning-whisper (1), large-speech-models (1), common-voice (1), conversational-ai (1), human-machine-interaction (1), voice-interaction (1), rnn (1), timit-dataset (1), turkic-languages (1), bark (1), chatgpt (1)