An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: asr-model

fernicar/Parakeet_GUI_TINS_Edition

A desktop application built using the TINS paradigm for transcribing audio files into timed text and previsualization.

Language: Python - Size: 1.94 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

hwk06023/SONATA

SONATA (SOund and Narrative Advanced Transcription Assistant): An advanced ASR system that captures human expressions including emotive sounds and non-verbal cues.

Language: Python - Size: 413 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0

sine2pi/asr_model_zero

ASR model with (optional) Betweenness and optional blending of spectrogram and waveform input. 0 for padding masking silence. No additive masking.

Language: Python - Size: 583 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0

pr0mila/MediBeng-Whisper-Tiny

MediBeng Whisper Tiny improves doctor-patient transcription by training the Whisper Tiny model to translate mixed Bengali-English speech into English, making it easier for analysis, record-keeping, and using AI in healthcare.

Language: Python - Size: 2.22 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 10 - Forks: 0

fullscreen-triangle/vibrio

High-Precision Human Velocity Analysis Framework

Language: Python - Size: 4.18 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

maximkm/DLA_ASR_HW

ASR pytorch project

Language: Python - Size: 815 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0

wh0isdsmith/MediBeng-Whisper-Tiny

MediBeng Whisper Tiny improves doctor-patient transcription by training the Whisper Tiny model to translate mixed Bengali-English speech into English, making it easier for analysis, record-keeping, and using AI in healthcare.

Language: Python - Size: 646 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

yandex-cloud-examples/yc-speechkit-streams-recognizer

SpeechKit Streaming Recognizer.

Language: Python - Size: 639 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 2 - Forks: 0

kr4ckhe4d/NLP-Examples

A collection of Python scripts demonstrating how to run various AI tasks locally using models from the Hugging Face Hub and the transformers library (along with related libraries like datasets, sentence-transformers, etc.). These examples cover a range of modalities including text, vision, and audio, showcasing different models and pipelines.

Language: Python - Size: 8.19 MB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

revdotcom/reverb

Open source inference code for Rev's model

Language: Python - Size: 507 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 399 - Forks: 25

vietai/ASR

End-to-End Vietnamese Speech Recognition using wav2vec 2.0

Size: 10.7 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 98 - Forks: 9

djelia-org/djelia-python-client

This repo contains packages that allow easy interaction with the Djelia API.

Language: Python - Size: 22.5 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

AV55CS/Speech-to-Text-Fine-tuning-Whisper-Tiny-for-Swahili-ASR

ASR-Low resource Language-Swahili

Language: Jupyter Notebook - Size: 1.33 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Shuyib/zindi_mcv_swahilli

How I used Seamless M4T large to reach the top 5 of the Mozilla Common Voice competition hosted on Zindi

Language: Python - Size: 14.6 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

robmsmt/SpeechLoop

Many ASRs under one roof, with benchmarking, answering the question: what is the best ASR for my dataset?

Language: Python - Size: 2.41 MB - Last synced at: 2 days ago - Pushed at: over 2 years ago - Stars: 19 - Forks: 0

yandex-cloud-examples/yc-speechkit-stt-java

An example of using SpeechKit speech recognition in Java.

Language: Java - Size: 288 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

LuluW8071/Automatic-Speech-Recognition-with-PyTorch

Real-Time ASR with CNN-BiLSTM: End-to-End Live Streaming Using PyTorch Lightning⚡

Language: Python - Size: 4.16 MB - Last synced at: 11 days ago - Pushed at: 4 months ago - Stars: 9 - Forks: 2

marwan2232004/Esma3nyAPI

This project focuses on converting spoken Egyptian Arabic into written text and translating English text into Arabic. The architecture is inspired by OpenAI's Whisper model and utilizes a custom Transformer-based implementation.

Language: Python - Size: 89.8 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

MegEngine/End-to-end-ASR-Transformer

An end-to-end ASR Transformer model training repo

Language: Python - Size: 134 KB - Last synced at: 30 days ago - Pushed at: over 3 years ago - Stars: 13 - Forks: 3

at16k/at16k

Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.

Language: Python - Size: 268 KB - Last synced at: 4 days ago - Pushed at: about 4 years ago - Stars: 129 - Forks: 18

MohammadShabazuddin/Audio-Transcript-Translation-with-Whishper

Developed an audio transcription and translation system using OpenAI’s Whisper to convert speech into translated text accurately.

Language: Python - Size: 262 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

mende237/Nda-Nda-Force-Aligner

Forced alignment of Nda‘ Nda’, a Cameroonian language

Language: Shell - Size: 596 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

brettdavies/whisper_chunk_transcribe

The ‘whisper_chunk_transcribe’ repository offers a Python script that utilizes OpenAI’s Whisper model to transcribe audio files in segments, enhancing accuracy and efficiency for lengthy recordings. It supports various audio formats and allows customization of chunk duration and overlap settings to optimize performance.

Language: Python - Size: 1.35 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

sugarcane-mk/finetuning_wav2vec2

This repo provides a step-by-step process, from scratch, to fine-tune Facebook's wav2vec2-large model using transformers

Language: Jupyter Notebook - Size: 42 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Kirili4ik/QuartzNet-ASR-pytorch

Automatic Speech Recognition (ASR) model QuartzNet trained on English CommonVoice. In PyTorch with CTC loss and beam search.

Language: Jupyter Notebook - Size: 1.2 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 16 - Forks: 3

sovaai/sova-asr

SOVA ASR (Automatic Speech Recognition)

Language: Python - Size: 2.32 MB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 169 - Forks: 21

artyomboyko/Whisper_Train

A notebook for fine-tuning Whisper on the Mozilla Common Voice dataset.

Language: Jupyter Notebook - Size: 177 KB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

kingabzpro/WOLOF-ASR-Wav2Vec2

Audio Preprocessing and finetuning of wav2vec2-large-xlsr model on AI4D Baamtu Datamation - Automatic Speech Recognition in WOLOF Data.

Language: Jupyter Notebook - Size: 3.34 MB - Last synced at: 19 days ago - Pushed at: over 3 years ago - Stars: 17 - Forks: 8

LuisHBeck/py-stt-tts

ASR, STT and TTS tests with some open source models

Language: Python - Size: 3.68 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

ccoreilly/catalan-speech-recognition-benchmark

A benchmark of speech recognition solutions for the Catalan language

Size: 4.38 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 8 - Forks: 0

aitor-alvarez/large-speech-models

Fine-tuning Multilingual Large Speech Recognition Models: Wav2vec and Whisper

Language: Python - Size: 84 KB - Last synced at: about 2 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

oleges1/quartznet-pytorch

QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261]

Language: Jupyter Notebook - Size: 116 KB - Last synced at: 5 months ago - Pushed at: almost 4 years ago - Stars: 26 - Forks: 7

Nexdata-AI/Conversational_Speech_Dataset

Mega Conversational Speech Datasets for Speech Recognition

Size: 196 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 3 - Forks: 0

Nexdata-AI/500-Hours-Minnan-Dialect-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Minnan dialect conversational speech

Size: 6.84 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1351-Hours-Mandarin-Conversational-Speech-Data-by-Mobile-Phone-and-Voice-Recorder

Mandarin Conversational Speech Dataset

Size: 366 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1000-Hours-American-English-Conversational-Speech-Data-by-Mobile-Phone

American English Conversational Speech Dataset

Size: 622 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

Nexdata-AI/760-Hours-Vietnamese-Speech-Data-by-Mobile-Phone

Vietnamese Speech Dataset

Size: 563 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/117-Hours-Latin-American-Speaking-English-Speech-Data-by-Mobile-Phone

Latin American English Speech Dataset

Size: 2.31 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/800-Hours-Sichuan-Dialect-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Sichuan dialect conversational speech

Size: 614 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 3 - Forks: 0

Nexdata-AI/303-Hours-Mixed-Speech-with-Chinese-and-English-Data-by-Mobile-Phone

Mixed Speech with Chinese and English Dataset

Size: 476 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/2028-Hours-Mandarin-Speech-Data-by-Mobile-Phone

Mandarin Speech Dataset

Size: 381 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/20-People-Infant-Laugh-Speech-Data-by-Mobile-Phone

Infant Laugh Speech Dataset

Size: 1.06 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/20.1-Hours-Chinese-Mandarin-Synthesis-Corpus-Male-Customer-Service

Chinese Mandarin Synthesis Corpus-Male

Size: 2.35 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/150-People-Chinese-Mandarin-Average-Tone-Speech-Synthesis-Corpus-Customer-Service

Chinese Mandarin Average Tone Speech Synthesis Corpus

Size: 1.01 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/500-Hours-Korean-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Korean conversational speech

Size: 559 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/300-Hours-Mixed-Speech-with-Korean-and-English-Data-by-Mobile-Phone

Mixed Speech with Korean and English Dataset

Size: 444 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

Nexdata-AI/50-People-Chinese-English-Mixed-Average-Tone-Speech-Synthesis-Corpus-Customer-Service

Chinese English Mixed Average Tone Speech Synthesis Corpus

Size: 1.04 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/231-Hours-French-Speech-Data-by-Mobile-Phone_Reading

French Speech Dataset

Size: 1.26 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/500-Hours-Spanish-Conversational-Speech-Data-by-Mobile-Phone

Spanish Conversational Speech Data

Size: 654 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/500-Hours-Japanese-Conversational-Speech-by-Mobile-Phone

Japanese Conversational Speech Dataset

Size: 536 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/Interspeech2020-Accented-English-Speech-Recognition-Competition-Data

Interspeech2020 Accented English Speech Recognition Competition Data

Size: 566 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/240-Hours-Hindi-Speech-Data-by-Mobile-Phone_Reading

Hindi Speech Dataset

Size: 1.4 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

Nexdata-AI/2003-Hours-Mandarin-Speech-Data-by-Mobile-Phone-Financial-Sector

Mandarin Speech Dataset

Size: 347 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/178-Hours-Chinese-Children-Speech-Data-by-Microphone

Chinese Children Speech Data

Size: 1.59 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1002-Hours-Kunming-Dialect-Speech-Data-by-Mobile-Phone

Kunming Dialect Speech Dataset

Size: 929 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 1

Nexdata-AI/200-People-Chinese-Wake-up-Words-Speech-Data-by-Mobile-Phone

Chinese Wake-up Words Speech Dataset

Size: 1.77 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

Nexdata-AI/205-People-Mandarin-Speech-Data-in-Noisy-Environment-by-Mobile-Phone_Guiding

Mandarin Speech Dataset in Noisy Environment

Size: 2.18 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/998-People-Mic-Array-Speech-Data-in-Home-Environment

Mic-Array Speech Dataset in Home Environment

Size: 6.2 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/347-Hours-Italian-Speech-Data-Collected-by-Mobile-Phone

Italian Speech Dataset

Size: 1.02 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/592-People-Number-Speech-Data-in-Mandarin-and-Dialects-by-Mobile-Phone

Number Speech Dataset in Mandarin and Dialects

Size: 8.79 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/176-Hours-Suzhou-Dialect-Speech-Data-by-Mobile-Phone

Suzhou Dialect Speech Dataset

Size: 1.13 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/248-Hours-Hangzhou-Dialect-Speech-Data-by-Mobile-Phone

Hangzhou Dialect Speech Dataset

Size: 1.34 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/738-Hours-Uyghur-Speech-Data-by-Mobile-Phone

Uyghur Speech Dataset

Size: 725 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/211-people-Korean-Speech-Data-by-Mobile-Phone_Guiding

Korean Speech Dataset

Size: 2.35 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/55-Hours-British-Children-Speech-Data-by-Microphone

British English Speech Dataset

Size: 1.25 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/351-People-German-Speech-Data-by-Mobile-Phone_Guiding

German Speech Dataset

Size: 566 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/292-Hours-Thai-Speech-Data-by-Mobile-Phone_Reading

Thai Speech Dataset

Size: 1.29 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

Nexdata-AI/359-Hours-Indonesian-Speech-Data-by-Mobile-Phone_Reading

Indonesian Speech Dataset

Size: 948 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 1

Nexdata-AI/41-Hours-Chinese-Young-Children-Speech-Data-by-Mobile-Phone-and-Microphone

Chinese Young Children Speech Dataset

Size: 1.4 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1420-Hours-Mandarin-Spontaneous-Speech-Data-by-Mobile-Phone

Mandarin Spontaneous Speech Dataset

Size: 484 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/199-Hours-British-English-Speech-Data-by-Mobile-Phone_Reading

British English Speech Dataset

Size: 1.54 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/261-Hours-Japanese-Speech-Data-by-Mobile-Phone

Japanese Speech Dataset

Size: 433 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1012-Hours-Indian-English-Speech-Data-by-Mobile-Phone

Indian English Speech Dataset

Size: 427 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/997-Hours-Wuhan-Dialect-Speech-Data-by-Mobile-Phone

Wuhan Dialect Speech Dataset

Size: 797 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/986-Hours-European-Portuguese-Speech-Data-by-Mobile-Phone

European Portuguese Speech Dataset

Size: 529 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/762-Hours-Non-Hispanic-Spanish-Speech-Data-by-Mobile-Phone

Non Hispanic Spanish Speech Dataset

Size: 426 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

daisyyedda/whisper-large-v2-atcosim_corpus

A fine-tuned Whisper model (whisper-large-v2) for aviation audio transcription. WER < 5%.

Language: Jupyter Notebook - Size: 90.8 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

samkamau81/Baini

This program converts English speech to Kikuyu text (Kikuyu is a low-resource language in Kenya) using AI: a speech-to-text service

Language: Jupyter Notebook - Size: 981 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

isadrtdinov/quartznet

QuartzNet implementation for Automatic Speech Recognition task

Language: Python - Size: 69.3 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 2

yandex-cloud-examples/yc-speechkit-async-recognizer

SpeechKit Asynchronous Batch Recognizer.

Language: Python - Size: 197 KB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

LaurentVeyssier/Automatic-Speech-Recognizer

Build end-to-end Deep Neural Network to translate speech to text (ASR model)

Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

antouanbg/Bulgarian_Linguistic

Collection and resources for Bulgarian Corpus, Datasets and Models used in ASR, TTS or NLP tasks together with the links of corresponding tools/apps.

Language: Java - Size: 63.5 MB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 22 - Forks: 2

juan-csv/GPT3-text-summarization

Summarization, topic generation using GPT3

Language: Jupyter Notebook - Size: 17.7 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 31 - Forks: 12

fquirin/kaldi-adapt-lm (fork of gooofy/kaldi-adapt-lm)

Create and adapt n-gram and JSGF language models, e.g. for Kaldi-ASR nnet3 chain models from Zamia-Speech

Language: Python - Size: 98.6 KB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 2

alwaz-shahid/whisper-asr-cli

Automatic Speech Recognition (ASR) / Speech-to-Text (STT) demonstration using the Whisper base model. The Python CLI application transcribes audio to text and works offline.

Language: Python - Size: 9.77 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ccoreilly/deepspeech-catala

Deepspeech ASR Model for the Catalan Language

Language: Python - Size: 75.6 MB - Last synced at: 5 days ago - Pushed at: about 4 years ago - Stars: 17 - Forks: 0

SzLeaves/asr-webapp

ASR Web App: a Chinese speech recognition lab app built with Django, including Chinese speech-to-text and Chinese voice chatbot modules

Language: Python - Size: 1.73 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 1

agoryuno/whisper_encoder_ggml

A stripped down version of whisper.cpp - just the encoder

Language: C - Size: 15.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Nexdata-AI/500-Hours-Henan-Dialect-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Henan Dialect conversational speech

Size: 615 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

Nexdata-AI/500-Hours-Italian-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Italian Speaking English Speech

Size: 500 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

maggieezzat/kaldi-egy-asr

A Kaldi-Recipe for Egyptian Arabic Speech Recognition

Language: Shell - Size: 1.55 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

sudarshansangam/Spoken-language-translation-using-conformer-model

Language: Python - Size: 9.77 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

hammaad2002/SimpleASRmodel

A simple CRDNN-based ASR model for my own understanding of how ASR models work and are trained. (Work in progress.) If anyone finds an error or has a suggestion, please let me know.

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

IS2AI/TurkicASR

A multilingual ASR model that can recognize ten Turkic languages—Azerbaijani, Bashkir, Chuvash, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Uyghur, and Uzbek.

Language: Python - Size: 1.16 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 35 - Forks: 5

BudEcosystem/BarkingGPT

Audio to Audio (Whisper+ChatGPT+Bark)

Language: JavaScript - Size: 434 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 2

Hamtech-ai/wav2vec2-fa

Fine-tune Wav2vec2, an ASR model released by Facebook

Language: Jupyter Notebook - Size: 549 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 32 - Forks: 3

iamjanvijay/rnnt

An implementation of RNN-Transducer loss in TF-2.0.

Language: Python - Size: 140 MB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 45 - Forks: 9

a2un/asr-analysis

INFSCI 2935 Final Project. Understanding ASR systems in terms of their fairness across demographics. Analyzes the ideas discussed in the papers at https://www.pnas.org/doi/epdf/10.1073/pnas.1915768117 and https://arxiv.org/abs/2109.09061

Size: 0 Bytes - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

hunterhug/baiduasr

Baidu ASR

Language: Python - Size: 244 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

kalindasiaminwe/ChitongaASR

A natural language processing and machine learning project for a low-resource language in Zambia.

Language: Jupyter Notebook - Size: 548 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Related Keywords
asr-model (106), asr (67), speech-recognition (64), speech-to-text (58), audio (47), deep-learning (44), dataset (43), speech (37), automatic-speech-recognition (19), machine-learning (16), tts (14), deep-neural-networks (13), speech-synthesis (10), speech-processing (8), voice-recognition (7), whisper (7), nlp (7), python (7), pytorch (6), python3 (6), wav (6), wav2vec2 (6), transformers (5), stt (5), huggingface (4), ctc-loss (4), openai (4), yandexcloud (3), fine-tuning (3), audio-processing (3), speechkit (3), yandex-cloud (3), nlp-machine-learning (3), yandex-speechkit-api (3), translation (3), whisper-ai (3), facebook (3), seamlessm4t (2), speech-analysis (2), speech-api (2), kaldi-asr (2), timit (2), kenlm (2), huggingface-transformers (2), swahili (2), openai-whisper (2), speechrecognition (2), tts-engines (2), machine-translation (2), pytorch-implementation (2), generative-ai (2), conda-environment (2), code-switch (2), bengali (2), quartznet (2), quartznet-pytorch (2), librispeech (2), speaker-diarization (2), synthetic-data (2), speech-recognition-model (2), catalan (2), catalan-language (2), recognition (2), deepspeech (2), language-model (2), kaldi (2), beam-search (2), code-switching (1), gpt-3 (1), sentiment-analysis (1), summarization (1), topic-modeling (1), vosk (1), jsgf-grammars (1), g2p (1), finetuning-whisper (1), arabic-speech-recognition (1), finetuning-wav2vec (1), text-to-speech (1), datasets (1), smarthome (1), voice-interaction (1), interspeech (1), kagglexbipoc (1), kikuyu (1), human-machine-interaction (1), conversational-ai (1), ljspeech (1), gru (1), common-voice (1), keras (1), rnn (1), temporal-convolutional-network (1), bulgarian-dataset (1), bulgarian-models (1), lematization (1), large-speech-models (1), stemmer (1), ngram-models (1), transformer (1)