Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: automatic-speech-recognition

anondo1969/deep-speech-vis

(Master's Thesis) Alam, Mahbub Ul, From Speech to Image: A Novel Approach to Understand the Hidden Layer Mechanisms of Deep Neural Networks in Automatic Speech Recognition, Masterarbeit, Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart, 2017. (https://www.ims.uni-stuttgart.de/en/research/publications/theses/)

Language: Python - Size: 63.5 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

matusstas/openai-whisper-microservice

This is an OpenAI Whisper automatic speech recognition microservice

Language: Python - Size: 786 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 13 - Forks: 1

bytewife/automatic-speech-recognition-guide

Information on the ASR process, but the model needs your help!

Language: Jupyter Notebook - Size: 1.42 MB - Last synced: 5 months ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

ckaytev/tgisper

Telegram bot with ASR

Language: Python - Size: 71.3 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 18 - Forks: 1

anton-jeran/FAST-RIR

This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

Language: Python - Size: 4.47 MB - Last synced: 4 months ago - Pushed: 7 months ago - Stars: 128 - Forks: 21

loretoparisi/hf-experiments

Experiments with Hugging Face 🔬 🤗

Language: Python - Size: 20.5 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 44 - Forks: 7

ShafakatArnob/Automatic-Bengali-Subtitle-Generation-Deep-Learning

Automatic Subtitle Generation for Bengali Multimedia Using Deep Learning.

Language: Jupyter Notebook - Size: 13.3 MB - Last synced: 5 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

Prasanna-Pawar21/End-to-End-multilingual-speech-translation-using-ASR-and-NLP.

Text-To-Speech-Text (TTST) simplifies tech for everyone, turning written text into spoken words. It's a computer system that reads any input aloud, promoting accessibility. English to desired language TTST aids in localizing computer applications, enhancing user understanding.

Language: HTML - Size: 111 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

khakers/go-subgen

Automatically generate subtitles for your media using whisper.cpp via webhooks with support for Radarr & Sonarr

Language: Go - Size: 7.57 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 49 - Forks: 0

RajGothi/Improving-Automatic-Speech-Recognition-with-Dialect-Specific-Language-Models

This repository contains the implementation of our published paper titled 'Improving Automatic Speech Recognition with Dialect-Specific Language Models,' presented at SPECOM'23.

Language: Jupyter Notebook - Size: 1000 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

bhattbhavesh91/whisper-youtube

This repository guides you through automatically generating YouTube transcriptions using OpenAI's Whisper

Language: Jupyter Notebook - Size: 10.7 KB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 9 - Forks: 5

alwaz-shahid/whisper-asr-cli

Automatic Speech Recognition (ASR) / Speech-to-Text (STT) demonstration using the Whisper base model. The Python CLI application transcribes audio to text and works offline.

Language: Python - Size: 9.77 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

bagustris/detect-segment-cough

A Python model to detect and segment coughs, forked from coughvid's repo

Language: Jupyter Notebook - Size: 818 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 8 - Forks: 3

bookbot-hive/k2-indonesian-asr

Indonesian speech/phoneme recognizer powered by Kaldi 2.0 (lhotse, icefall, sherpa).

Language: Python - Size: 591 KB - Last synced: about 1 month ago - Pushed: 11 months ago - Stars: 1 - Forks: 1

soham2109/CS-753-Project

Automatic Speech Recognition Course Project.

Language: Python - Size: 129 MB - Last synced: 6 months ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

InquestGeronimo/whisperyt

Python client for Gladia's ASR API for YouTube transcription

Language: Python - Size: 2.53 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

SMIL-SPCRAS/EMOLIPS

EMOLIPS: Two-Level Approach for Lip-Reading Emotional Speech

Language: Python - Size: 6.92 MB - Last synced: 6 months ago - Pushed: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/500-Hours-Minnan-Dialect-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Minnan dialect conversational speech

Size: 5.86 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 1 - Forks: 0

vilassn/whisper_android

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

Language: C++ - Size: 187 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 12 - Forks: 2

Hafpaf/ASR_subtitles

Generate talk subtitles with OpenAI Whisper

Language: Python - Size: 40 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 1 - Forks: 1

drumpt/SGEM

Official PyTorch implementation of SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy Minimization (INTERSPEECH 2023 Oral Presentation)

Language: Python - Size: 24.8 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 10 - Forks: 3

BatuhanYilmaz26/Auto-Subtitled-Video-Generator

Input a YouTube video link or upload a video file and get a video with subtitles.

Language: Python - Size: 122 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 48 - Forks: 17

Heyyassinesedjari/Arabic-Speech-To-Moroccan-Sign-Language-Web-Application

This web app translates Arabic speech or text into Moroccan Sign Language videos, fostering communication between the hearing and Moroccan deaf communities.

Language: Python - Size: 6.8 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 1 - Forks: 0

andi611/ZeroSpeech-TTS-without-T

A PyTorch implementation for the ZeroSpeech 2019 challenge.

Language: Python - Size: 99.2 MB - Last synced: 7 months ago - Pushed: over 4 years ago - Stars: 110 - Forks: 12

sephiroce/srf

Supplementary files for the sequential routing framework

Language: Python - Size: 38.1 KB - Last synced: 4 months ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

m3hrdadfi/soxan

Wav2Vec for speech recognition, classification, and audio classification

Language: Jupyter Notebook - Size: 3.57 MB - Last synced: 7 months ago - Pushed: about 2 years ago - Stars: 197 - Forks: 28

CoEDL/elpis_next

A simple transcription workflow GUI for linguists and data scientists.

Language: TypeScript - Size: 2.13 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

Z-yq/TensorflowASR

A project dedicated to bringing CPU/on-device model performance close to that of GPU models, with a real-time factor (RTF) below 0.1 on CPU

Language: Python - Size: 266 MB - Last synced: 7 months ago - Pushed: 9 months ago - Stars: 444 - Forks: 107

jpdiazpardo/gutural_nlp

Guttural and scream automatic speech recognition (ASR) system using a fine-tuned version of OpenAI's Whisper model

Language: Jupyter Notebook - Size: 1.56 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

RobertoAlessandri/DataScienceTask

Language: Jupyter Notebook - Size: 24.4 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

akshathmangudi/LipNet-Torch

A SOTA PyTorch implementation of the LipNet model from the paper "LipNet: End-to-End Sentence-level Lipreading"

Language: Jupyter Notebook - Size: 67.4 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

inboxpraveen/ASR-Accuracy-Tool

🎙️📝 A powerful Flask-based web application that leverages the latest Hugging Face ASR models to provide real-time speech-to-text (STT) transcripts with an intuitive user interface for easy correction. Perfect for enhancing the quality of training datasets for ASR models. 🚀

Language: Python - Size: 9.05 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 0 - Forks: 0

tugstugi/mongolian-speech-recognition

Mongolian speech recognition with PyTorch

Language: Python - Size: 164 KB - Last synced: 7 months ago - Pushed: about 3 years ago - Stars: 122 - Forks: 50

bhattbhavesh91/table-question-answering-with-automatic-speech-recognition

Question Answering Gradio Interface on Tabular Data with HuggingFace Transformers Pipeline & TAPAS. Wav2Vec2 is a pretrained model for Automatic Speech Recognition (ASR).

Language: Jupyter Notebook - Size: 11.7 KB - Last synced: about 1 month ago - Pushed: over 2 years ago - Stars: 6 - Forks: 2

EdoWhite/GIVA

GIVA - GPT-Based Vocal Virtual Assistant

Language: Python - Size: 231 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 2 - Forks: 0

Algo-Boys/SWR2-ASR

Automatic speech recognition model for the Spoken Word Recognition seminar (SWR2) Tübingen

Language: Python - Size: 13.3 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 2 - Forks: 0

zaanind/Asr_CTC

This repository provides a Jupyter notebook for a CTC-based Automatic Speech Recognition (ASR) system using TensorFlow and Keras. Its primary focus is to demonstrate the implementation of a CTC ASR model and to show how to train it effectively on the "Yes No" dataset.

Language: Jupyter Notebook - Size: 1.19 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 1 - Forks: 0

KarinTho/master-thesis-STT

How does a pre-trained speech model fine-tuned on prepared speech behave differently compared to a model that is fine-tuned on spontaneous speech?

Language: Jupyter Notebook - Size: 11.4 MB - Last synced: 9 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 0

CoEDL/elpis_lib

The Core Elpis Library.

Language: Python - Size: 2.56 MB - Last synced: 19 days ago - Pushed: 8 months ago - Stars: 2 - Forks: 0

Mohamed-Ashik-S/Speech-to-Text

This is a speech-to-text project that uses OpenAI's Whisper model.

Language: Jupyter Notebook - Size: 1.03 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

PetePrattis/automatic-speech-recognision-system-ASR

A Python script that implements an automatic speech recognition system.

Language: Python - Size: 15.6 KB - Last synced: 9 months ago - Pushed: over 4 years ago - Stars: 2 - Forks: 0

kssteven418/Q-ASR

[ICASSP'22] Integer-only Zero-shot Quantization for Efficient Speech Recognition

Language: Jupyter Notebook - Size: 41.9 MB - Last synced: about 1 month ago - Pushed: over 2 years ago - Stars: 29 - Forks: 2

BScUniversityCollaborations/automatic-speech-recognition

Created an ASR (Automatic Speech Recognition) system that takes in individual recordings. Each recording represents a sentence composed of 5-10 English language digits, separated by adequate pauses. The system involves segmenting the sentence using a classifier, differentiating between background and foreground sounds.

Language: Python - Size: 8.3 MB - Last synced: about 2 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

Nexdata-AI/Conversational_Speech_Dataset

Mega Conversational Speech Datasets for Speech Recognition

Size: 194 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 3 - Forks: 0

Nexdata-AI/500-Hours-Henan-Dialect-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Henan Dialect conversational speech

Size: 615 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 2 - Forks: 0

Nexdata-AI/474-Hours-Japanese-Speech-Data-By-Mobile-Phone

Japanese Speech Dataset

Size: 3.91 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1012-Hours-Indian-English-Speech-Data-by-Mobile-Phone

Indian English Speech Dataset

Size: 4.88 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/997-Hours-Wuhan-Dialect-Speech-Data-by-Mobile-Phone

Wuhan Dialect Speech Dataset

Size: 4.88 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/3255-Hours-Chinese-Children-Speech-data-by-Mobile-phone

Chinese Children Speech Dataset

Size: 4.88 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/2028-Hours-Mandarin-Speech-Data-by-Mobile-Phone

Mandarin Speech-Dataset

Size: 379 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1027-People-Wake-up-Words-Speech-Data-by-Microphone

Wake-up Words Speech-Dataset

Size: 618 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/500-Hours-German-Conversational-Speech-Data-by-Mobile-Phone

The dataset of German conversational speech

Size: 504 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/500-Hours-French-Conversational-Speech-Data-by-Mobile-Phone

The dataset of French conversational speech

Size: 638 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/500-Hours-Italian-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Italian Speaking English Speech

Size: 500 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/500-Hours-Korean-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Korean conversational speech

Size: 3.91 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/178-Hours-Chinese-Children-Speech-Data-by-Microphone

Chinese Children Speech Data

Size: 3.91 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/248-Hours-Hangzhou-Dialect-Speech-Data-by-Mobile-Phone

Hangzhou Dialect Speech Dataset

Size: 3.91 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/1030-Hours-Shanghai-Dialect-Speech-Data-by-Mobile-Phone

Shanghai Dialect Speech Dataset

Size: 5.86 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/800-Hours-Sichuan-Dialect-Conversational-Speech-Data-by-Mobile-Phone

The dataset of Sichuan dialect conversational speech

Size: 612 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 3 - Forks: 0

Nexdata-AI/20-Hours-Chinese-Mandarin-Synthesis-Corpus-Female-Customer-Service-Conversational-Speech

Chinese Mandarin Synthesis Corpus-Female/Customer Service

Size: 1.56 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/607-Hours-Cantonese-Conversational-Speech-Data-by-Mobile-Phone-and-Voice-Recorder

Cantonese Conversational Speech Dataset

Size: 235 KB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 1 - Forks: 0

Nexdata-AI/11010-People-Chinese-Digital-Speech-Data-by-Mobile-Phone

The dataset of Chinese Digital speech

Size: 3.24 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 2 - Forks: 0

ivankunyankin/quartznet-asr

Language: Python - Size: 2.88 MB - Last synced: 8 months ago - Pushed: almost 2 years ago - Stars: 15 - Forks: 4

mushrafi88/asr_bangla

Automatic Speech Recognition system using Wav2Vec-XLSR for Bangla

Language: Jupyter Notebook - Size: 174 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

graphcore/whisper-ai

Speech Recognition (ASR) on Graphcore IPUs using OpenAI's Whisper

Language: Jupyter Notebook - Size: 525 KB - Last synced: 6 months ago - Pushed: 8 months ago - Stars: 2 - Forks: 0

racai-ai/RobinASR

Romanian Automatic Speech Recognition from the ROBIN project

Language: Python - Size: 204 KB - Last synced: 8 months ago - Pushed: over 2 years ago - Stars: 17 - Forks: 8

ivallesp/montreal-docker

Language: Dockerfile - Size: 301 KB - Last synced: 10 months ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0

amitchone/ASR

A Python 2.7 implementation of Mel Frequency Cepstral Coefficients (MFCC) and Dynamic Time Warping (DTW) algorithms for Automated Speech Recognition (ASR).

Language: Python - Size: 13.6 MB - Last synced: 10 months ago - Pushed: about 6 years ago - Stars: 13 - Forks: 4

Anwarvic/Web-Interface-for-NVIDIA-NeMo

This repository contains an attempt to utilize the NeMo toolkit created by NVIDIA

Language: Python - Size: 23.4 KB - Last synced: about 1 month ago - Pushed: over 4 years ago - Stars: 7 - Forks: 1

brianlan/automatic-speech-recognition

Automatic Speech Recognition using Tensorflow

Language: Python - Size: 114 KB - Last synced: 3 months ago - Pushed: almost 7 years ago - Stars: 46 - Forks: 16

anicolson/matlab_feat

Functions for creating speech features in MATLAB.

Language: MATLAB - Size: 39.1 KB - Last synced: 10 months ago - Pushed: almost 4 years ago - Stars: 12 - Forks: 5

dangvansam/nvidia-nemo-jasper-quartznet-asr-vietnamese

Vietnamese speech recognition using the QuartzNet model (NVIDIA) + Flask demo

Language: Python - Size: 925 MB - Last synced: 10 months ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

dangvansam/viet-asr

VietASR - Vietnamese Automatic Speech Recognition

Language: Python - Size: 289 MB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 65 - Forks: 35

mdhasanai/Bangla_E2E_ASR

Bangla Automatic Speech Recognition

Language: Python - Size: 191 MB - Last synced: 10 months ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0

gawy/aind2-asr-speech-recognition

ASR with Deep Neural Networks

Language: Jupyter Notebook - Size: 16.6 MB - Last synced: 10 months ago - Pushed: about 4 years ago - Stars: 0 - Forks: 0

layman-n-ish/Dialect-Classification

Dialect Classification using techniques of Signal Processing and Machine Learning.

Language: Jupyter Notebook - Size: 636 MB - Last synced: 10 months ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

PanosAntoniadis/personalized_asr

[MTAP] Official implementation: A mechanism for personalized Automatic Speech Recognition for less frequently spoken languages: the Greek case

Language: Python - Size: 1010 KB - Last synced: 10 months ago - Pushed: over 1 year ago - Stars: 8 - Forks: 0

ynop/spych 📦

Scripts/Tools used for working with automatic speech recognition.

Language: Python - Size: 3.63 MB - Last synced: 10 months ago - Pushed: about 6 years ago - Stars: 2 - Forks: 2

th-koeln-intia/ip-sprachassistent-team4

The voice assistant Sherlock is a project to create a proof of concept for an offline, open source voice assistant.

Language: Shell - Size: 67.4 MB - Last synced: 10 months ago - Pushed: over 1 year ago - Stars: 6 - Forks: 3

Rumeysakeskin/Turkish-Speech-to-Text

Fine-tuning for automatic speech recognition on low-resource languages with character-based CTC model

Language: Jupyter Notebook - Size: 48.4 MB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 13 - Forks: 1

oleges1/quartznet-pytorch

QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261]

Language: Jupyter Notebook - Size: 116 KB - Last synced: 10 months ago - Pushed: almost 3 years ago - Stars: 24 - Forks: 7

ttaoREtw/Hidden-Markov-Model-for-Toy-Dataset

Homework for the Digital Speech Processing course at National Taiwan University.

Language: C++ - Size: 782 KB - Last synced: 11 months ago - Pushed: almost 6 years ago - Stars: 1 - Forks: 2

G1ya777/Mmeslay

An Automatic Speech Recognition System for the Kabyle language.

Language: Python - Size: 156 MB - Last synced: 11 months ago - Pushed: 12 months ago - Stars: 0 - Forks: 0

alefiury/SE-R_2022_Challenge_Wav2vec2

Code for the paper "Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge"

Language: Python - Size: 49.8 KB - Last synced: 16 days ago - Pushed: over 1 year ago - Stars: 3 - Forks: 1

OpenVoiceOS/ovos-stt-plugin-vosk

Vosk STT plugin for Mycroft

Language: Python - Size: 60.5 KB - Last synced: 4 days ago - Pushed: 5 months ago - Stars: 14 - Forks: 6

iammartian0/Audio_Tasks

Different Task Guides for Audio Data

Language: Jupyter Notebook - Size: 3.91 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 1 - Forks: 0

diaoenmao/Semi-Supervised-Federated-Learing-for-Keyword-Spotting

[ICME 2023] Semi-Supervised Federated Learning for Keyword Spotting

Language: Python - Size: 49.6 MB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 0 - Forks: 1

MingLunHan/CIF-PyTorch

[ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition (A PyTorch implementation of Continuous Integrate-and-Fire mechanism).

Language: Python - Size: 60.5 KB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 36 - Forks: 5

jonatasgrosman/asrecognition

ASRecognition: just an easy-to-use library for Automatic Speech Recognition.

Language: Python - Size: 106 KB - Last synced: 6 days ago - Pushed: over 1 year ago - Stars: 51 - Forks: 6

iammartian0/Audio101

Hugging Face Audio coursework

Language: Jupyter Notebook - Size: 13.7 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

zmeet-ai/asr_demo

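Speech recognition API offering real-time recognition and offline recognition of uploaded long audio; supports real-time transcription and simultaneous interpretation for Chinese, English, and the languages of up to 100 countries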

Language: Java - Size: 23.1 MB - Last synced: 11 months ago - Pushed: 11 months ago - Stars: 13 - Forks: 1

j3soon/speech-to-windows-input

Perform speech-to-text (STT/ASR) with Azure speech service and simulate keyboard to input the recognized text; Supports English, Chinese, Japanese, and more

Language: C# - Size: 813 KB - Last synced: 10 months ago - Pushed: 10 months ago - Stars: 19 - Forks: 2

aliyzd95/mShEMO

A modification on the Sharif Emotional Speech Database

Language: Jupyter Notebook - Size: 1.9 MB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 4 - Forks: 1

loretoparisi/wave2vec-recognize-docker

Wav2vec 2.0 Recognize pipeline

Language: Python - Size: 33.2 KB - Last synced: 8 months ago - Pushed: over 3 years ago - Stars: 33 - Forks: 10

astrologos/py-speakeasy

Speakeasy GPT is a Jupyter notebook that utilizes several natural language processing utilities to provide a seamless and low-latency speech interface to ChatGPT and other large language models.

Language: Jupyter Notebook - Size: 1.09 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

anshulgupta0803/ASSR

ASSR: Automatic Stuttered Speech Recognition

Language: Jupyter Notebook - Size: 33.2 MB - Last synced: 6 months ago - Pushed: about 6 years ago - Stars: 5 - Forks: 1

khaykingleb/Deep-Learning-for-Audio

PyTorch implementation of speech analysis and synthesis models with presentation of experiments

Language: Python - Size: 1.53 MB - Last synced: 12 months ago - Pushed: over 1 year ago - Stars: 3 - Forks: 0

fauxneticien/PTL2-DS2ish

A toy repository for using PyTorch Lightning 2.x to train an adapted DeepSpeech 2 model

Language: Python - Size: 26.4 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

ParthMehta15/Automatic-Speech-Recognition

Language: Jupyter Notebook - Size: 553 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

SanchezCris/SDR-Automatic-Speech-Recognition

FM signal capturing system and voice recognition for the assistance of individuals with hearing impairments.

Language: Python - Size: 48 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

Related Keywords
automatic-speech-recognition (276), speech-recognition (110), asr (105), speech-to-text (98), deep-learning (68), audio (37), machine-learning (36), whisper (34), speech (31), dataset (29), python (29), pytorch (27), stt (24), voice-recognition (20), speech-synthesis (19), asr-model (17), tts (17), tensorflow (16), deep-neural-networks (16), text-to-speech (15), openai (15), transcription (13), kaldi (12), wav2vec2 (12), speech-processing (12), natural-language-processing (12), huggingface (11), wav (10), audio-processing (10), transformer (8), kaldi-asr (8), neural-network (8), ctc (7), nlp (7), librispeech (7), whisper-ai (7), transformers (7), translation (7), huggingface-transformers (7), attention-mechanism (6), keras (6), docker (6), language-model (6), android (6), deepspeech (6), mfcc (5), ctc-loss (5), rnn (5), python3 (5), cnn (5), speech-enhancement (5), vosk (5), artificial-intelligence (5), neural-networks (5), youtube (5), fine-tuning (5), jasper (5), voice (4), end-to-end (4), speaker-recognition (4), wer (4), word-error-rate (4), quartznet (4), ai (4), subtitles (4), conversational-ai (4), deepspeech2 (4), timit-dataset (4), openai-whisper (4), conformer (3), vietnamese-nlp (3), sentiment-analysis (3), recurrent-neural-networks (3), wav2letter (3), speech-emotion-recognition (3), pytorch-lightning (3), commbase (3), deep-speech (3), lstm (3), kenlm (3), engine (3), chinese-speech-recognition (3), pocketsphinx (3), tflite (3), subtitles-generator (3), end-to-end-learning (3), synthetic-data (3), room-impulse-response (3), tensorflow2 (3), generative-adversarial-network (3), speech-translation (3), streamlit (3), voice-assistant (3), signal-processing (3), lip-reading (3), rnn-transducer (3), speech-recognizer (3), low-resource-languages (3), music-information-retrieval (3), timit (3)