Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: speech

Repositories

OvidijusParsiunas/deep-chat

Fully customizable AI chatbot component for your website

Language: TypeScript - Size: 67.2 MB - Last synced: about 2 hours ago - Pushed: 1 day ago - Stars: 1,137 - Forks: 146

Rikorose/DeepFilterNet

Noise suppression using deep filtering

Language: Python - Size: 171 MB - Last synced: 8 minutes ago - Pushed: 1 day ago - Stars: 1,974 - Forks: 182

santi-pdp/segan

Speech Enhancement Generative Adversarial Network in TensorFlow

Language: Python - Size: 771 KB - Last synced: about 7 hours ago - Pushed: about 1 year ago - Stars: 793 - Forks: 279

huggingface/datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Language: Python - Size: 84.2 MB - Last synced: about 8 hours ago - Pushed: about 10 hours ago - Stars: 18,516 - Forks: 2,530

LitoMore/mac-say

The macOS built-in `say` interface for JavaScript

Language: TypeScript - Size: 26.4 KB - Last synced: about 5 hours ago - Pushed: about 12 hours ago - Stars: 2 - Forks: 0

HumeAI/hume-python-sdk

Python client for Hume AI APIs

Language: Python - Size: 2.91 MB - Last synced: 9 days ago - Pushed: 10 days ago - Stars: 55 - Forks: 12

daniilrobnikov/vits2

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Language: Jupyter Notebook - Size: 7.35 MB - Last synced: about 7 hours ago - Pushed: 8 months ago - Stars: 385 - Forks: 42

bhk3824/Whisper-Fine-Tuning-For-Pronunciation-Learning

Fine Tuning of Whisper Speech To Text Base Model For Pronunciation Learning

Language: Jupyter Notebook - Size: 207 KB - Last synced: about 21 hours ago - Pushed: 23 days ago - Stars: 1 - Forks: 0

modelscope/modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

Language: Python - Size: 52.5 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 6,146 - Forks: 649

IAHispano/Applio

VITS-based Voice Conversion focused on simplicity, quality and performance.

Language: Python - Size: 25 MB - Last synced: about 14 hours ago - Pushed: 1 day ago - Stars: 1,083 - Forks: 179

TranscribeJs/transcribe.js

Monorepo for Transcribe.js

Language: JavaScript - Size: 31.4 MB - Last synced: about 18 hours ago - Pushed: 1 day ago - Stars: 2 - Forks: 0

lhotse-speech/lhotse

Tools for handling speech data in machine learning projects.

Language: Python - Size: 30.9 MB - Last synced: about 16 hours ago - Pushed: 1 day ago - Stars: 868 - Forks: 202

IDEA-Research/Grounded-Segment-Anything

Grounded-SAM: Marrying Grounding-DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook - Size: 120 MB - Last synced: 1 day ago - Pushed: 29 days ago - Stars: 13,652 - Forks: 1,251

netease-youdao/EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language: Python - Size: 3.73 MB - Last synced: 1 day ago - Pushed: 3 months ago - Stars: 6,386 - Forks: 536

svc-develop-team/so-vits-svc 📦

SoftVC VITS Singing Voice Conversion

Language: Python - Size: 10.6 MB - Last synced: 1 day ago - Pushed: 6 months ago - Stars: 24,193 - Forks: 4,594

verbio-technologies/python-verbio-speech-center

Python integration with the Verbio Speech Center Cloud. https://speechcenter.verbio.com/

Language: Python - Size: 31.6 MB - Last synced: about 2 hours ago - Pushed: 1 day ago - Stars: 5 - Forks: 1

mozilla/TTS

:robot: :speech_balloon: Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Language: Jupyter Notebook - Size: 120 MB - Last synced: 1 day ago - Pushed: 6 months ago - Stars: 8,867 - Forks: 1,206

balisujohn/tortoise.cpp

A ggml (C++) re-implementation of tortoise-tts. Under construction and seeking contributors.

Language: C++ - Size: 41.4 MB - Last synced: about 20 hours ago - Pushed: 1 day ago - Stars: 68 - Forks: 3

Audio-WestlakeU/FN-SSL

The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization

Language: Python - Size: 160 KB - Last synced: about 21 hours ago - Pushed: 1 day ago - Stars: 63 - Forks: 6

echogarden-project/echogarden

Integrated speech toolset designed to be accessible to end-users. Fully open-source.

Language: TypeScript - Size: 1.46 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 75 - Forks: 9

pszemraj/vid2cleantxt

Python API & command-line tool to easily transcribe speech-based video files into clean text

Language: Jupyter Notebook - Size: 723 MB - Last synced: about 8 hours ago - Pushed: over 1 year ago - Stars: 159 - Forks: 24

MiteshPuthran/Speech-Emotion-Analyzer

The neural network model is capable of detecting five different male/female emotions from speech audio. (Deep Learning, NLP, Python)

Language: Jupyter Notebook - Size: 4.85 MB - Last synced: about 21 hours ago - Pushed: over 1 year ago - Stars: 1,242 - Forks: 425

coqui-ai/TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python - Size: 162 MB - Last synced: 1 day ago - Pushed: 6 days ago - Stars: 29,804 - Forks: 3,528

qingsongedu/awesome-AI-tutorials-surveys

A professional list of Tutorials and Surveys on DL, ML, DM, CV, NLP, Speech in top AI conferences and journals.

Size: 121 KB - Last synced: 1 day ago - Pushed: over 1 year ago - Stars: 110 - Forks: 13

sensein/senselab

PipePal is a Python package that simplifies building pipelines for speech and voice analysis.

Language: Python - Size: 11.6 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 0 - Forks: 1

AIGC-Audio/AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Language: Python - Size: 23 MB - Last synced: 1 day ago - Pushed: about 1 month ago - Stars: 9,796 - Forks: 833

feldberlin/timething

Timething is a library for aligning text transcripts with their audio recordings.

Language: Jupyter Notebook - Size: 29.8 MB - Last synced: about 7 hours ago - Pushed: 6 months ago - Stars: 76 - Forks: 6

KevKibe/African-Whisper

🚀 Seamlessly fine-tune and deploy Whisper model on a multi-lingual dataset.

Language: Python - Size: 59.3 MB - Last synced: 4 days ago - Pushed: 7 days ago - Stars: 11 - Forks: 2

SWHL/AI-Competition-Collections

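A collection of AI competition experience posts and training and testing tips (gathering and organizing experience posts from various AI competitions)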

Language: Python - Size: 15.7 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 263 - Forks: 33

MuSAELab/Multimodal-dataset-catalog

This repository lists publicly available datasets for visual-audio, speech and audio, and biomedical signal related tasks.

Size: 78.1 KB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 1 - Forks: 1

weirongxu/auditory-reader

:book: A speech reader that supports EPUB, URL, and text.

Language: TypeScript - Size: 24 MB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 3 - Forks: 0

DigitalPhonetics/IMS-Toucan

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Language: Python - Size: 14.4 MB - Last synced: 2 days ago - Pushed: 2 days ago - Stars: 455 - Forks: 82

marcogdepinto/emotion-classification-from-audio-files

Understanding emotions from audio files using neural networks and multiple datasets.

Language: Python - Size: 646 MB - Last synced: 3 days ago - Pushed: 11 months ago - Stars: 396 - Forks: 130

Mohamad-Hussein/speech-assistant

Desktop application for Linux and Windows that uses distil-whisper models from HuggingFace to enable real-time offline speech-to-text dictation.

Language: Python - Size: 445 KB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 36 - Forks: 1

interactiveaudiolab/ppgs

High-Fidelity Neural Phonetic Posteriorgrams

Language: Python - Size: 98.5 MB - Last synced: about 23 hours ago - Pushed: 2 days ago - Stars: 55 - Forks: 4

jim60105/docker-whisperX

Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker Diarization (Dockerfile, CI image build and test)

Language: Dockerfile - Size: 314 KB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 83 - Forks: 11

pndurette/gTTS

Python library and CLI tool to interface with Google Translate's text-to-speech API

Language: Python - Size: 517 KB - Last synced: 2 days ago - Pushed: 11 days ago - Stars: 2,151 - Forks: 348

hanayik/Discourse-Assessments

Various speech assessments commonly used during post stroke aphasia research

Language: JavaScript - Size: 47.7 MB - Last synced: 5 days ago - Pushed: over 5 years ago - Stars: 3 - Forks: 1

sergiozc/signal-processing-scripts

Some simulation macros related to signal processing

Language: MATLAB - Size: 24.5 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 1 - Forks: 0

maxrmorrison/pyfoal

Python forced alignment

Language: Jupyter Notebook - Size: 7.16 MB - Last synced: 4 days ago - Pushed: about 1 month ago - Stars: 58 - Forks: 4

maxrmorrison/pypar

Phoneme alignment representation compatible with multiple forced aligners

Language: Python - Size: 446 KB - Last synced: 4 days ago - Pushed: about 1 month ago - Stars: 15 - Forks: 1

khanhuitse05/speech-and-text-unity-ios-android

Speech to text in Unity on iOS using native Speech Recognition

Language: C# - Size: 76.5 MB - Last synced: 4 days ago - Pushed: 3 months ago - Stars: 272 - Forks: 124

Giooorgiooo/TikTok-Voice-TTS

Simple Python script to interact with the TikTok TTS Voices.

Language: Python - Size: 1.7 MB - Last synced: 6 days ago - Pushed: 6 days ago - Stars: 24 - Forks: 4

macairececile/picto_grammar

Code from the paper "A Multimodal French Corpus of Aligned Speech, Text, and Pictogram Sequences for Speech-to-Pictogram Machine Translation" (LREC-Coling 2024)

Language: Python - Size: 28.3 MB - Last synced: 5 days ago - Pushed: 6 days ago - Stars: 0 - Forks: 0

Audio-WestlakeU/FullSubNet

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

Language: Python - Size: 892 KB - Last synced: about 7 hours ago - Pushed: 9 months ago - Stars: 508 - Forks: 148

DengBoCong/nlp-paper

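Papers in the field of natural language processing (with reading notes), model reproductions, data processing, and more (code available in both TensorFlow and PyTorch versions)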

Language: Python - Size: 48.4 MB - Last synced: 5 days ago - Pushed: 4 months ago - Stars: 1,127 - Forks: 184

interactiveaudiolab/emphases

Crowdsourced and Automatic Speech Prominence Estimation

Language: Python - Size: 23.1 MB - Last synced: 4 days ago - Pushed: about 1 month ago - Stars: 8 - Forks: 1

mobilepadawan/Speakit-JS

Elevate your web applications with the power of JavaScript speech synthesis.

Language: JavaScript - Size: 25.5 MB - Last synced: 7 days ago - Pushed: 7 days ago - Stars: 2 - Forks: 0

George0828Zhang/torch_cif

A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/abs/1905.11235.

Language: Python - Size: 167 KB - Last synced: 6 days ago - Pushed: 3 months ago - Stars: 29 - Forks: 3

nipponjo/tts-arabic-pytorch

TTS models for Arabic (Tacotron2, FastPitch)

Language: Jupyter Notebook - Size: 3.12 MB - Last synced: 7 days ago - Pushed: 7 days ago - Stars: 55 - Forks: 14

Azure-Samples/SpeechToText-WebSockets-Javascript 📦

SDK & sample for speech recognition using WebSockets in JavaScript

Language: TypeScript - Size: 878 KB - Last synced: 5 days ago - Pushed: about 5 years ago - Stars: 213 - Forks: 188

fakerybakery/utmos

A toolkit to calculate speech audio quality. Not affiliated with the original authors

Language: Python - Size: 49.8 KB - Last synced: 8 days ago - Pushed: 9 days ago - Stars: 9 - Forks: 2

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Language: Jupyter Notebook - Size: 109 KB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 2,072 - Forks: 213

google/speedy

Speedy Non-linear Speech Speedup Algorithm

Language: C++ - Size: 24.9 MB - Last synced: 8 days ago - Pushed: 7 months ago - Stars: 24 - Forks: 5

googleapis/nodejs-speech 📦

This repository is deprecated. All of its content and history has been moved to googleapis/google-cloud-node.

Size: 11.4 MB - Last synced: 8 days ago - Pushed: 10 months ago - Stars: 686 - Forks: 290

csteinmetz1/ai-audio-startups

Community list of startups working with AI in audio and music technology

Size: 204 KB - Last synced: 8 days ago - Pushed: 2 months ago - Stars: 1,454 - Forks: 123

yeyupiaoling/MASR

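A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with multiple data augmentation methods.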

Language: Python - Size: 6.25 MB - Last synced: 8 days ago - Pushed: 8 days ago - Stars: 534 - Forks: 97

YoavRamon/awesome-kaldi

This is a list of features, scripts, blogs and resources for better using Kaldi (http://kaldi-asr.org/)

Size: 18.6 KB - Last synced: 3 days ago - Pushed: over 2 years ago - Stars: 531 - Forks: 85

BakerBunker/SALT

[ASRU 2023] Code of paper SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation

Language: Python - Size: 19.6 MB - Last synced: 8 days ago - Pushed: 9 days ago - Stars: 13 - Forks: 1

bupt-ai-club/awesomeProject

A record of interesting AI-related projects

Size: 108 MB - Last synced: 3 days ago - Pushed: about 1 month ago - Stars: 6 - Forks: 2

Sreyan88/MMER

Code for the InterSpeech 2023 paper: MMER: Multimodal Multi-task learning for Speech Emotion Recognition

Language: Python - Size: 1.59 MB - Last synced: 3 days ago - Pushed: 2 months ago - Stars: 54 - Forks: 14

nheidloff/unity-watson-ar-sample

Augmented Reality Sample using IBM Watson, Unity and Vuforia

Language: C# - Size: 2.12 MB - Last synced: 9 days ago - Pushed: almost 6 years ago - Stars: 4 - Forks: 6

CUNY-CL/wikipron

Massively multilingual pronunciation mining

Language: Python - Size: 163 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 289 - Forks: 66

metavoiceio/metavoice-src

Foundational model for human-like, expressive TTS

Language: Python - Size: 19.8 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 3,066 - Forks: 424

avinashkranjan/Amazing-Python-Scripts

🚀 Curated collection of Amazing Python scripts from Basic to Advanced with automation task scripts.

Language: Jupyter Notebook - Size: 946 MB - Last synced: 9 days ago - Pushed: 10 days ago - Stars: 2,018 - Forks: 878

ddlBoJack/Speech-Resources

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome

Size: 4.83 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 418 - Forks: 56

Subrata2402/textspeaker

This is a simple text-to-speech program that reads text from a file or input text and speaks it out loud.

Language: JavaScript - Size: 62.5 KB - Last synced: 6 days ago - Pushed: 11 days ago - Stars: 1 - Forks: 0

wotschofsky/discord-live-translator

Voice Translation Bot for Discord

Language: TypeScript - Size: 812 KB - Last synced: 9 days ago - Pushed: 10 days ago - Stars: 61 - Forks: 13

ina-foss/inaSpeechSegmenter

CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Language: Python - Size: 34.8 MB - Last synced: 2 days ago - Pushed: 2 months ago - Stars: 696 - Forks: 125

openspeech-team/openspeech

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

Language: Python - Size: 7.49 MB - Last synced: 9 days ago - Pushed: 7 months ago - Stars: 655 - Forks: 111

SahilAggarwal2004/react-text-to-speech

An easy-to-use React component for the Web Speech API.

Language: TypeScript - Size: 267 KB - Last synced: 10 days ago - Pushed: 11 days ago - Stars: 8 - Forks: 3

tensorflow/lingvo

Lingvo

Language: Python - Size: 142 MB - Last synced: 7 days ago - Pushed: 10 days ago - Stars: 2,777 - Forks: 434

google/voice-builder

An open-source text-to-speech (TTS) voice building tool

Language: JavaScript - Size: 1.35 MB - Last synced: 8 days ago - Pushed: about 2 months ago - Stars: 630 - Forks: 132

lucoiso/UEAzSpeech

This plugin integrates Azure Speech Cognitive Services in Unreal Engine.

Language: C++ - Size: 161 MB - Last synced: 11 days ago - Pushed: 11 days ago - Stars: 172 - Forks: 39

shhossain/BanglaSpeech2Text

BanglaSpeech2Text: An open-source offline speech-to-text package for the Bangla language. Fine-tuned on the latest Whisper speech-to-text model for optimal performance.

Language: Python - Size: 1.83 MB - Last synced: 6 days ago - Pushed: 5 months ago - Stars: 51 - Forks: 10