An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: speech-synthesis

Swap98-Coder/mlx-audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

Size: 1.95 KB - Last synced at: about 8 hours ago - Pushed at: about 9 hours ago - Stars: 0 - Forks: 0

NevilPatel01/RVC-WebUI-MacOS

Optimized Retrieval-based Voice Conversion WebUI for Apple Silicon Macs (M1/M2/M3). Real-time, high-quality voice conversion with an easy web interface. All models included!

Language: Python - Size: 981 KB - Last synced at: about 12 hours ago - Pushed at: about 13 hours ago - Stars: 1 - Forks: 0

netease-youdao/EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language: Python - Size: 3.67 MB - Last synced at: about 19 hours ago - Pushed at: 9 months ago - Stars: 7,953 - Forks: 685

abus-aikorea/voice-pro

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

Language: Python - Size: 78 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 3,648 - Forks: 271

Blaizzy/mlx-audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

Language: Python - Size: 4.07 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,063 - Forks: 80

EveryVoiceTTS/EveryVoice

The EveryVoice TTS Toolkit - Text To Speech for your language

Language: Python - Size: 9.84 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 33 - Forks: 2

MahtaFetrat/ManaTTS-Persian-Speech-Dataset

ManaTTS is the largest open Persian speech dataset with 100+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.

Language: Jupyter Notebook - Size: 16.4 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 25 - Forks: 1

dragonhub0710/image-to-speech

A Python project for converting an image into audible sound using OCR and speech synthesis

Language: Python - Size: 160 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

huggingface/speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

Language: Python - Size: 299 KB - Last synced at: 2 days ago - Pushed at: 25 days ago - Stars: 4,011 - Forks: 441

DiffAPF/torchlpc

Fast and differentiable time domain all-pole filter in PyTorch.

Language: Python - Size: 69.3 KB - Last synced at: about 12 hours ago - Pushed at: 5 days ago - Stars: 61 - Forks: 4

espeak-ng/espeak-ng

eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

Language: C - Size: 72.7 MB - Last synced at: 2 days ago - Pushed at: 27 days ago - Stars: 5,032 - Forks: 1,009

AlekPet/ComfyUI_Custom_Nodes_AlekPet

Custom nodes that extend the capabilities of ComfyUI

Language: JavaScript - Size: 11.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,183 - Forks: 74

stakira/OpenUtau

Open singing synthesis platform / Open source UTAU successor

Language: C# - Size: 77.4 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2,702 - Forks: 347

microsoft/SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Language: Python - Size: 17.8 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 1,347 - Forks: 126

devnen/Dia-TTS-Server

Self-host the powerful Dia TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), support for SafeTensors/BF16, voice cloning, dialogue generation, and GPU/CPU execution.

Language: Python - Size: 31.2 MB - Last synced at: 3 days ago - Pushed at: 6 days ago - Stars: 147 - Forks: 27

rhasspy/piper

A fast, local neural text to speech system

Language: C++ - Size: 208 MB - Last synced at: 4 days ago - Pushed at: 2 months ago - Stars: 8,826 - Forks: 681

ManimCommunity/manim-voiceover

Manim plugin for all things voiceover

Language: Python - Size: 879 KB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 217 - Forks: 46

leon-ai/leon

🧠 Leon is your open-source personal assistant.

Language: TypeScript - Size: 21.3 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 16,220 - Forks: 1,347

thorstenMueller/Thorsten-Voice

Thorsten-Voice: A free-to-use, offline, high-quality German TTS voice that should be available for every project without any licensing struggles.

Language: Python - Size: 16.6 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 610 - Forks: 53

amirivojdan/shekar

Simplifying Persian NLP for Everyone

Language: Python - Size: 467 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 3 - Forks: 1

rany2/edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Language: Python - Size: 2.08 MB - Last synced at: 4 days ago - Pushed at: 7 days ago - Stars: 8,126 - Forks: 771

NVIDIA/NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python - Size: 435 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 13,794 - Forks: 2,806

coqui-ai/TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python - Size: 162 MB - Last synced at: 5 days ago - Pushed at: 9 months ago - Stars: 39,795 - Forks: 5,066

crispinprojects/talkcalendar

Talk Calendar is a personal desktop calendar for Linux which has some speech capability.

Language: C - Size: 136 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0

baxtree/wiki2ssml

Wiki2SSML provides the WikiVoice markup language used for fine-tuning synthesised voice.

Language: JavaScript - Size: 396 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 35 - Forks: 0

RHVoice/RHVoice

a free and open source speech synthesizer for Russian and other languages

Language: C++ - Size: 14.3 MB - Last synced at: 4 days ago - Pushed at: 18 days ago - Stars: 1,639 - Forks: 242

WhisperSpeech/WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

Language: Jupyter Notebook - Size: 38 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 4,233 - Forks: 235

Erangamadhushan/EM956-Community-Assistant

EM956 Community Assistant for the EM956 Community Support web portal

Language: JavaScript - Size: 5.86 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

voicepaw/so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

Language: Python - Size: 20.2 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 8,992 - Forks: 1,197

Gmzxdotzz/Dia-TTS-Server

Self-host the powerful Dia TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), support for SafeTensors/BF16, voice cloning, dialogue generation, and GPU/CPU execution.

Language: Python - Size: 572 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0

kakaobrain/pororo 📦

PORORO: Platform Of neuRal mOdels for natuRal language prOcessing

Language: Python - Size: 12.8 MB - Last synced at: 2 days ago - Pushed at: about 3 years ago - Stars: 1,296 - Forks: 223

Lyrcaxis/KokoroSharp

Fast local TTS inference engine in C# with ONNX runtime. Multi-speaker, multi-platform and multilingual. Integrate on your .NET projects using a plug-and-play NuGet package, complete with all voices.

Language: C# - Size: 159 KB - Last synced at: 5 days ago - Pushed at: about 2 months ago - Stars: 109 - Forks: 5

echogarden-project/echogarden

Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.

Language: TypeScript - Size: 1.72 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 359 - Forks: 39

zzw922cn/awesome-speech-recognition-speech-synthesis-papers

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

Size: 197 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 3,034 - Forks: 510

KoljaB/RealtimeTTS

Converts text to speech in realtime

Language: Python - Size: 68.1 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2,948 - Forks: 291

mmorise/World

A high-quality speech analysis, manipulation and synthesis system

Language: C++ - Size: 878 KB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 1,229 - Forks: 257

OpenVoiceOS/ovos-tts-plugin-cotovia

Galician TTS plugin for OVOS

Language: Python - Size: 1.51 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 1

denizsafak/abogen

Generate audiobooks from EPUBs, PDFs and text with synchronized captions.

Language: Python - Size: 1.64 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 183 - Forks: 5

FolhaSP/mosaico

🎬 Open-source programmatic video composition framework with AI capabilities for Python

Language: Python - Size: 1.86 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 13 - Forks: 0

slp-rl/aero

This repo contains the official PyTorch implementation of "Audio Super Resolution in the Spectral Domain" (ICASSP 2023)

Language: Python - Size: 159 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 221 - Forks: 31

scruss/micropython-SYN6988

MicroPython library for the VoiceTX SYN6988 text to speech module

Language: Python - Size: 1.22 MB - Last synced at: 7 days ago - Pushed at: almost 2 years ago - Stars: 15 - Forks: 1

andresayac/edge-tts

Edge TTS is a Node or Bun package that allows access to the online text-to-speech service used by Microsoft Edge without the need for Microsoft Edge, Windows, or an API key.

Language: TypeScript - Size: 42 KB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 49 - Forks: 1

Avatar-Home-Automation/A.V.A.T.A.R-Server

Agnostic Virtual Assistant for The Automated Residences

Language: JavaScript - Size: 22.5 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 2 - Forks: 0

athena-team/athena

An open-source implementation of a sequence-to-sequence based speech processing engine

Language: C++ - Size: 9.94 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 947 - Forks: 189

PaddlePaddle/PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Language: Python - Size: 69.4 MB - Last synced at: 9 days ago - Pushed at: 19 days ago - Stars: 11,845 - Forks: 1,904

espnet/espnet

End-to-End Speech Processing Toolkit

Language: Python - Size: 1.13 GB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 9,043 - Forks: 2,250

stefantaubert/mel-cepstral-distance

A Python library for computing the Mel-Cepstral Distance (Mel-Cepstral Distortion, MCD) between two inputs. This implementation is based on the method proposed by Robert F. Kubichek in "Mel-Cepstral Distance Measure for Objective Speech Quality Assessment".

Language: Python - Size: 59.8 MB - Last synced at: 2 days ago - Pushed at: 10 days ago - Stars: 53 - Forks: 10

taigrr/elevenlabs

ElevenLabs Artificial Voice Synthesis Client

Language: Go - Size: 105 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 45 - Forks: 14

DmitryRyumin/INTERSPEECH-2023-24-Papers

INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

Size: 11.4 MB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 667 - Forks: 42

snakers4/silero-models

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Language: Jupyter Notebook - Size: 488 KB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 5,253 - Forks: 336

NaomiProject/Naomi

The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!

Language: Python - Size: 5.27 MB - Last synced at: about 14 hours ago - Pushed at: 4 months ago - Stars: 274 - Forks: 60

CodersCreative/natural-tts

A Rust crate for easily implementing Text-To-Speech in your Rust programs.

Language: Rust - Size: 199 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 10 - Forks: 5

tensorflow/lingvo

Lingvo

Language: Python - Size: 142 MB - Last synced at: 2 days ago - Pushed at: 10 days ago - Stars: 2,838 - Forks: 451

mkiol/dsnote

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

Language: C++ - Size: 74.6 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 859 - Forks: 35

Azure-Samples/Cognitive-Speech-TTS

Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.

Language: C# - Size: 822 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 952 - Forks: 521

sidharthrajaram/StyleTTS2

🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning

Language: Python - Size: 131 MB - Last synced at: 7 days ago - Pushed at: 10 months ago - Stars: 159 - Forks: 36

guest271314/SpeechSynthesisRecorder

Get audio output from window.speechSynthesis.speak() call as ArrayBuffer, AudioBuffer, Blob, MediaSource, MediaStream, ReadableStream, other object or data types

Language: JavaScript - Size: 40 KB - Last synced at: 2 days ago - Pushed at: almost 7 years ago - Stars: 82 - Forks: 20

JackismyShephard/ultimate-rvc (fork of SociallyIneptWeeb/AICoverGen)

An app for creating audio-based content such as song covers and speech using Retrieval-based Voice Conversion.

Language: Python - Size: 7.35 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 81 - Forks: 23

rhasspy/piper-samples

Samples for Piper text to speech system

Language: Python - Size: 559 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 6 - Forks: 3

bshall/hifigan

A 16 kHz implementation of HiFi-GAN for soft-vc.

Language: Python - Size: 101 KB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 99 - Forks: 25

SocAIty/SpeechCraft (fork of suno-ai/bark)

🔊 Text2Speech, Voice-Cloning and Voice2Voice conversion with the text-prompted generative audio model bark

Language: Python - Size: 9.78 MB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 62 - Forks: 6

MoonInTheRiver/DiffSinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Language: Python - Size: 61.9 MB - Last synced at: 12 days ago - Pushed at: about 2 months ago - Stars: 4,475 - Forks: 739

spokestack/react-native-spokestack 📦

Spokestack: give your React Native app a voice interface!

Language: TypeScript - Size: 6.52 MB - Last synced at: 3 days ago - Pushed at: about 3 years ago - Stars: 61 - Forks: 13

nature-heart-software/izabela

Your speech assistant. Communicate with text-to-speech in games, on voice chat, on stream or simply on your speakers!

Language: Vue - Size: 133 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 38 - Forks: 6

Wikidepia/indonesian-tts

Indonesian TTS (text-to-speech) using Coqui TTS

Size: 11.7 KB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 73 - Forks: 8

janvarev/Irene-Voice-Assistant

Irina: a Russian voice assistant that works offline. Supports skills via plugins.

Language: Python - Size: 109 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 903 - Forks: 128

ssb22/gradint

Graduated Interval Recall program

Language: Python - Size: 59.4 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 20 - Forks: 4

1nnovat1on/universalTranslator

Translates from any language to any other in real time on a Windows machine, using speech recognition and text-to-speech

Language: Python - Size: 3.91 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

NVIDIA/DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

Language: Jupyter Notebook - Size: 104 MB - Last synced at: 16 days ago - Pushed at: 9 months ago - Stars: 14,182 - Forks: 3,328

open-mmlab/Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python - Size: 126 MB - Last synced at: 17 days ago - Pushed at: 28 days ago - Stars: 8,975 - Forks: 702

ZDisket/TensorVox

Desktop application for neural speech synthesis written in C++

Language: C++ - Size: 15.5 MB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 215 - Forks: 20

intelligentnode/IntelliNode

Access the latest AI models like ChatGPT, LLaMA, Deepseek, Diffusion, Hugging Face, and beyond through a unified prompt layer and performance evaluation

Language: JavaScript - Size: 10 MB - Last synced at: 4 days ago - Pushed at: 2 months ago - Stars: 252 - Forks: 16

lmnt-com/wavegrad

A fast, high-quality neural vocoder.

Language: Python - Size: 18.6 KB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 284 - Forks: 48

01Zhangbw/Awesome-Expressive-speech-synthesis

A summary of expressive speech synthesis papers. Last updated: 23 April.

Size: 8.79 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 5 - Forks: 0

IEvangelist/learning-blazor

The application for the "Learning Blazor: Build Single Page Apps with WebAssembly and C#" O'Reilly Media book by David Pine.

Language: C# - Size: 7.47 MB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 134 - Forks: 42

marytts/marytts

MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java

Language: Java - Size: 143 MB - Last synced at: 17 days ago - Pushed at: 4 months ago - Stars: 2,460 - Forks: 752

HadrienGardeur/web-speech-recommended-voices

A list of recommended voices for the Web Speech API

Size: 271 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 36 - Forks: 6

leaonline/easy-speech

🔊 Cross browser Speech Synthesis also known as Text to speech or TTS; no dependencies; uses Web Speech API

Language: JavaScript - Size: 1.12 MB - Last synced at: 6 days ago - Pushed at: 10 days ago - Stars: 229 - Forks: 24

01Zhangbw/Speech-and-audio-papers-Top-Conference

A collection of papers in the speech and audio field. Currently covers: ICLR2023-2025, ICML2023-2024, NeurIPS2023-2024, ACMMM2024, AAAI2024, ACL2024, EMNLP2024, NAACL2025, AAAI2025

Size: 285 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 48 - Forks: 1

double22a/speech_dataset

The dataset of Speech Recognition

Size: 74.2 KB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 413 - Forks: 77

jneilliii/OctoPrint-M117SpeechSynthesis

Language: Jinja - Size: 115 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 5 - Forks: 1

mediatechlab/tts-wrapper

TTS-Wrapper makes it easier to use text-to-speech APIs by providing a unified and easy-to-use interface.

Language: Python - Size: 546 KB - Last synced at: 10 days ago - Pushed at: 10 months ago - Stars: 21 - Forks: 9

arniery/andys-project

Final assignment for the Trinity SLP course "Speech Processing 2: Acoustic Modelling": cascade and parallel formant synthesis, with the end goal of producing vowels using both methods.

Language: Jupyter Notebook - Size: 664 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

Emotional-Text-to-Speech/dl-for-emo-tts

💻 🤖 A summary of our attempts at using Deep Learning approaches for Emotional Text to Speech 🔊

Language: Jupyter Notebook - Size: 5.26 MB - Last synced at: 11 days ago - Pushed at: 11 months ago - Stars: 447 - Forks: 44

lucasnewman/best-rq-pytorch

Implementation of BEST-RQ, a model for self-supervised learning of speech signals using a random projection quantizer, in PyTorch.

Language: Python - Size: 365 KB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 116 - Forks: 11

fabiolimace/espeak-ng-playground

A space for experimenting with and developing improvements to `espeak-ng`, focused on Brazilian Portuguese. Main repository: https://github.com/fabiolimace/espeak-ng/

Language: Awk - Size: 70.9 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

alexykn/TorchTS

A modern text to speech frontend for Kokoro-82M

Language: JavaScript - Size: 3.93 MB - Last synced at: about 16 hours ago - Pushed at: 20 days ago - Stars: 5 - Forks: 2

Edresson/YourTTS

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

Language: Jupyter Notebook - Size: 410 MB - Last synced at: 19 days ago - Pushed at: 6 months ago - Stars: 966 - Forks: 84

libdriver/syn6988

SYN6988 full-featured driver library for general MCU and Linux.

Language: C - Size: 3.94 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 7 - Forks: 5

ictnlp/SLED-TTS

Streamable Text-to-Speech model using a language modeling approach, without vector quantization

Language: Python - Size: 251 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 2 - Forks: 1

mikeroyal/NLP-Guide

Natural Language Processing (NLP). Covering topics such as Tokenization, Part Of Speech tagging (POS), Machine translation, Named Entity Recognition (NER), Classification, and Sentiment analysis.

Language: Python - Size: 315 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 86 - Forks: 15

Agash/TTSTextNormalization

Modern .NET 9 / C# 13 library to normalize text (emojis, currency, numbers, abbreviations, chat slang) for consistent and natural Text-to-Speech (TTS) synthesis, ideal for stream chat/donations.

Language: C# - Size: 138 KB - Last synced at: 1 day ago - Pushed at: 22 days ago - Stars: 1 - Forks: 0

KennethanCeyer/awesome-audio-speech

Awesome list of Audio, Speech, and DSP (Digital Signal Processing)

Size: 847 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 1

DigitalPhonetics/IMS-Toucan

Controllable and fast Text-to-Speech for over 7000 languages!

Language: Python - Size: 21.3 MB - Last synced at: 21 days ago - Pushed at: 6 months ago - Stars: 1,583 - Forks: 180

energypatrikhu/voice-to-text-gui

Voice to text app made in Electron

Language: TypeScript - Size: 101 MB - Last synced at: 22 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

stefantaubert/mean-opinion-score

Python library for calculating the mean opinion score and 95% confidence interval of the standard deviation of text-to-speech ratings according to Ribeiro et al. (2011).

Language: Python - Size: 77.1 KB - Last synced at: 13 days ago - Pushed at: 3 months ago - Stars: 23 - Forks: 1

stefantaubert/tacotron-cli

Command-line interface to train Tacotron 2 using .wav <=> .TextGrid pairs.

Language: Python - Size: 1.33 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 2

EasyAI-France/Audiobook-Simplifier

Audiobook Simplifier is a tool that creates audiobooks from text documents or eBooks using TTS (Text-to-Speech) technology.

Language: Python - Size: 145 KB - Last synced at: 22 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

milosgajdos/go-playht

PlayHT API client Go module

Language: Go - Size: 104 KB - Last synced at: 16 days ago - Pushed at: 23 days ago - Stars: 6 - Forks: 2

ghchinoy/fabulae

Create audio stories from PDFs or existing transcripts

Language: Go - Size: 699 KB - Last synced at: 6 days ago - Pushed at: 23 days ago - Stars: 1 - Forks: 0