GitHub topics: speech-synthesis
Swap98-Coder/mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Size: 1.95 KB - Last synced at: about 8 hours ago - Pushed at: about 9 hours ago - Stars: 0 - Forks: 0

NevilPatel01/RVC-WebUI-MacOS
Optimized Retrieval-based Voice Conversion WebUI for Apple Silicon Macs (M1/M2/M3). Real-time, high-quality voice conversion with an easy web interface. All models included!
Language: Python - Size: 981 KB - Last synced at: about 12 hours ago - Pushed at: about 13 hours ago - Stars: 1 - Forks: 0

netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Language: Python - Size: 3.67 MB - Last synced at: about 19 hours ago - Pushed at: 9 months ago - Stars: 7,953 - Forks: 685

abus-aikorea/voice-pro
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
Language: Python - Size: 78 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 3,648 - Forks: 271

Blaizzy/mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Language: Python - Size: 4.07 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,063 - Forks: 80

EveryVoiceTTS/EveryVoice
The EveryVoice TTS Toolkit - Text To Speech for your language
Language: Python - Size: 9.84 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 33 - Forks: 2

MahtaFetrat/ManaTTS-Persian-Speech-Dataset
ManaTTS is the largest open Persian speech dataset with 100+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.
Language: Jupyter Notebook - Size: 16.4 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 25 - Forks: 1

dragonhub0710/image-to-speech
A Python project for converting an image into audible speech using OCR and speech synthesis
Language: Python - Size: 160 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

huggingface/speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT4-o
Language: Python - Size: 299 KB - Last synced at: 2 days ago - Pushed at: 25 days ago - Stars: 4,011 - Forks: 441

DiffAPF/torchlpc
Fast and differentiable time domain all-pole filter in PyTorch.
Language: Python - Size: 69.3 KB - Last synced at: about 12 hours ago - Pushed at: 5 days ago - Stars: 61 - Forks: 4

espeak-ng/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Language: C - Size: 72.7 MB - Last synced at: 2 days ago - Pushed at: 27 days ago - Stars: 5,032 - Forks: 1,009

AlekPet/ComfyUI_Custom_Nodes_AlekPet
Custom nodes that extend the capabilities of ComfyUI
Language: JavaScript - Size: 11.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,183 - Forks: 74

stakira/OpenUtau
Open singing synthesis platform / Open source UTAU successor
Language: C# - Size: 77.4 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2,702 - Forks: 347

microsoft/SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
Language: Python - Size: 17.8 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 1,347 - Forks: 126

devnen/Dia-TTS-Server
Self-host the powerful Dia TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), support for SafeTensors/BF16, voice cloning, dialogue generation, and GPU/CPU execution.
Language: Python - Size: 31.2 MB - Last synced at: 3 days ago - Pushed at: 6 days ago - Stars: 147 - Forks: 27

rhasspy/piper
A fast, local neural text to speech system
Language: C++ - Size: 208 MB - Last synced at: 4 days ago - Pushed at: 2 months ago - Stars: 8,826 - Forks: 681

ManimCommunity/manim-voiceover
Manim plugin for all things voiceover
Language: Python - Size: 879 KB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 217 - Forks: 46

leon-ai/leon
🧠 Leon is your open-source personal assistant.
Language: TypeScript - Size: 21.3 MB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 16,220 - Forks: 1,347

thorstenMueller/Thorsten-Voice
Thorsten-Voice: A free-to-use, offline, high-quality German TTS voice should be available for every project without any licensing struggles.
Language: Python - Size: 16.6 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 610 - Forks: 53

amirivojdan/shekar
Simplifying Persian NLP for Everyone
Language: Python - Size: 467 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 3 - Forks: 1

rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
Language: Python - Size: 2.08 MB - Last synced at: 4 days ago - Pushed at: 7 days ago - Stars: 8,126 - Forks: 771

NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python - Size: 435 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 13,794 - Forks: 2,806

coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language: Python - Size: 162 MB - Last synced at: 5 days ago - Pushed at: 9 months ago - Stars: 39,795 - Forks: 5,066

crispinprojects/talkcalendar
Talk Calendar is a personal desktop calendar for Linux which has some speech capability.
Language: C - Size: 136 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0

baxtree/wiki2ssml
Wiki2SSML provides the WikiVoice markup language used for fine-tuning synthesised voice.
Language: JavaScript - Size: 396 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 35 - Forks: 0

RHVoice/RHVoice
a free and open source speech synthesizer for Russian and other languages
Language: C++ - Size: 14.3 MB - Last synced at: 4 days ago - Pushed at: 18 days ago - Stars: 1,639 - Forks: 242

WhisperSpeech/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
Language: Jupyter Notebook - Size: 38 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 4,233 - Forks: 235

Erangamadhushan/EM956-Community-Assistant
EM956 Community Assistant for the EM956 Community Support web portal
Language: JavaScript - Size: 5.86 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

voicepaw/so-vits-svc-fork
so-vits-svc fork with realtime support, improved interface and more features.
Language: Python - Size: 20.2 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 8,992 - Forks: 1,197

Gmzxdotzz/Dia-TTS-Server
Self-host the powerful Dia TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), support for SafeTensors/BF16, voice cloning, dialogue generation, and GPU/CPU execution.
Language: Python - Size: 572 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0

kakaobrain/pororo 📦
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
Language: Python - Size: 12.8 MB - Last synced at: 2 days ago - Pushed at: about 3 years ago - Stars: 1,296 - Forks: 223

Lyrcaxis/KokoroSharp
Fast local TTS inference engine in C# with ONNX runtime. Multi-speaker, multi-platform and multilingual. Integrate on your .NET projects using a plug-and-play NuGet package, complete with all voices.
Language: C# - Size: 159 KB - Last synced at: 5 days ago - Pushed at: about 2 months ago - Stars: 109 - Forks: 5

echogarden-project/echogarden
Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.
Language: TypeScript - Size: 1.72 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 359 - Forks: 39

zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
Size: 197 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 3,034 - Forks: 510

KoljaB/RealtimeTTS
Converts text to speech in realtime
Language: Python - Size: 68.1 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2,948 - Forks: 291

mmorise/World
A high-quality speech analysis, manipulation and synthesis system
Language: C++ - Size: 878 KB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 1,229 - Forks: 257

OpenVoiceOS/ovos-tts-plugin-cotovia
Galician TTS plugin for OVOS
Language: Python - Size: 1.51 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 1

denizsafak/abogen
Generate audiobooks from EPUBs, PDFs and text with synchronized captions.
Language: Python - Size: 1.64 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 183 - Forks: 5

FolhaSP/mosaico
🎬 Open-source programmatic video composition framework with AI capabilities for Python
Language: Python - Size: 1.86 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 13 - Forks: 0

slp-rl/aero
This repo contains the official PyTorch implementation of "Audio Super Resolution in the Spectral Domain" (ICASSP 2023)
Language: Python - Size: 159 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 221 - Forks: 31

scruss/micropython-SYN6988
MicroPython library for the VoiceTX SYN6988 text to speech module
Language: Python - Size: 1.22 MB - Last synced at: 7 days ago - Pushed at: almost 2 years ago - Stars: 15 - Forks: 1

andresayac/edge-tts
Edge TTS is a Node or Bun package that allows access to the online text-to-speech service used by Microsoft Edge without the need for Microsoft Edge, Windows, or an API key.
Language: TypeScript - Size: 42 KB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 49 - Forks: 1

Avatar-Home-Automation/A.V.A.T.A.R-Server
Agnostic Virtual Assistant for The Automated Residences
Language: JavaScript - Size: 22.5 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 2 - Forks: 0

athena-team/athena
An open-source implementation of a sequence-to-sequence based speech processing engine
Language: C++ - Size: 9.94 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 947 - Forks: 189

PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Language: Python - Size: 69.4 MB - Last synced at: 9 days ago - Pushed at: 19 days ago - Stars: 11,845 - Forks: 1,904

espnet/espnet
End-to-End Speech Processing Toolkit
Language: Python - Size: 1.13 GB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 9,043 - Forks: 2,250

stefantaubert/mel-cepstral-distance
A Python library for computing the Mel-Cepstral Distance (Mel-Cepstral Distortion, MCD) between two inputs. This implementation is based on the method proposed by Robert F. Kubichek in "Mel-Cepstral Distance Measure for Objective Speech Quality Assessment".
Language: Python - Size: 59.8 MB - Last synced at: 2 days ago - Pushed at: 10 days ago - Stars: 53 - Forks: 10

taigrr/elevenlabs
ElevenLabs Artificial Voice Synthesis Client
Language: Go - Size: 105 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 45 - Forks: 14

DmitryRyumin/INTERSPEECH-2023-24-Papers
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
Size: 11.4 MB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 667 - Forks: 42

snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Language: Jupyter Notebook - Size: 488 KB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 5,253 - Forks: 336

NaomiProject/Naomi
The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!
Language: Python - Size: 5.27 MB - Last synced at: about 14 hours ago - Pushed at: 4 months ago - Stars: 274 - Forks: 60

CodersCreative/natural-tts
A Rust crate for easily implementing text-to-speech in your Rust programs.
Language: Rust - Size: 199 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 10 - Forks: 5

tensorflow/lingvo
Lingvo
Language: Python - Size: 142 MB - Last synced at: 2 days ago - Pushed at: 10 days ago - Stars: 2,838 - Forks: 451

mkiol/dsnote
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
Language: C++ - Size: 74.6 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 859 - Forks: 35

Azure-Samples/Cognitive-Speech-TTS
Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.
Language: C# - Size: 822 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 952 - Forks: 521

sidharthrajaram/StyleTTS2
🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning
Language: Python - Size: 131 MB - Last synced at: 7 days ago - Pushed at: 10 months ago - Stars: 159 - Forks: 36

guest271314/SpeechSynthesisRecorder
Get audio output from window.speechSynthesis.speak() call as ArrayBuffer, AudioBuffer, Blob, MediaSource, MediaStream, ReadableStream, other object or data types
Language: JavaScript - Size: 40 KB - Last synced at: 2 days ago - Pushed at: almost 7 years ago - Stars: 82 - Forks: 20

JackismyShephard/ultimate-rvc Fork of SociallyIneptWeeb/AICoverGen
An app for creating audio-based content such as song covers and speech using Retrieval-based Voice Conversion.
Language: Python - Size: 7.35 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 81 - Forks: 23

rhasspy/piper-samples
Samples for Piper text to speech system
Language: Python - Size: 559 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 6 - Forks: 3

bshall/hifigan
A 16 kHz implementation of HiFi-GAN for soft-vc.
Language: Python - Size: 101 KB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 99 - Forks: 25

SocAIty/SpeechCraft Fork of suno-ai/bark
🔊 Text2Speech, Voice-Cloning and Voice2Voice conversion with the text-prompted generative audio model bark
Language: Python - Size: 9.78 MB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 62 - Forks: 6

MoonInTheRiver/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
Language: Python - Size: 61.9 MB - Last synced at: 12 days ago - Pushed at: about 2 months ago - Stars: 4,475 - Forks: 739

spokestack/react-native-spokestack 📦
Spokestack: give your React Native app a voice interface!
Language: TypeScript - Size: 6.52 MB - Last synced at: 3 days ago - Pushed at: about 3 years ago - Stars: 61 - Forks: 13

nature-heart-software/izabela
Your speech assistant. Communicate with text-to-speech in games, on voice chat, on stream or simply on your speakers!
Language: Vue - Size: 133 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 38 - Forks: 6

Wikidepia/indonesian-tts
Indonesian TTS (text-to-speech) using Coqui TTS
Size: 11.7 KB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 73 - Forks: 8

janvarev/Irene-Voice-Assistant
Irina: a Russian voice assistant for offline use. Supports skills via plugins.
Language: Python - Size: 109 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 903 - Forks: 128

ssb22/gradint
Graduated Interval Recall program
Language: Python - Size: 59.4 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 20 - Forks: 4

1nnovat1on/universalTranslator
Translates from any language to any other in real time on a Windows machine using speech recognition and text-to-speech
Language: Python - Size: 3.91 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

NVIDIA/DeepLearningExamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Language: Jupyter Notebook - Size: 104 MB - Last synced at: 16 days ago - Pushed at: 9 months ago - Stars: 14,182 - Forks: 3,328

open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python - Size: 126 MB - Last synced at: 17 days ago - Pushed at: 28 days ago - Stars: 8,975 - Forks: 702

ZDisket/TensorVox
Desktop application for neural speech synthesis written in C++
Language: C++ - Size: 15.5 MB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 215 - Forks: 20

intelligentnode/IntelliNode
Access the latest AI models like ChatGPT, LLaMA, Deepseek, Diffusion, Hugging face, and beyond through a unified prompt layer and performance evaluation
Language: JavaScript - Size: 10 MB - Last synced at: 4 days ago - Pushed at: 2 months ago - Stars: 252 - Forks: 16

lmnt-com/wavegrad
A fast, high-quality neural vocoder.
Language: Python - Size: 18.6 KB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 284 - Forks: 48

01Zhangbw/Awesome-Expressive-speech-synthesis
This is a summary of expressive speech synthesis papers. Last updated: 23 April.
Size: 8.79 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 5 - Forks: 0

IEvangelist/learning-blazor
The application for the "Learning Blazor: Build Single Page Apps with WebAssembly and C#" O'Reilly Media book by David Pine.
Language: C# - Size: 7.47 MB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 134 - Forks: 42

marytts/marytts
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java
Language: Java - Size: 143 MB - Last synced at: 17 days ago - Pushed at: 4 months ago - Stars: 2,460 - Forks: 752

HadrienGardeur/web-speech-recommended-voices
A list of recommended voices for the Web Speech API
Size: 271 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 36 - Forks: 6

leaonline/easy-speech
🔊 Cross browser Speech Synthesis also known as Text to speech or TTS; no dependencies; uses Web Speech API
Language: JavaScript - Size: 1.12 MB - Last synced at: 6 days ago - Pushed at: 10 days ago - Stars: 229 - Forks: 24

01Zhangbw/Speech-and-audio-papers-Top-Conference
A collection of papers in the speech and audio field. Now updated with: ICLR2023-2025, ICML2023-2024, NeurIPS2023-2024, ACMMM2024, AAAI2024, ACL2024, EMNLP2024, NAACL2025, AAAI2025
Size: 285 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 48 - Forks: 1

double22a/speech_dataset
The dataset of Speech Recognition
Size: 74.2 KB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 413 - Forks: 77

jneilliii/OctoPrint-M117SpeechSynthesis
Language: Jinja - Size: 115 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 5 - Forks: 1

mediatechlab/tts-wrapper
TTS-Wrapper makes it easier to use text-to-speech APIs by providing a unified and easy-to-use interface.
Language: Python - Size: 546 KB - Last synced at: 10 days ago - Pushed at: 10 months ago - Stars: 21 - Forks: 9

arniery/andys-project
Final assignment for the Trinity SLP course "Speech Processing 2: Acoustic Modelling": cascade and parallel formant synthesis, with the end goal of producing vowels using both methods.
Language: Jupyter Notebook - Size: 664 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

Emotional-Text-to-Speech/dl-for-emo-tts
💻 🤖 A summary of our attempts at using Deep Learning approaches for Emotional Text to Speech 🔈
Language: Jupyter Notebook - Size: 5.26 MB - Last synced at: 11 days ago - Pushed at: 11 months ago - Stars: 447 - Forks: 44

lucasnewman/best-rq-pytorch
Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in PyTorch.
Language: Python - Size: 365 KB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 116 - Forks: 11

fabiolimace/espeak-ng-playground
A space for experimenting with and developing improvements to `espeak-ng`, focused on Brazilian Portuguese. Main repository: https://github.com/fabiolimace/espeak-ng/
Language: Awk - Size: 70.9 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

alexykn/TorchTS
A modern text to speech frontend for Kokoro-82M
Language: JavaScript - Size: 3.93 MB - Last synced at: about 16 hours ago - Pushed at: 20 days ago - Stars: 5 - Forks: 2

Edresson/YourTTS
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Language: Jupyter Notebook - Size: 410 MB - Last synced at: 19 days ago - Pushed at: 6 months ago - Stars: 966 - Forks: 84

libdriver/syn6988
SYN6988 full-featured driver library for general MCU and Linux.
Language: C - Size: 3.94 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 7 - Forks: 5

ictnlp/SLED-TTS
Streamable Text-to-Speech model using a language modeling approach, without vector quantization
Language: Python - Size: 251 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 2 - Forks: 1

mikeroyal/NLP-Guide
Natural Language Processing (NLP). Covering topics such as Tokenization, Part Of Speech tagging (POS), Machine translation, Named Entity Recognition (NER), Classification, and Sentiment analysis.
Language: Python - Size: 315 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 86 - Forks: 15

Agash/TTSTextNormalization
Modern .NET 9 / C# 13 library to normalize text (emojis, currency, numbers, abbreviations, chat slang) for consistent and natural Text-to-Speech (TTS) synthesis, ideal for stream chat/donations.
Language: C# - Size: 138 KB - Last synced at: 1 day ago - Pushed at: 22 days ago - Stars: 1 - Forks: 0

KennethanCeyer/awesome-audio-speech
Awesome list of Audio, Speech, and DSP (Digital Signal Processing)
Size: 847 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 1

DigitalPhonetics/IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
Language: Python - Size: 21.3 MB - Last synced at: 21 days ago - Pushed at: 6 months ago - Stars: 1,583 - Forks: 180

energypatrikhu/voice-to-text-gui
Voice to text app made in Electron
Language: TypeScript - Size: 101 MB - Last synced at: 22 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

stefantaubert/mean-opinion-score
Python library for calculating the mean opinion score and 95% confidence interval of the standard deviation of text-to-speech ratings according to Ribeiro et al. (2011).
Language: Python - Size: 77.1 KB - Last synced at: 13 days ago - Pushed at: 3 months ago - Stars: 23 - Forks: 1

stefantaubert/tacotron-cli
Command-line interface to train Tacotron 2 using .wav <=> .TextGrid pairs.
Language: Python - Size: 1.33 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 2

EasyAI-France/Audiobook-Simplifier
Audiobook Simplifier is a tool that creates audiobooks from text documents or eBooks using TTS (Text-to-Speech) technology.
Language: Python - Size: 145 KB - Last synced at: 22 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

milosgajdos/go-playht
PlayHT API client Go module
Language: Go - Size: 104 KB - Last synced at: 16 days ago - Pushed at: 23 days ago - Stars: 6 - Forks: 2

ghchinoy/fabulae
create audio stories from PDFs or existing transcripts
Language: Go - Size: 699 KB - Last synced at: 6 days ago - Pushed at: 23 days ago - Stars: 1 - Forks: 0
