GitHub topics: speech-synthesis
ssb22/CedPane
Chinese-English Dictionary Public-domain Additions for Names Etc (CedPane)
Size: 36.2 MB - Last synced at: about 4 hours ago - Pushed at: about 5 hours ago - Stars: 5 - Forks: 1

asd123fsdg/Crank
🎥 Automate YouTube Shorts creation with customizable prompts, titles, and tags for fast and efficient content generation.
Language: Python - Size: 17.1 MB - Last synced at: about 13 hours ago - Pushed at: about 16 hours ago - Stars: 0 - Forks: 0

Ahmed-Maher77/EchoText__text-to-speech-transformer
A modern text-to-speech web app that transforms written text into natural speech with a sleek glass-morphism UI, smooth animations, and cross-browser compatibility.
Language: CSS - Size: 17.6 KB - Last synced at: about 16 hours ago - Pushed at: about 17 hours ago - Stars: 0 - Forks: 0

soldier444xd/KittenTTS
KittenTTS is an ultra-lightweight, CPU-friendly text-to-speech model with 15M params for real-time, high-quality voices. Open source, fast start. 😺
Language: Python - Size: 17.6 KB - Last synced at: about 17 hours ago - Pushed at: about 19 hours ago - Stars: 2 - Forks: 0

camcar1/avone
Simplified version of avatar for easy integration and customization. Enhance your projects with this lightweight solution. 🌟👤
Language: HTML - Size: 117 KB - Last synced at: about 20 hours ago - Pushed at: about 23 hours ago - Stars: 0 - Forks: 0

sanushka2025/Microsoft_Windows
Programs and tools for Windows.
Language: Python - Size: 9.42 MB - Last synced at: about 21 hours ago - Pushed at: about 23 hours ago - Stars: 0 - Forks: 0

Blaizzy/mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Language: Python - Size: 88 MB - Last synced at: about 19 hours ago - Pushed at: 6 days ago - Stars: 2,640 - Forks: 205

Gmzxdotzz/Dia-TTS-Server
Self-host the powerful Dia TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), support for SafeTensors/BF16, voice cloning, dialogue generation, and GPU/CPU execution.
Language: Python - Size: 571 KB - Last synced at: about 21 hours ago - Pushed at: about 23 hours ago - Stars: 2 - Forks: 1

rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
Language: Python - Size: 2.09 MB - Last synced at: about 9 hours ago - Pushed at: 10 days ago - Stars: 9,005 - Forks: 838

leminhnguyen/ai-speech-engineer-roadmap
A curated roadmap based on my 5 years of experience going from zero to a skilled AI Speech Engineer. This roadmap covers everything from fundamentals to cutting-edge research trends in the speech domain.
Size: 4.37 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 19 - Forks: 0

Mekhane2005/chatterbox
🎤 Create lifelike speech with Chatterbox, Resemble AI's open-source TTS model for seamless text-to-speech integration and enhanced user experiences.
Language: Python - Size: 1.39 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Language: Python - Size: 69.3 MB - Last synced at: about 12 hours ago - Pushed at: 4 days ago - Stars: 12,218 - Forks: 1,939

JackismyShephard/ultimate-rvc
An app for creating audio-based content such as song covers and speech using Retrieval-based Voice Conversion.
Language: Python - Size: 7.74 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 147 - Forks: 34

coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language: Python - Size: 162 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 42,427 - Forks: 5,566

Pearlssx/FireRedTTS2
🔊 Generate long-form streaming TTS for multi-speaker dialogues, enhancing conversations with natural-sounding voices and improved engagement.
Size: 1.29 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

WhisperSpeech/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
Language: Jupyter Notebook - Size: 38 MB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 4,357 - Forks: 249

andresayac/edge-tts
Edge TTS is a Node or Bun package that allows access to the online text-to-speech service used by Microsoft Edge without the need for Microsoft Edge, Windows, or an API key.
Language: TypeScript - Size: 85.9 KB - Last synced at: about 13 hours ago - Pushed at: 13 days ago - Stars: 83 - Forks: 12

leon-ai/leon
🧠 Leon is your open-source personal assistant.
Language: TypeScript - Size: 21.6 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 16,620 - Forks: 1,376

NVIDIA-NeMo/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python - Size: 459 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 15,612 - Forks: 3,084

egorsmkv/speech-recognition-uk
🇺🇦 Speech Recognition & Synthesis for Ukrainian
Language: Python - Size: 2.42 MB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 405 - Forks: 22

discordier/sam Fork of s-macke/SAM
Software Automatic Mouth - Tiny Speech Synthesizer
Language: JavaScript - Size: 6.15 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 674 - Forks: 79

espnet/espnet
End-to-End Speech Processing Toolkit
Language: Python - Size: 1.22 GB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 9,431 - Forks: 2,325

alexniemiz1/listnr
🎵 Enjoy a modern terminal-based music player that supports multiple audio formats and offers intuitive controls for a seamless listening experience.
Language: Go - Size: 36.1 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

stefantaubert/pinyin-to-ipa
Command-line interface and Python library to transcribe pinyin to IPA. The tones are attached to the vowel of the syllable.
Language: Python - Size: 157 KB - Last synced at: 2 days ago - Pushed at: 5 months ago - Stars: 49 - Forks: 10

RageAgainstThePixel/ElevenLabs-DotNet
A Non-Official ElevenLabs RESTful API Client for dotnet
Language: C# - Size: 2 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 70 - Forks: 25

andresayac/edge-tts-php
Edge TTS is a PHP package that allows access to the online text-to-speech service used by Microsoft Edge without the need for Microsoft Edge, Windows, or an API key.
Language: PHP - Size: 95.7 KB - Last synced at: about 13 hours ago - Pushed at: 4 days ago - Stars: 12 - Forks: 4

ssb22/gradint
Graduated Interval Recall tool for vocabulary practice
Language: Python - Size: 59.5 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 20 - Forks: 4

HumeAI/hume-typescript-sdk
Add Hume AI to any TypeScript project
Language: TypeScript - Size: 3.45 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 65 - Forks: 16

NaomiProject/Naomi
The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!
Language: Python - Size: 5.36 MB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 282 - Forks: 60

Azure-Samples/Cognitive-Speech-TTS
Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.
Language: C# - Size: 822 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 980 - Forks: 537

thorstenMueller/Thorsten-Voice
Thorsten-Voice: A free-to-use, offline, high-quality German TTS voice should be available for every project without any licensing struggles.
Language: Python - Size: 16.6 MB - Last synced at: 2 days ago - Pushed at: 8 months ago - Stars: 648 - Forks: 55

denizsafak/abogen
Generate audiobooks from EPUBs, PDFs and text with synchronized captions.
Language: Python - Size: 4.08 MB - Last synced at: 6 days ago - Pushed at: 13 days ago - Stars: 3,205 - Forks: 161

stefantaubert/en-tts
Command-line interface and Python library for synthesizing English texts into speech.
Language: Python - Size: 805 KB - Last synced at: 2 days ago - Pushed at: 6 days ago - Stars: 4 - Forks: 1

Chris10M/Lip2Speech
A pipeline to read lips and generate speech for the read content, i.e., Lip-to-Speech synthesis.
Language: Python - Size: 12.2 MB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 89 - Forks: 21

Lyrcaxis/KokoroSharp
Fast local TTS inference engine in C# with ONNX runtime. Multi-speaker, multi-platform and multilingual. Integrate on your .NET projects using a plug-and-play NuGet package, complete with all voices.
Language: C# - Size: 123 KB - Last synced at: about 15 hours ago - Pushed at: 8 days ago - Stars: 159 - Forks: 16

fabiolimace/espeak-br Fork of espeak-ng/espeak-ng
A fork of eSpeak NG to improve its support for Brazilian Portuguese 🇧🇷. Experimentation repository: https://github.com/fabiolimace/espeak-playground/
Language: C - Size: 68.9 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

brandon-rezko/HeyGem
HeyGem — Your AI face, made free
Language: C - Size: 122 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 37 - Forks: 7

RHVoice/RHVoice
a free and open source speech synthesizer for Russian and other languages
Language: C++ - Size: 14.3 MB - Last synced at: 6 days ago - Pushed at: 9 days ago - Stars: 1,696 - Forks: 250

versevo-ai/versevo-ai
Language: Python - Size: 5.1 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 4

espeak-ng/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Language: C - Size: 73 MB - Last synced at: 6 days ago - Pushed at: 13 days ago - Stars: 5,484 - Forks: 1,093

echogarden-project/echogarden
Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.
Language: TypeScript - Size: 2.55 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 392 - Forks: 41

haoheliu/voicefixer
General Speech Restoration
Language: Python - Size: 3.76 MB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 1,209 - Forks: 147

Arnav-Sharmaa/Multilingual-Speech-to-Text-and-Speech-to-Speech-Content-Summarization-for-Indian-Languages
This project presents a multilingual pipeline for both speech-to-text and speech-to-speech summarization in Indian languages. It transcribes audio using a fine-tuned Whisper ASR model, summarizes text with mT5, and optionally synthesizes the summary back into speech using Indic Parler-TTS.
Language: Jupyter Notebook - Size: 3.94 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1 - Forks: 1

Emotional-Text-to-Speech/dl-for-emo-tts
💻 🤖 A summary of our attempts at using Deep Learning approaches for Emotional Text to Speech 🔈
Language: Jupyter Notebook - Size: 5.26 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 455 - Forks: 44

voicepaw/so-vits-svc-fork
so-vits-svc fork with realtime support, improved interface and more features.
Language: Python - Size: 20.2 MB - Last synced at: 7 days ago - Pushed at: 21 days ago - Stars: 9,102 - Forks: 1,219

zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
Size: 197 KB - Last synced at: 7 days ago - Pushed at: almost 2 years ago - Stars: 3,064 - Forks: 514

athena-team/athena
An open-source implementation of a sequence-to-sequence based speech processing engine
Language: C++ - Size: 9.94 MB - Last synced at: 6 days ago - Pushed at: almost 3 years ago - Stars: 957 - Forks: 201

Migushthe2nd/MsEdgeTTS
A simple Azure Speech Service module that uses the Microsoft Edge Read Aloud API
Language: TypeScript - Size: 265 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 314 - Forks: 47

microsoft/SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
Language: Python - Size: 17.8 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 1,395 - Forks: 126

lmnt-com/diffwave
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Language: Python - Size: 20.5 KB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 860 - Forks: 126

ALERTua/styletts2-ukrainian-openai-tts-api
OpenAI TTS Compatible Ukrainian TTS StyleTTS2 Pipeline
Language: Python - Size: 222 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 22 - Forks: 2

keonlee9420/DiffGAN-TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Language: Python - Size: 121 MB - Last synced at: 2 days ago - Pushed at: over 3 years ago - Stars: 337 - Forks: 45

AlekPet/ComfyUI_Custom_Nodes_AlekPet
Custom nodes that extend the capabilities of ComfyUI
Language: JavaScript - Size: 12.9 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 1,334 - Forks: 85

NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
Language: Python - Size: 19.9 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 1,090 - Forks: 139

huggingface/speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT-4o
Language: Python - Size: 299 KB - Last synced at: 10 days ago - Pushed at: 5 months ago - Stars: 4,155 - Forks: 470

mkiol/dsnote
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
Language: C++ - Size: 74.7 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 1,080 - Forks: 42

netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Language: Python - Size: 3.67 MB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 8,280 - Forks: 724

RaduBolbo/F5-TTS-Emotional-CFG
Zero-shot voice cloning text-to-speech (TTS) with explicit emotion class conditioning built on F5-TTS
Language: Python - Size: 1.05 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

NVIDIA/DeepLearningExamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Language: Jupyter Notebook - Size: 104 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 14,460 - Forks: 3,366

Wikidepia/indonesian-tts
Indonesian TTS (text-to-speech) using Coqui TTS
Size: 11.7 KB - Last synced at: 6 days ago - Pushed at: about 3 years ago - Stars: 80 - Forks: 9

NVIDIA/OpenSeq2Seq 📦
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
Language: Python - Size: 57.4 MB - Last synced at: 9 days ago - Pushed at: over 4 years ago - Stars: 1,561 - Forks: 371

pawurb/termit 📦
Translations with speech synthesis in your terminal as a Ruby gem
Language: Ruby - Size: 413 KB - Last synced at: 6 days ago - Pushed at: over 8 years ago - Stars: 507 - Forks: 20

stokito/FreeTTSLog4JAppender
Talking log appender to keep your eyes :)
Language: Java - Size: 141 KB - Last synced at: 5 days ago - Pushed at: over 11 years ago - Stars: 10 - Forks: 1

alphacep/awesome-russian-speech
Russian speech technology links
Size: 43.9 KB - Last synced at: 10 days ago - Pushed at: 17 days ago - Stars: 335 - Forks: 22

spokestack/spokestack-python 📦
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.
Language: Python - Size: 6.7 MB - Last synced at: 10 days ago - Pushed at: almost 4 years ago - Stars: 140 - Forks: 14

tensorflow/lingvo
Lingvo
Language: Python - Size: 142 MB - Last synced at: 5 days ago - Pushed at: 12 days ago - Stars: 2,853 - Forks: 451

keithito/tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
Language: Python - Size: 110 KB - Last synced at: 10 days ago - Pushed at: about 2 years ago - Stars: 2,982 - Forks: 955

ldilley/frank
💡 Fairly Rational Artificial Neural Kludge
Language: Java - Size: 76 MB - Last synced at: 10 days ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 0

FlyingFathead/huuda
Finnish TTS (text-to-speech) framework with Finglish capabilities
Language: Python - Size: 229 KB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

google/voice-builder
An opensource text-to-speech (TTS) voice building tool
Language: JavaScript - Size: 490 KB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 680 - Forks: 138

mastashake08/speech-kit
Simplifying the Speech Synthesis and Speech Recognition engines for JavaScript. Listen for commands and perform callback actions, make the browser speak and transcribe your speech!
Language: JavaScript - Size: 239 KB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 6 - Forks: 1

rhasspy/piper
A fast, local neural text to speech system
Language: C++ - Size: 208 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 9,909 - Forks: 804

daslearning-org/text-to-speech-offline
A lightweight cross-platform text-to-speech application that uses native TTS on Android and PiperTTS on desktop. The app looks like a simple chatbot and is built with Kivy and KivyMD, which are Python-based. It can also work offline, depending on the selected models. A speech-to-text feature is planned.
Language: Python - Size: 3.34 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 0

Troyanovsky/awesome-TTS-Colab
Collection of awesome TTS and voice cloning models to run with Google Colab
Language: Jupyter Notebook - Size: 2.22 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 9 - Forks: 0

modelscope/FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.
Language: Python - Size: 1.46 MB - Last synced at: 11 days ago - Pushed at: over 1 year ago - Stars: 425 - Forks: 32

iamycy/diffwave-sr
Language: Jupyter Notebook - Size: 256 MB - Last synced at: 1 day ago - Pushed at: over 2 years ago - Stars: 83 - Forks: 9

espnet/espnet_tts_frontend
Text frontend for ESPnet tts recipes
Language: Python - Size: 61.5 KB - Last synced at: 1 day ago - Pushed at: over 4 years ago - Stars: 33 - Forks: 14

LauraKokkarinen/AzureAI.SpeechConcat
Convert long plain-text files into speech using the Azure AI Speech service.
Language: C# - Size: 177 MB - Last synced at: 6 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 1

aristech-de/tts-clients
Clients to communicate with the Aristech TTS service
Language: Python - Size: 137 KB - Last synced at: 1 day ago - Pushed at: 13 days ago - Stars: 3 - Forks: 0

lucidrains/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch
Language: Python - Size: 512 KB - Last synced at: 10 days ago - Pushed at: almost 2 years ago - Stars: 1,326 - Forks: 105

Johnmiicheal/spitch.js
Unofficial JavaScript SDK for Spitch AI
Language: TypeScript - Size: 237 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

gabrielmittag/NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
Language: Python - Size: 2.2 MB - Last synced at: 13 days ago - Pushed at: 9 months ago - Stars: 838 - Forks: 138

EveryVoiceTTS/EveryVoice
The EveryVoice TTS Toolkit - Text To Speech for your language
Language: Python - Size: 10.8 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 40 - Forks: 2

stefantaubert/mel-cepstral-distance
A Python library for computing the Mel-Cepstral Distance (Mel-Cepstral Distortion, MCD) between two inputs. This implementation is based on the method proposed by Robert F. Kubichek in "Mel-Cepstral Distance Measure for Objective Speech Quality Assessment".
Language: Python - Size: 62.7 MB - Last synced at: 4 days ago - Pushed at: 14 days ago - Stars: 55 - Forks: 10

SamThinks-Com/Kitten-TTS-Server
Self-host Kitten-TTS-Server 🐱: lightweight, high-performance TTS with GPU acceleration, Web UI and API; handles long texts and runs on NVIDIA GPUs, RPi4/5 and CPU.
Language: Python - Size: 337 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 2 - Forks: 1

hi-paris/Prosody-Control-French-TTS
An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control
Language: Python - Size: 13.3 MB - Last synced at: 14 days ago - Pushed at: 15 days ago - Stars: 12 - Forks: 1

r9y9/deepvoice3_pytorch
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
Language: Python - Size: 6.78 MB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 1,980 - Forks: 487

Sambit003/versevo-ai Fork of versevo-ai/versevo-ai
Language: Python - Size: 583 KB - Last synced at: 14 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

shwetha-17-9/Deepfake_webpage_narration
A project that generates deepfake-style webpage narrations and answers user queries using AI-based text and voice synthesis.
Size: 3.39 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

KoljaB/RealtimeTTS
Converts text to speech in realtime
Language: Python - Size: 69.1 MB - Last synced at: 15 days ago - Pushed at: about 2 months ago - Stars: 3,420 - Forks: 333

jiaqili3/DualCodec
A Low-Frame-Rate, Semantically-Enhanced Neural Audio Codec for Speech Generation
Language: Jupyter Notebook - Size: 10.2 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 46 - Forks: 6

mikeroyal/NLP-Guide
Natural Language Processing (NLP). Covering topics such as Tokenization, Part Of Speech tagging (POS), Machine translation, Named Entity Recognition (NER), Classification, and Sentiment analysis.
Language: Python - Size: 315 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 99 - Forks: 16

michaelzhang-ai/Text2Video
ICASSP 2022: "Text2Video: text-driven talking-head video synthesis with phonetic dictionary".
Language: Python - Size: 209 MB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 439 - Forks: 93

FranciscoTC9999/abogen
🔊 Convert text to speech effortlessly with Abogen, a robust tool that supports multiple operating systems for clear and natural voice outputs.
Language: Python - Size: 2.13 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

soffiee32/OtosakuTTS-iOS
🗣️ Generate natural-sounding speech on iOS devices with this Swift library using on-device text-to-speech synthesis, ensuring privacy and fast performance.
Language: Swift - Size: 18.6 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

leaonline/easy-speech
🔊 Cross browser Speech Synthesis also known as Text to speech or TTS; no dependencies; uses Web Speech API
Language: JavaScript - Size: 1.12 MB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 241 - Forks: 23

joetansey1/voice_cloning
Zero-shot voice cloning web app + FastAPI API using Coqui XTTS v2
Language: HTML - Size: 119 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

rhasspy/piper-samples
Samples for Piper text to speech system
Language: Python - Size: 574 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 9 - Forks: 5

Swap98-Coder/mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Size: 1.95 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 4 - Forks: 0

janvarev/Irene-Voice-Assistant
Irina is a Russian voice assistant for offline use. Supports skills via plugins.
Language: Python - Size: 115 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 1,046 - Forks: 137
