GitHub topics: speech-synthesis
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python - Size: 451 MB - Last synced at: about 10 hours ago - Pushed at: about 10 hours ago - Stars: 14,912 - Forks: 2,953

UKR-PROJECTS/chatterbox-tts-colab
Transform any text into natural-sounding speech, clone voices from audio samples, and create professional voiceovers - all running free in Google Colab!
Language: Jupyter Notebook - Size: 18.6 KB - Last synced at: about 16 hours ago - Pushed at: about 16 hours ago - Stars: 1 - Forks: 0

espeak-ng/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Language: C - Size: 72.4 MB - Last synced at: about 20 hours ago - Pushed at: 14 days ago - Stars: 5,211 - Forks: 1,043

TheVoxProject/calcvox
Accessible and open-source talking calculator for everyone.
Language: C++ - Size: 1010 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 3 - Forks: 1

voicepaw/so-vits-svc-fork
so-vits-svc fork with realtime support, improved interface and more features.
Language: Python - Size: 20.2 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 9,044 - Forks: 1,202

Gmzxdotzz/Dia-TTS-Server
Self-host the powerful Dia TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), support for SafeTensors/BF16, voice cloning, dialogue generation, and GPU/CPU execution.
Language: Python - Size: 571 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 3 - Forks: 0

denizsafak/abogen
Generate audiobooks from EPUBs, PDFs and text with synchronized captions.
Language: Python - Size: 2.09 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 322 - Forks: 17

crispinprojects/talkcalendar
Talk Calendar is a personal desktop calendar for Linux which has some speech capability.
Language: C - Size: 234 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

leon-ai/leon
🧠 Leon is your open-source personal assistant.
Language: TypeScript - Size: 21.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 16,397 - Forks: 1,362

sine2pi/asr_model
NLP model with acoustic positional encoding.
Language: Python - Size: 638 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

huggingface/speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT-4o
Language: Python - Size: 299 KB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 4,076 - Forks: 462

Swap98-Coder/mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Size: 1.95 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

lmnt-com/diffwave
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Language: Python - Size: 20.5 KB - Last synced at: about 5 hours ago - Pushed at: about 1 year ago - Stars: 844 - Forks: 119

Lyrcaxis/KokoroSharp
Fast local TTS inference engine in C# with ONNX runtime. Multi-speaker, multi-platform and multilingual. Integrate on your .NET projects using a plug-and-play NuGet package, complete with all voices.
Language: C# - Size: 107 KB - Last synced at: 2 days ago - Pushed at: 16 days ago - Stars: 127 - Forks: 9

NVIDIA/DeepLearningExamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Language: Jupyter Notebook - Size: 104 MB - Last synced at: 3 days ago - Pushed at: 11 months ago - Stars: 14,346 - Forks: 3,349

ELANOELR/EchoForge-AI-Voice-Cloner-GUI
Offline AI voice cloning tool with real-time TTS GUI. No login. No GPU required. Perfect for content creators.
Language: Python - Size: 834 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 5 - Forks: 1

fabiolimace/espeak-ng-playground
A space for experimenting with and developing improvements to `espeak-ng`, focused on Brazilian Portuguese. Main repository: https://github.com/fabiolimace/espeak-ng/
Language: Awk - Size: 71.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
Language: Python - Size: 19.9 MB - Last synced at: 3 days ago - Pushed at: 10 months ago - Stars: 1,047 - Forks: 133

jim11662418/General_Instrument_CTS256_SP0256_Speech_Synthesizer
Vintage General Instrument Speech Synthesizer CTS256 with SP0256
Language: Assembly - Size: 15.2 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 9 - Forks: 2

mkiol/dsnote
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
Language: C++ - Size: 76 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 937 - Forks: 39

Avatar-Home-Automation/A.V.A.T.A.R-Server
Agnostic Virtual Assistant for The Automated Residences
Language: JavaScript - Size: 22.2 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 3 - Forks: 0

SocAIty/socaity
SDK for generative AI.
Language: Python - Size: 26.2 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

sanushka2025/Microsoft_Windows
Programs and tools for Windows.
Language: Python - Size: 9.42 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

echogarden-project/echogarden
Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.
Language: TypeScript - Size: 2.4 MB - Last synced at: 3 days ago - Pushed at: 28 days ago - Stars: 373 - Forks: 40

rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
Language: Python - Size: 2.08 MB - Last synced at: 4 days ago - Pushed at: about 2 months ago - Stars: 8,485 - Forks: 793

RHVoice/RHVoice
A free and open source speech synthesizer for Russian and other languages
Language: C++ - Size: 14.3 MB - Last synced at: about 20 hours ago - Pushed at: 10 days ago - Stars: 1,662 - Forks: 245

Samba250/Mars
Explore Mars, the fourth planet from the Sun, known for its reddish surface and intriguing geological features. 🚀 Join the mission to uncover its secrets and pave the way for future human exploration! 🌌
Size: 19.3 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

keithito/tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
Language: Python - Size: 110 KB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 2,978 - Forks: 956

ManimCommunity/manim-voiceover
Manim plugin for all things voiceover
Language: Python - Size: 879 KB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 226 - Forks: 55

NaomiProject/Naomi
The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!
Language: Python - Size: 5.27 MB - Last synced at: 2 days ago - Pushed at: 5 months ago - Stars: 278 - Forks: 60

WhisperSpeech/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
Language: Jupyter Notebook - Size: 38 MB - Last synced at: 6 days ago - Pushed at: 17 days ago - Stars: 4,286 - Forks: 240

sasawasewq/DubFlow
DubFlow is an AI tool that transforms YouTube videos into multiple languages, making content accessible to a wider audience. With features like automatic transcript extraction and natural-sounding speech generation, it simplifies the dubbing process for creators. 🐙✨
Language: JavaScript - Size: 3.03 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Language: Python - Size: 69.2 MB - Last synced at: 6 days ago - Pushed at: 15 days ago - Stars: 11,996 - Forks: 1,920

ssb22/CedPane
Chinese-English Dictionary Public-domain Additions for Names Etc (CedPane)
Size: 35.8 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 4 - Forks: 1

Blaizzy/mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Language: Python - Size: 87.4 MB - Last synced at: 6 days ago - Pushed at: 15 days ago - Stars: 2,401 - Forks: 176

EveryVoiceTTS/EveryVoice
The EveryVoice TTS Toolkit - Text To Speech for your language
Language: Python - Size: 9.25 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 35 - Forks: 2

r9y9/pysptk
A python wrapper for Speech Signal Processing Toolkit (SPTK).
Language: Python - Size: 15.3 MB - Last synced at: about 11 hours ago - Pushed at: 11 months ago - Stars: 442 - Forks: 78

gexgd0419/NaturalVoiceSAPIAdapter
Make Azure natural TTS voices accessible to any SAPI 5-compatible application.
Language: C++ - Size: 27 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 366 - Forks: 22

SocAIty/SpeechCraft (fork of suno-ai/bark)
🔊 Text2Speech, Voice-Cloning and Voice2Voice conversion with the text-prompted generative audio model bark
Language: Python - Size: 9.78 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 65 - Forks: 6

rhasspy/piper
A fast, local neural text to speech system
Language: C++ - Size: 208 MB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 9,351 - Forks: 734

KoljaB/RealtimeTTS
Converts text to speech in realtime
Language: Python - Size: 68.1 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 3,186 - Forks: 316

DmitryRyumin/INTERSPEECH-2023-24-Papers
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
Size: 11.4 MB - Last synced at: 4 days ago - Pushed at: 6 months ago - Stars: 674 - Forks: 42

alexykn/TorchTS
A modern text to speech frontend for Kokoro-82M
Language: JavaScript - Size: 4.13 MB - Last synced at: 1 day ago - Pushed at: 8 days ago - Stars: 5 - Forks: 2

gabrielmittag/NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
Language: Python - Size: 2.2 MB - Last synced at: 7 days ago - Pushed at: 7 months ago - Stars: 802 - Forks: 132

thorstenMueller/Thorsten-Voice
Thorsten-Voice: a free-to-use, offline, high-quality German TTS voice that should be available for every project without any licensing struggles.
Language: Python - Size: 16.6 MB - Last synced at: 4 days ago - Pushed at: 6 months ago - Stars: 619 - Forks: 53

stakira/OpenUtau
Open singing synthesis platform / Open source UTAU successor
Language: C# - Size: 77.6 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 2,857 - Forks: 361

microsoft/SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
Language: Python - Size: 17.8 MB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 1,369 - Forks: 127

ictnlp/Stream-Omni
Stream-Omni is an end-to-end language-vision-speech chatbot that simultaneously supports interaction across various modality combinations.
Language: Python - Size: 10.6 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

espnet/espnet
End-to-End Speech Processing Toolkit
Language: Python - Size: 1.15 GB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 9,202 - Forks: 2,278

leaonline/easy-speech
🔊 Cross browser Speech Synthesis also known as Text to speech or TTS; no dependencies; uses Web Speech API
Language: JavaScript - Size: 1.12 MB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 234 - Forks: 24

SanHacks/AiGen
Multi Model Personal Assistant Wrapper in Go: Interact with ChatGPT, Claude or Ollama Cross Platform (Speech & Image generation supported)
Language: Go - Size: 3.41 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 14 - Forks: 4

leminhnguyen/ai-speech-engineer-roadmap
A curated roadmap based on my 5 years of experience going from zero to a skilled AI Speech Engineer. This roadmap covers everything from fundamentals to cutting-edge research trends in the speech domain.
Size: 4.35 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 14 - Forks: 0

modelscope/FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.
Language: Python - Size: 1.46 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 409 - Forks: 32

tensorflow/lingvo
Lingvo
Language: Python - Size: 142 MB - Last synced at: 4 days ago - Pushed at: 6 days ago - Stars: 2,843 - Forks: 449

sdkcarlos/artyom.js
A voice control, voice command, speech recognition, and speech synthesis JavaScript library. Create your own Siri, Google Now, or Cortana with Google Chrome within your website.
Language: JavaScript - Size: 1.08 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 1,261 - Forks: 366

ssb22/gradint
Graduated Interval Recall program
Language: Python - Size: 59.4 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 20 - Forks: 4

KennethanCeyer/awesome-audio-speech
Awesome list of Audio, Speech, and DSP (digital signal processing)
Size: 847 KB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 1

zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
Size: 197 KB - Last synced at: 11 days ago - Pushed at: over 1 year ago - Stars: 3,048 - Forks: 513

Migushthe2nd/MsEdgeTTS
A simple Azure Speech Service module that uses the Microsoft Edge Read Aloud API
Language: TypeScript - Size: 265 KB - Last synced at: 11 days ago - Pushed at: 6 months ago - Stars: 306 - Forks: 45

IEvangelist/learning-blazor
The application for the "Learning Blazor: Build Single Page Apps with WebAssembly and C#" O'Reilly Media book by David Pine.
Language: C# - Size: 7.47 MB - Last synced at: 7 days ago - Pushed at: 6 months ago - Stars: 135 - Forks: 41

NVIDIA/flowtron
Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer
Language: Jupyter Notebook - Size: 2.76 MB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 899 - Forks: 176

Vaibhavs10/ml-with-audio
HF's ML for Audio study group
Language: Jupyter Notebook - Size: 5.12 MB - Last synced at: 5 days ago - Pushed at: over 2 years ago - Stars: 192 - Forks: 29

JackismyShephard/ultimate-rvc
An app for creating audio-based content such as song covers and speech using Retrieval-based Voice Conversion.
Language: Python - Size: 7.65 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 114 - Forks: 24

rhasspy/piper-samples
Samples for Piper text to speech system
Language: Python - Size: 573 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 6 - Forks: 3

kakaobrain/pororo 📦
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
Language: Python - Size: 12.8 MB - Last synced at: 11 days ago - Pushed at: over 3 years ago - Stars: 1,297 - Forks: 223

DigitalPhonetics/IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
Language: Python - Size: 21.4 MB - Last synced at: 11 days ago - Pushed at: about 1 month ago - Stars: 1,611 - Forks: 183

kosich/rxjs-tts
RxJS wrapper for Text-to-Speech Web API
Language: TypeScript - Size: 563 KB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 3

mikeroyal/NLP-Guide
Natural Language Processing (NLP). Covering topics such as Tokenization, Part Of Speech tagging (POS), Machine translation, Named Entity Recognition (NER), Classification, and Sentiment analysis.
Language: Python - Size: 315 KB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 93 - Forks: 15

Shristirajpoot/CalcVoive
🎙️ Voice-enabled calculator built with React | Supports speech input/output & smart math parsing
Language: CSS - Size: 1.17 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 1 - Forks: 0

ryota-komatsu/speech_resynth
Speech Resynthesis and Language Modeling
Language: Python - Size: 4.86 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 17 - Forks: 4

alphacep/awesome-russian-speech
Russian speech technology links
Size: 134 KB - Last synced at: 14 days ago - Pushed at: about 1 month ago - Stars: 309 - Forks: 22

michaelzhang-ai/Text2Video
ICASSP 2022: "Text2Video: text-driven talking-head video synthesis with phonetic dictionary".
Language: Python - Size: 209 MB - Last synced at: 11 days ago - Pushed at: about 2 years ago - Stars: 436 - Forks: 94

csun22/Synthetic-Voice-Detection-Vocoder-Artifacts
This repository is related to our Dataset and Detection code from the paper: AI-Synthesized Voice Detection Using Neural Vocoder Artifacts accepted in CVPR Workshop on Media Forensic 2023.
Language: Python - Size: 183 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 121 - Forks: 14

estuary-ai/mangrove
Mangrove is the backend module of Estuary, a framework for building multimodal real-time Socially Intelligent Agents (SIAs).
Language: Python - Size: 2.16 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 11 - Forks: 2

yukiarimo/hanasu
Hanasu is a human-like TTS model based on the multilingual Himitsu V1 transformer-based encoder and VITS architecture
Language: Python - Size: 5.58 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 28 - Forks: 5

ZYiHu/EmoVoiceChatbot
Emotional voice chatbot with sentiment-based speech synthesis
Language: Python - Size: 21.5 KB - Last synced at: 15 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0

oscie57/tiktok-voice
Simple Python script to interact with the TikTok TTS API
Language: Python - Size: 1.24 MB - Last synced at: 13 days ago - Pushed at: 9 months ago - Stars: 584 - Forks: 86

Steve0929/tiktok-tts
Provides a simple way to generate text-to-speech audio files using TikTok's text-to-speech (TTS) API in Node.js.
Language: JavaScript - Size: 628 KB - Last synced at: 8 days ago - Pushed at: 8 months ago - Stars: 90 - Forks: 9

devnen/Chatterbox-TTS-Server
Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), predefined voices, voice cloning, and large audiobook-scale text processing. Runs accelerated on NVIDIA (CUDA), AMD (ROCm), and CPU.
Language: Python - Size: 18.5 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 212 - Forks: 36

Badri467/DubFlow
DubFlow lets you effortlessly dub YouTube videos into any language with high-quality translations and synced audio. Simply enter a YouTube URL, choose your target language, and get a dubbed video ready to share. Perfect for creators and viewers looking to break language barriers.
Language: JavaScript - Size: 120 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 1 - Forks: 0

osteele/speech-provider
A unified TypeScript interface for browser speech synthesis and Eleven Labs TTS voices
Language: TypeScript - Size: 84 KB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

mapbox/mapbox-speech-swift
Natural-sounding text-to-speech in Swift or Objective-C on iOS, macOS, tvOS, and watchOS
Language: Swift - Size: 459 KB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 46 - Forks: 17

guest271314/SpeechSynthesisRecorder
Get audio output from window.speechSynthesis.speak() call as ArrayBuffer, AudioBuffer, Blob, MediaSource, MediaStream, ReadableStream, other object or data types
Language: JavaScript - Size: 40 KB - Last synced at: 16 days ago - Pushed at: almost 7 years ago - Stars: 84 - Forks: 20

EasyAI-France/Audiobook-Simplifier
Audiobook Simplifier is a tool that creates audiobooks from text documents or eBooks using TTS (Text-to-Speech) technology.
Language: Python - Size: 70.3 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

intelligentnode/IntelliNode
Access the latest AI models like ChatGPT, LLaMA, DeepSeek, Diffusion, Hugging Face, and beyond through a unified prompt layer and performance evaluation
Language: JavaScript - Size: 10 MB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 259 - Forks: 16

IDEA-Emdoor-Lab/UniTTS
A TTS Trained on Universal Audio.
Language: Python - Size: 49.5 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 27 - Forks: 3

Troyanovsky/awesome-TTS-Colab
Collection of awesome TTS and voice cloning models to run with Google Colab
Language: Jupyter Notebook - Size: 1.13 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

giellalt/speech-smj
Speech language technology resources for the Julev Sámi language
Size: 160 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 1 - Forks: 0

AlekPet/ComfyUI_Custom_Nodes_AlekPet
Custom nodes that extend the capabilities of ComfyUI
Language: JavaScript - Size: 12.9 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 1,225 - Forks: 77

digitalplusplus/BaM-Construct
Unity OpenXR Multi-player VR Baseline with Speech & LLM Driven NPC
Language: C# - Size: 559 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 18 - Forks: 5

keonlee9420/DailyTalk
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023
Language: Python - Size: 102 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 218 - Forks: 13

athena-team/athena
An open-source implementation of a sequence-to-sequence-based speech processing engine
Language: C++ - Size: 9.94 MB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 951 - Forks: 197

Johnmiicheal/spitch.js
Unofficial JavaScript SDK for Spitch AI
Language: TypeScript - Size: 179 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 1 - Forks: 0

Wikidepia/indonesian-tts
Indonesian TTS (text-to-speech) using Coqui TTS
Size: 11.7 KB - Last synced at: 5 days ago - Pushed at: almost 3 years ago - Stars: 76 - Forks: 8

anormi001/chatterbox-tts-api
Chatterbox TTS API is a FastAPI-powered REST API designed for text-to-speech applications. It offers seamless integration and efficient performance, making it a great choice for developers looking to enhance their projects. ⭐️👩💻
Language: Python - Size: 253 KB - Last synced at: 20 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

DiffAPF/torchlpc
Fast and differentiable time domain all-pole filter in PyTorch.
Language: Python - Size: 85 KB - Last synced at: 7 days ago - Pushed at: 22 days ago - Stars: 62 - Forks: 4

egorsmkv/speech-recognition-uk
🇺🇦 Speech Recognition & Synthesis for Ukrainian
Language: Python - Size: 2.42 MB - Last synced at: 21 days ago - Pushed at: 29 days ago - Stars: 388 - Forks: 21

opendilab/CleanS2S
High-quality and streaming Speech-to-Speech interactive agent in a single file. A streaming, full-duplex voice-interaction prototype agent implemented in just one file!
Language: Python - Size: 3.91 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 421 - Forks: 38

gooofy/zerovox
zero-shot realtime TTS system, fully offline, free and open source
Language: Python - Size: 38.9 MB - Last synced at: 11 days ago - Pushed at: 2 months ago - Stars: 41 - Forks: 5

mideind/Icespeak
Icelandic-language speech synthesis with Python
Language: Python - Size: 370 KB - Last synced at: 11 days ago - Pushed at: 23 days ago - Stars: 7 - Forks: 0
