GitHub topics: fastspeech2

Repositories

TensorSpeech/TensorFlowTTS

TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

Language: Python - Size: 130 MB - Last synced at: 3 days ago - Pushed at: 11 months ago - Stars: 3,924 - Forks: 810

open-mmlab/Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python - Size: 126 MB - Last synced at: 4 days ago - Pushed at: about 1 month ago - Stars: 9,063 - Forks: 713

ZDisket/TensorVox

Desktop application for neural speech synthesis written in C++

Language: C++ - Size: 15.5 MB - Last synced at: 16 days ago - Pushed at: about 2 years ago - Stars: 215 - Forks: 20

keonlee9420/Comprehensive-Transformer-TTS

A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate TTS

Language: Python - Size: 143 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 325 - Forks: 42

Adibian/ResGrad

Unofficial implementation of ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech

Language: Python - Size: 2.23 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 15 - Forks: 4

keonlee9420/Comprehensive-E2E-TTS

A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate E2E-TTS

Language: Python - Size: 3.45 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 146 - Forks: 19

muhammadVohra787/speech-synthesis-app

The Speech Synthesis App converts text into natural-sounding speech using advanced models, providing an interactive platform for audio generation.

Language: Python - Size: 7.81 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

RALYHDB/ASV-spoofing

This repository contains the code and resources associated with my Bachelor's Thesis. The project evaluates the performance of various automatic speaker verification (ASV) systems against identity spoofing attacks generated using text-to-speech (TTS) synthesis technologies.

Language: Python - Size: 14.6 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

lars76/fastspeech2-clean

Clean and modernized implementation of FastSpeech2/LightSpeech using IPA

Language: Python - Size: 216 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

PaddlePaddle/Parakeet 📦

PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)

Language: Python - Size: 9.32 MB - Last synced at: 10 months ago - Pushed at: over 3 years ago - Stars: 600 - Forks: 83

xcmyz/FastSpeech2

The Implementation of FastSpeech2 Based on Pytorch.

Language: Python - Size: 4.03 MB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 52 - Forks: 8

mariatepei/VT_thesis_MTepei

This repository accompanies my MSc Thesis for the degree Voice Technology, storing all referenced data and other relevant resources.

Language: Jupyter Notebook - Size: 1.13 MB - Last synced at: about 2 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

ranchlai/mandarin-tts

Chinese Mandarin text-to-speech (TTS) based on FastSpeech 2, implemented in PyTorch, using WaveGlow as the vocoder, with the Biaobei and AISHELL-3 datasets

Language: Python - Size: 85.4 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 446 - Forks: 106

rishikksh20/FastSpeech2

PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech

Language: Jupyter Notebook - Size: 11.6 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 211 - Forks: 52

ssmlkl/MnTTS2

This is the experimental description of MnTTS2.

Language: Jupyter Notebook - Size: 39.6 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 7 - Forks: 2

rishikksh20/AdaSpeech

AdaSpeech: Adaptive Text to Speech for Custom Voice

Language: Jupyter Notebook - Size: 4.05 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 155 - Forks: 40

rishikksh20/LightSpeech

LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

Language: Python - Size: 3.23 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 77 - Forks: 7

utkarsh2299/Fastspeech2_HS

Created as part of the project "Speech Technologies in Indian languages". About Indic TTS for Indian Languages: a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving the quality of synthesis, and building small-footprint TTS integrated with disability aids and various other applications.

Language: Perl - Size: 1.03 GB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

dathudeptrai/FastSpeech2

A Tensorflow Implementation of the FastSpeech 2: Fast and High-Quality End-to-End Text to Speech

Size: 7.14 MB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 11 - Forks: 0

lordzuko/FastSpeech2-jax

Implementation of FastSpeech2 in JAX

Size: 1000 Bytes - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

tuanh123789/AdaSpeech

An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for Custom Voice"

Language: Python - Size: 50.4 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 80 - Forks: 23

ga642381/FastSpeech2

Multi-Speaker PyTorch FastSpeech2: Fast and High-Quality End-to-End Text to Speech

Language: Python - Size: 39.7 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 79 - Forks: 16

hwRG/End-to-End-TTS-Fine-Tune

Use FastSpeech2 and HiFi-GAN to easily perform end-to-end Korean speech synthesis.

Language: Python - Size: 33.4 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 17 - Forks: 7

lordzuko/SpeakingStyle

Aligning latent space of speaking style with human perception using a re-embedding strategy

Language: Jupyter Notebook - Size: 133 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

deepaudio/deepaudio-tts

Language: Python - Size: 362 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 2

alessandropec/data_driven_ai_voice_cloning

This repository contains the code for the main part of my master's thesis in Data Science & Engineering at Politecnico di Torino

Language: Python - Size: 268 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 1

quackson/DG_HW

Homework for deep generation: combines FastSpeech2 with different vocoders. Reference repositories (modified from the originals): https://github.com/ming024/FastSpeech2 https://github.com/NVIDIA/waveglow https://github.com/mindslab-ai/univnet https://github.com/jik876/hifi-gan

Language: Python - Size: 34.1 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

Related Keywords

fastspeech2 (32), tts (18), text-to-speech (16), pytorch (12), speech-synthesis (11), fastspeech (7), hifi-gan (7), tacotron2 (5), vocoder (4), multi-speaker (4), multi-speaker-tts (4), deep-learning (4), voice-cloning (3), real-time (3), speech (3), machine-learning (3), assistive-technology (2), artificial-intelligence (2), non-autoregressive (2), non-ar (2), neural-tts (2), vits (2), pytorch-lightning (2), waveglow (2), single-speaker (2), sota (2), adaspeech (2), tensorflow2 (2), multiband-melgan (2), transformer (2), melgan (2), ultimate-tts (2), unsupervised (2), end-to-end (2), blizzard-challenge (1), fine-tune (1), vit (1), mel-gan (1), tacotron (1), tts-chinese (1), tts-hanzi (1), tts-engines (1), mongolian (1), pytorch-implementation (1), lightspeech (1), espnet (1), hs (1), hybrid-segment (1), indic-languages (1), tensorflow (1), jax (1), conditional-layer-norm (1), conditional-layer-normalization (1), voiceclone (1), speech-to-text (1), imagecaptioning (1), image-captioning (1), gpt-2 (1), deeplearning (1), yolox (1), visually-impaired-people (1), ocr-recognition (1), ocr (1), easyocr (1), hearing-impaired-people (1), transfer-learning (1), korean (1), univnet (1), hifigan (1), zero-shot-learning (1), wavlm (1), speaker-verification (1), speaker-embeddings (1), generative-ai (1), ecapa-tdnn (1), ai (1), hydra (1), speaking-style (1), pytorch-distributeddataparallel (1), comprehensive (1), voice-synthesis (1), phoneme (1), mb-melgan (1), desktop (1), voice-conversion (1), vall-e (1), text-to-audio (1), singing-voice-conversion (1), naturalspeech2 (1), music-generation (1), maskgct (1), emilia (1), audit (1), audioldm (1), audio-synthesis (1), audio-generation (1), zh-tts (1), tflite (1), parallel-wavegan (1), mobile-tts (1)
