An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: non-autoregressive

keonlee9420/Cross-Speaker-Emotion-Transfer

PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

Language: Python - Size: 101 MB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 193 - Forks: 27

ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Language: Python - Size: 18.2 MB - Last synced at: 12 days ago - Pushed at: 8 months ago - Stars: 1,053 - Forks: 80

ducnt18121997/Viet-Transformer-TTS

A PyTorch implementation of a non-autoregressive Transformer with unsupervised duration learning, built on Transformer and Conformer blocks, with support for Vietnamese.

Language: Python - Size: 170 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 2 - Forks: 1

lucidrains/soundstorm-pytorch

Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch

Language: Python - Size: 264 KB - Last synced at: 11 days ago - Pushed at: 6 months ago - Stars: 1,489 - Forks: 91

keonlee9420/DiffGAN-TTS

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

Language: Python - Size: 121 MB - Last synced at: 8 days ago - Pushed at: about 3 years ago - Stars: 331 - Forks: 45

mahshid1378/Parallel-Tacotron2

PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

Language: Python - Size: 99.3 MB - Last synced at: 13 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0

ictnlp/NAST-S2x

A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.

Language: Python - Size: 213 KB - Last synced at: 21 days ago - Pushed at: 6 months ago - Stars: 65 - Forks: 5

HKUNLP/DiffuSearch

[ICLR 2025] Code for the paper "Implicit Search via Discrete Diffusion: A Study on Chess"

Language: Python - Size: 23.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 5 - Forks: 0

keonlee9420/Comprehensive-Transformer-TTS

A Non-Autoregressive Transformer-based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.

Language: Python - Size: 143 MB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 325 - Forks: 42

keonlee9420/Comprehensive-E2E-TTS

A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate E2E-TTS.

Language: Python - Size: 3.45 MB - Last synced at: 8 days ago - Pushed at: almost 3 years ago - Stars: 146 - Forks: 19

henry-yeh/GLOP

[AAAI 2024] GLOP: Learning Global Partition and Local Construction for Solving Large-scale Routing Problems in Real-time

Language: Python - Size: 1.21 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 79 - Forks: 11

shivammehta25/Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Language: Jupyter Notebook - Size: 57.6 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 819 - Forks: 104

keonlee9420/DiffSinger

PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)

Language: Python - Size: 133 MB - Last synced at: 3 months ago - Pushed at: about 3 years ago - Stars: 233 - Forks: 30

keonlee9420/PortaSpeech

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

Language: Python - Size: 129 MB - Last synced at: 5 months ago - Pushed at: about 3 years ago - Stars: 331 - Forks: 36

keonlee9420/Parallel-Tacotron2

PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

Language: Python - Size: 99.3 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 189 - Forks: 45

keonlee9420/VAENAR-TTS

PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

Language: Python - Size: 122 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 72 - Forks: 14

jxzhangjhu/awesome-LLM-controlled-decoding-generation

awesome-LLM-controlled-constrained-generation

Size: 120 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 9 - Forks: 0

HKUNLP/diffusion-of-thoughts

Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"

Language: Python - Size: 4.05 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 10 - Forks: 0

aistairc/BERT-NAR-BERT

BERT-based pre-trained non-autoregressive sequence-to-sequence model

Language: Python - Size: 74.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

keonlee9420/Expressive-FastSpeech2

PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.

Language: Python - Size: 101 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 237 - Forks: 38

hemingkx/SpecDec

Code for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

Language: Python - Size: 7.22 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 0

keonlee9420/DailyTalk

Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023 (Oral)

Language: Python - Size: 102 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 155 - Forks: 13

keonlee9420/FastPitchFormant

PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis

Language: Python - Size: 101 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 69 - Forks: 13

keonlee9420/StyleSpeech

PyTorch Implementation of Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation

Language: Python - Size: 114 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 172 - Forks: 21

yzhangcs/ctc-copy

[EMNLP'23] Code for "Non-autoregressive Text Editing with Copy-aware Latent Alignments".

Language: Python - Size: 50.8 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 1

RistoAle97/ContinualNAT

M.Sc. thesis on Continual Learning for multilingual non-autoregressive Neural Machine Translation (NAT)

Language: Python - Size: 3.76 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

keonlee9420/WaveGrad2

PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

Language: Python - Size: 18 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 64 - Forks: 14

xcfcode/What-I-Have-Read

Paper lists, notes, and slides, with a focus on NLP. For summarization, please refer to https://github.com/xcfcode/Summarization-Papers

Size: 91.2 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 161 - Forks: 15

LARC-CMU-SMU/Enconter

Implementation of the EACL 2021 paper Enconter

Language: Jupyter Notebook - Size: 945 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

HKUNLP/reparam-discrete-diffusion

Reparameterized Discrete Diffusion Models for Text Generation

Language: Python - Size: 6.73 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 28 - Forks: 1

bearcatt/LaBERT

A length-controllable and non-autoregressive image captioning model.

Language: Python - Size: 34.2 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 56 - Forks: 10

keonlee9420/Daft-Exprt

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Language: Python - Size: 110 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 48 - Forks: 14

keonlee9420/Deep-Learning-TTS-Template

This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).

Language: Python - Size: 106 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 13 - Forks: 0

kan-bayashi/NonARSeq2SeqVC

Non-autoregressive sequence-to-sequence voice conversion

Size: 4.51 MB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 0

Related Keywords
non-autoregressive (34), tts (18), text-to-speech (18), speech-synthesis (18), pytorch (16), neural-tts (14), fastspeech (9), diffusion-models (6), non-ar (6), english (5), hifi-gan (5), deep-learning (5), end-to-end (4), natural-language-processing (4), vae (4), duration (4), self-attention (3), machine-learning (3), transformer (3), parallel-tacotron (3), generative-model (3), deep-neural-networks (3), nlp (2), unsupervised-learning (2), single-speaker (2), meta-learning (2), ddpm (2), parallel-tacotron2 (2), diffusion (2), gan (2), graph-neural-networks (2), multi-speaker (2), mel-gan (2), fastspeech2 (2), text-generation (2), sota (2), style (2), machine-translation (2), simultaneous-translation (2), ultimate-tts (2), language-model (2), summarization (2), llms (2), conversational-tts (2), unsupervised (2), text-to-audio (2), speaker (2), prosody (2), timbre (1), speaker-adaptation (1), meta-stylespeech (1), one-shot (1), neuro-symbolic (1), chain-of-thought-reasoning (1), diffusion-lm (1), mathematical-reasoning (1), bert (1), language-modelling (1), question-answering (1), sequence-to-sequence (1), conversational-speech-synthesis (1), emotional-speech-synthesis (1), emotional-tts (1), expressive-speech-synthesis (1), expressive-tts (1), korean-speech-synthesis (1), korean-tts (1), speculative-decoding (1), conversational-ai (1), conversational-data (1), dataset (1), tts-dataset (1), fastpitch (1), pitch (1), pitch-control (1), emnlp (1), generation (1), gnn (1), knowledge-distillation (1), naacl (1), notes (1), presentation (1), presentations (1), pretrain (1), slides (1), nlg (1), fairseq (1), python3 (1), controllable-image-captioning (1), eccv2020 (1), image-captioning (1), gaussian-upsampling (1), prosody-transfer (1), template (1), voice-conversion (1), speech-style (1), stylespeech (1), unseen-speaker (1), ctc (1), text-editing (1)