Topic: "text-to-audio"
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python - Size: 126 MB - Last synced at: 4 days ago - Pushed at: 15 days ago - Stars: 8,975 - Forks: 702

hkchengrex/MMAudio
[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Language: Python - Size: 9.07 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 1,388 - Forks: 165

declare-lab/tango
A family of diffusion models for text-to-audio generation.
Language: Python - Size: 19.5 MB - Last synced at: 5 months ago - Pushed at: 10 months ago - Stars: 1,086 - Forks: 88

ictnlp/StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Language: Python - Size: 18.2 MB - Last synced at: 18 days ago - Pushed at: 8 months ago - Stars: 1,053 - Forks: 80

gitmylo/audio-webui
A web UI for various audio-related neural networks
Language: Python - Size: 715 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 994 - Forks: 92

declare-lab/TangoFlux
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching
Language: Jupyter Notebook - Size: 10.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 664 - Forks: 59

Text-to-Audio/Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
Language: Python - Size: 961 KB - Last synced at: 9 days ago - Pushed at: 11 months ago - Stars: 646 - Forks: 87

lucidrains/nuwa-pytorch
Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch
Language: Python - Size: 1.78 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 546 - Forks: 56

ivcylc/OpenMusic
OpenMusic: SOTA Text-to-music (TTM) Generation
Language: Python - Size: 2.12 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 529 - Forks: 52

YingqingHe/Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).
Language: HTML - Size: 12.7 MB - Last synced at: 12 days ago - Pushed at: 23 days ago - Stars: 455 - Forks: 26

AMAAI-Lab/mustango
Mustango: Toward Controllable Text-to-Music Generation
Language: Python - Size: 54.2 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 357 - Forks: 29

happylittlecat2333/Auffusion
Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"
Language: Jupyter Notebook - Size: 23.9 MB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 156 - Forks: 12

ilaria-manco/word2wave
Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.
Language: Python - Size: 1.15 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 119 - Forks: 15

bnsantoso/sub-to-audio
Subtitle to audio: generate audio from any subtitle file using Coqui-ai TTS and synchronize the audio timing with the subtitle timestamps.
Language: Python - Size: 99.6 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 105 - Forks: 13

sony/soundctm
PyTorch implementation of SoundCTM
Language: Python - Size: 3.76 MB - Last synced at: 20 days ago - Pushed at: 27 days ago - Stars: 86 - Forks: 7

keonlee9420/WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Language: Python - Size: 18 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 64 - Forks: 14

serp-ai/ai-text-to-audio-latent-diffusion (fork of Harmonai-org/sample-generator)
text-to-audio-latent-diffusion
Language: Python - Size: 58.2 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 35 - Forks: 8

RhythrosaLabs/soundstorm
Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.
Language: Python - Size: 3.39 MB - Last synced at: 21 days ago - Pushed at: 12 months ago - Stars: 32 - Forks: 8

PapayaResearch/ctag
Creative Text-to-Audio Generation via Synthesizer Programming @ ICML'24
Language: Python - Size: 109 KB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 21 - Forks: 2

camenduru/audioldm-colab
AudioLDM text to audio colab
Language: Jupyter Notebook - Size: 24.4 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 19 - Forks: 3

kennethleungty/Text-to-Audio-with-Bark
Exploring Bark, the Open-Source Text-to-Audio Generative Model
Language: Jupyter Notebook - Size: 2.67 MB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 15 - Forks: 4

hkchengrex/av-benchmark
Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs, LAION-CLAP, MS-CLAP, DeSync
Language: Python - Size: 106 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 10 - Forks: 0

XiaomingX/awesome-ai-tools-for-game-dev
Awesome AI Tools for Game Development: A curated collection of the best AI tools, libraries, and resources to enhance game development workflows. From procedural content generation to NPC behavior, this repository gathers state-of-the-art AI solutions for game developers.
Size: 8.79 KB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 10 - Forks: 0

Consistency-TTA/consistency-tta.github.io
Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
Language: HTML - Size: 151 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 7 - Forks: 0

ahsplore/TalkitOut-TTS-web-application-python
TalkItOut is a Python and Flask-based web application that converts text to speech, lets you choose your preferred language for audio output, provides a built-in dictionary for word meanings, and even extracts text from images, complete with audio generation.
Language: HTML - Size: 9.13 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 4 - Forks: 2

Djmcflush/RaveFussion
A text-to-audio pipeline using Riffusion (a fine-tuned Stable Diffusion model) and RAVE, an audio-to-audio autoencoder.
Language: Python - Size: 8.98 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 3

Ate329/SentiMusic
A text-to-audio application that turns words and sentiments into melodies.
Language: Python - Size: 3.38 MB - Last synced at: 18 days ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

vishalnagda1/text-to-speech
Python program to convert text to speech.
Language: Python - Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 16

Yazdi9/Text-To-Audio-ChatGPT
Text to Audio (Voice, Music) - Supports ChatGPT
Language: Python - Size: 3.54 MB - Last synced at: 8 months ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 1

artinmohajeri/tkinter-text-to-voice
Language: Python - Size: 49.8 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

mohaimenulislamshawon/text-to-voice-speech-converter
This program is based on Google's text-to-speech (voice converter) engine. It can convert text in the top 20 languages. I made it for educational and experimental purposes.
Language: HTML - Size: 12.7 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

Abdelhakim-gh/GenAI_Fusion_Multimodale
Workshop for a multimodal media generator
Language: Jupyter Notebook - Size: 23.9 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

ivan-guerra/morse
A text to Morse code translator
Language: Rust - Size: 228 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

dimitreOliveira/GenAI-GeoGuesser
Generative AI version of the GeoGuesser game.
Language: Python - Size: 453 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

Kartiksood10/Text-to-Music-Generation-App
Generate music from natural language prompts using Meta's MusicGen Small model.
Language: Python - Size: 11.7 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

brayanjeshua/chatgpt-to-speech
CHATGPT Text-to-Speech Application
Language: JavaScript - Size: 9.77 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

anverbogatov/text-to-audio-tool
Simple Java based console tool that transforms your text files to audio files.
Language: Java - Size: 32.2 KB - Last synced at: about 2 months ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

techguy940/text-to-speech
Text-to-Speech
Language: Python - Size: 2.93 KB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1

vpakarinen/mmaudio-webui
WebUI for MMAudio Video-to-Audio and Text-to-Audio.
Language: Python - Size: 62.5 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

yashu1wwww/Multi-Language-Text-to-Speech-MP3-Downloader
Multi-Language Text-to-Speech MP3 Downloader Using the Google TTS API Library
Language: JavaScript - Size: 34.2 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 1

anandsuraj/ai-content-creation-reels
A Flask-based web application that leverages AI to generate various types of content including photo quotes, video reels, voice videos, and avatar videos.
Language: Python - Size: 21.5 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

01one/tts-server-python
Text-to-speech server with Python. Simple Docker setup.
Language: HTML - Size: 54.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

sitamgithub-MSIT/tangoflux-litserve
Leverage TangoFlux's text-to-audio capabilities using LitServe.
Language: Python - Size: 255 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

armanjscript/Finetuning-LLMs
Fine Tuning large language models
Language: Jupyter Notebook - Size: 1.1 MB - Last synced at: 22 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

GAURITYAGI1/Text-to-Audio.Converter
Text⏩Audio Converter
Language: CSS - Size: 27.3 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

darylalim/generate-audio-audiocraft-audiogen
Generate audio from text with AudioCraft AudioGen.
Language: Jupyter Notebook - Size: 7.74 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Ajmal112/texttospeech
The Text-to-Speech website is a testing API project that enables users to effortlessly convert text or sentences into MP3 audio files. With its user-friendly interface, users can simply input their desired text, initiate the conversion process, and obtain an audio file in seconds, facilitating convenient access to spoken content from written text.
Language: HTML - Size: 5.86 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

OpenGenus/audio
Tool to convert an article to a summarized audio version [developed by OG Intern Ambarish Deb]
Language: Python - Size: 7.1 MB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1

mython-dev/text-to-audio-converter
Convert text to audio using Telebot and gTTS
Language: Python - Size: 128 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0
