Topic: "text-to-audio"

open-mmlab/Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python - Size: 126 MB - Last synced at: 4 days ago - Pushed at: 15 days ago - Stars: 8,975 - Forks: 702

hkchengrex/MMAudio

[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Language: Python - Size: 9.07 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 1,388 - Forks: 165

declare-lab/tango

A family of diffusion models for text-to-audio generation.

Language: Python - Size: 19.5 MB - Last synced at: 5 months ago - Pushed at: 10 months ago - Stars: 1,086 - Forks: 88

ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Language: Python - Size: 18.2 MB - Last synced at: 18 days ago - Pushed at: 8 months ago - Stars: 1,053 - Forks: 80

gitmylo/audio-webui

A web UI for various audio-related neural networks

Language: Python - Size: 715 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 994 - Forks: 92

declare-lab/TangoFlux

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching

Language: Jupyter Notebook - Size: 10.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 664 - Forks: 59

Text-to-Audio/Make-An-Audio

PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model

Language: Python - Size: 961 KB - Last synced at: 9 days ago - Pushed at: 11 months ago - Stars: 646 - Forks: 87

lucidrains/nuwa-pytorch

Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch

Language: Python - Size: 1.78 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 546 - Forks: 56

ivcylc/OpenMusic

OpenMusic: SOTA Text-to-music (TTM) Generation

Language: Python - Size: 2.12 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 529 - Forks: 52

YingqingHe/Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

Language: HTML - Size: 12.7 MB - Last synced at: 12 days ago - Pushed at: 23 days ago - Stars: 455 - Forks: 26

AMAAI-Lab/mustango

Mustango: Toward Controllable Text-to-Music Generation

Language: Python - Size: 54.2 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 357 - Forks: 29

happylittlecat2333/Auffusion

Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"

Language: Jupyter Notebook - Size: 23.9 MB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 156 - Forks: 12

ilaria-manco/word2wave

Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.

Language: Python - Size: 1.15 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 119 - Forks: 15

bnsantoso/sub-to-audio

Subtitle to audio, generate audio from any subtitle file using Coqui-ai TTS and synchronize the audio timing according to subtitle time.

Language: Python - Size: 99.6 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 105 - Forks: 13

sony/soundctm

PyTorch implementation of SoundCTM

Language: Python - Size: 3.76 MB - Last synced at: 20 days ago - Pushed at: 27 days ago - Stars: 86 - Forks: 7

keonlee9420/WaveGrad2

PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

Language: Python - Size: 18 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 64 - Forks: 14

serp-ai/ai-text-to-audio-latent-diffusion (fork of Harmonai-org/sample-generator)

text-to-audio-latent-diffusion

Language: Python - Size: 58.2 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 35 - Forks: 8

RhythrosaLabs/soundstorm

Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.

Language: Python - Size: 3.39 MB - Last synced at: 21 days ago - Pushed at: 12 months ago - Stars: 32 - Forks: 8

PapayaResearch/ctag

Creative Text-to-Audio Generation via Synthesizer Programming @ ICML'24

Language: Python - Size: 109 KB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 21 - Forks: 2

camenduru/audioldm-colab

AudioLDM text to audio colab

Language: Jupyter Notebook - Size: 24.4 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 19 - Forks: 3

kennethleungty/Text-to-Audio-with-Bark

Exploring Bark, the Open-Source Text-to-Audio Generative Model

Language: Jupyter Notebook - Size: 2.67 MB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 15 - Forks: 4

hkchengrex/av-benchmark

Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs, LAION-CLAP, MS-CLAP, DeSync

Language: Python - Size: 106 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 10 - Forks: 0

XiaomingX/awesome-ai-tools-for-game-dev

Awesome AI Tools for Game Development: A curated collection of the best AI tools, libraries, and resources to enhance game development workflows. From procedural content generation to NPC behavior, this repository gathers state-of-the-art AI solutions for game developers.

Size: 8.79 KB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 10 - Forks: 0

Consistency-TTA/consistency-tta.github.io

Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation

Language: HTML - Size: 151 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 7 - Forks: 0

ahsplore/TalkitOut-TTS-web-application-python

TalkItOut is a Python and Flask-based web application that converts text to speech, lets you choose the language for audio output, provides a built-in dictionary for word meanings, and extracts text from images with audio generation.

Language: HTML - Size: 9.13 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 4 - Forks: 2

Djmcflush/RaveFussion

A text-to-audio pipeline using Riffusion (a fine-tuned Stable Diffusion model) and RAVE, an audio-to-audio autoencoder.

Language: Python - Size: 8.98 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 3

Ate329/SentiMusic

A text-to-audio application that turns words and sentiments into melodies.

Language: Python - Size: 3.38 MB - Last synced at: 18 days ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

vishalnagda1/text-to-speech

Python program to convert text to speech.

Language: Python - Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 16

Yazdi9/Text-To-Audio-ChatGPT

Text to Audio (Voice, Music) with ChatGPT support

Language: Python - Size: 3.54 MB - Last synced at: 8 months ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 1

artinmohajeri/tkinter-text-to-voice

Language: Python - Size: 49.8 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

mohaimenulislamshawon/text-to-voice-speech-converter

This program is based on Google Text-to-Speech. It can convert text to speech in the top 20 languages. It was made for educational and experimental purposes.

Language: HTML - Size: 12.7 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

Abdelhakim-gh/GenAI_Fusion_Multimodale

Workshop for a multimodal media generator

Language: Jupyter Notebook - Size: 23.9 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

ivan-guerra/morse

A text to Morse code translator

Language: Rust - Size: 228 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

dimitreOliveira/GenAI-GeoGuesser

Generative AI version of the GeoGuesser game.

Language: Python - Size: 453 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

Kartiksood10/Text-to-Music-Generation-App

Generate music from natural language prompts using Meta's MusicGen Small model.

Language: Python - Size: 11.7 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

brayanjeshua/chatgpt-to-speech

ChatGPT Text-to-Speech Application

Language: JavaScript - Size: 9.77 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

anverbogatov/text-to-audio-tool

Simple Java-based console tool that transforms text files into audio files.

Language: Java - Size: 32.2 KB - Last synced at: about 2 months ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

techguy940/text-to-speech

Text-to-Speech

Language: Python - Size: 2.93 KB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1

vpakarinen/mmaudio-webui

WebUI for MMAudio Video-to-Audio and Text-to-Audio.

Language: Python - Size: 62.5 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

yashu1wwww/Multi-Language-Text-to-Speech-MP3-Downloader

Multi-Language Text-to-Speech MP3 Downloader Using the Google TTS API Library

Language: JavaScript - Size: 34.2 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 1

anandsuraj/ai-content-creation-reels

A Flask-based web application that leverages AI to generate various types of content including photo quotes, video reels, voice videos, and avatar videos.

Language: Python - Size: 21.5 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

01one/tts-server-python

Text-to-speech server in Python with a simple Docker setup.

Language: HTML - Size: 54.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

sitamgithub-MSIT/tangoflux-litserve

Leverage TangoFlux's text-to-audio capabilities using LitServe.

Language: Python - Size: 255 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

armanjscript/Finetuning-LLMs

Fine-tuning large language models

Language: Jupyter Notebook - Size: 1.1 MB - Last synced at: 22 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

GAURITYAGI1/Text-to-Audio.Converter

Text⏩Audio Converter

Language: CSS - Size: 27.3 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

darylalim/generate-audio-audiocraft-audiogen

Generate audio from text with AudioCraft AudioGen.

Language: Jupyter Notebook - Size: 7.74 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Ajmal112/texttospeech

The Text-to-Speech website is a testing API project that enables users to effortlessly convert text or sentences into MP3 audio files. With its user-friendly interface, users can simply input their desired text, initiate the conversion process, and obtain an audio file in seconds, facilitating convenient access to spoken content from written text.

Language: HTML - Size: 5.86 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

OpenGenus/audio

Tool to convert an article to a summarized audio version [developed by OG Intern Ambarish Deb]

Language: Python - Size: 7.1 MB - Last synced at: 8 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1

mython-dev/text-to-audio-converter

Convert text to audio using Telebot and gTTS

Language: Python - Size: 128 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0