Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: audio-generation

mudler/LocalAI

The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. It can generate text, audio, video, and images, and also offers voice cloning.

Language: C++ - Size: 6.07 MB - Last synced: about 5 hours ago - Pushed: about 6 hours ago - Stars: 20,476 - Forks: 1,539

rsxdalv/tts-generation-webui

TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)

Language: TypeScript - Size: 27.5 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 1,334 - Forks: 143

metame-ai/awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation

Size: 124 KB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 191 - Forks: 5

soham97/awesome-sound_event_detection

Reading list for research topics in Sound AI

Size: 136 KB - Last synced: 4 days ago - Pushed: 22 days ago - Stars: 136 - Forks: 7

mx-mark/SPMNet

Source code for "Visually aligned sound generation via sound-producing motion parsing" (published in Neurocomputing)

Size: 4.88 KB - Last synced: 8 days ago - Pushed: about 2 years ago - Stars: 1 - Forks: 0

lucidrains/soundstorm-pytorch

Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch

Language: Python - Size: 334 KB - Last synced: 15 days ago - Pushed: 16 days ago - Stars: 1,119 - Forks: 77

haoheliu/AudioLDM2

Text-to-Audio/Music Generation

Language: Python - Size: 1.54 MB - Last synced: 18 days ago - Pushed: about 2 months ago - Stars: 2,054 - Forks: 160

haoheliu/AudioLDM

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Language: Python - Size: 2.4 MB - Last synced: 18 days ago - Pushed: 6 months ago - Stars: 2,232 - Forks: 212

archinetai/audio-diffusion-pytorch

Audio generation using diffusion models, in PyTorch.

Language: Python - Size: 245 KB - Last synced: 19 days ago - Pushed: 11 months ago - Stars: 1,784 - Forks: 154

archinetai/audio-ai-timeline

A timeline of the latest AI models for audio generation, starting in 2023!

Size: 69.3 KB - Last synced: 19 days ago - Pushed: 5 months ago - Stars: 1,865 - Forks: 66

archinetai/audio-data-pytorch

A collection of useful audio datasets and transforms for PyTorch.

Language: Python - Size: 47.9 KB - Last synced: 19 days ago - Pushed: over 1 year ago - Stars: 122 - Forks: 21

alibaba-damo-academy/FunCodec

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.

Language: Python - Size: 1.46 MB - Last synced: 28 days ago - Pushed: 4 months ago - Stars: 274 - Forks: 21

mesudepolat/generative-ai

Various projects utilizing diverse generative AI techniques to produce audio, code, images, text, and Streamlit applications.

Language: Python - Size: 98.8 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0

declare-lab/tango

Hosts a family of diffusion models for text-to-audio generation.

Language: Python - Size: 17.7 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 880 - Forks: 68

open-mmlab/Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python - Size: 10.3 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 3,824 - Forks: 304

0417keito/JEN-1-COMPOSER-pytorch

Unofficial implementation of JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation (https://arxiv.org/abs/2310.19180)

Language: Python - Size: 105 MB - Last synced: 7 days ago - Pushed: 4 months ago - Stars: 24 - Forks: 2

archinetai/audio-diffusion-pytorch-trainer

Trainer for audio-diffusion-pytorch

Language: Python - Size: 264 KB - Last synced: 19 days ago - Pushed: over 1 year ago - Stars: 124 - Forks: 22

RoySheffer/im2wav

Official implementation of the pipeline presented in "I Hear Your True Colors: Image Guided Audio Generation"

Language: Python - Size: 23.7 MB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 95 - Forks: 9

Yuan-ManX/audio-development-tools

This is a list of sound, audio and music development tools which contains machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis and more.

Size: 559 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 203 - Forks: 16

Yuan-ManX/ai-audio-datasets

This is a list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications. It is mainly used for speech recognition, speech synthesis, singing voice synthesis, music information retrieval, music generation, etc.

Size: 354 KB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 267 - Forks: 24

galgreshler/Catch-A-Waveform

Official pytorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)

Language: Python - Size: 255 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 174 - Forks: 32

cabralpinto/modular-diffusion

Python library for designing and training your own Diffusion Models with PyTorch.

Language: Python - Size: 31.1 MB - Last synced: about 1 month ago - Pushed: 7 months ago - Stars: 252 - Forks: 9

rsxdalv/bark-speaker-directory

Site for sharing Bark voices

Language: TypeScript - Size: 21.1 MB - Last synced: 7 days ago - Pushed: 7 months ago - Stars: 45 - Forks: 0

rsxdalv/musicgen-prompts

Site for sharing MusicGen + AudioGen Prompts and Creations

Language: TypeScript - Size: 24.4 MB - Last synced: 7 days ago - Pushed: 9 months ago - Stars: 37 - Forks: 4

happylittlecat2333/Auffusion

Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"

Language: Jupyter Notebook - Size: 23.9 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 106 - Forks: 10

ml-dev-world/sonic-synth

This repository is a comprehensive guide and toolkit for music generation, featuring diverse algorithms, deep learning models, and creative techniques to inspire and assist in the composition of unique musical pieces.

Size: 5.86 KB - Last synced: about 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

sony/bigvsan

PyTorch implementation of BigVSAN

Language: Python - Size: 14.1 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 164 - Forks: 14

farahhuifanyang/Prelude-to-Listening

HKNME Concert (2023) Sanxian/sanshin, Ambisonics field recording, Generative AI

Size: 68.4 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

Bai-YT/ConsistencyTTA

ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation

Language: Python - Size: 3.83 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 8 - Forks: 0

v-iashin/SpecVQGAN

Source code for "Taming Visually Guided Sound Generation" (Oral at BMVC 2021)

Language: Jupyter Notebook - Size: 163 MB - Last synced: 2 months ago - Pushed: 12 months ago - Stars: 309 - Forks: 36

ilaria-manco/word2wave

Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.

Language: Python - Size: 1.15 MB - Last synced: about 1 month ago - Pushed: over 2 years ago - Stars: 116 - Forks: 16

Consistency-TTA/consistency-tta.github.io

Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation

Language: HTML - Size: 144 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 7 - Forks: 0

sesac-google-ai-1st/video_factory

An AI-based all-in-one video production service, covering everything from script writing to dubbing and image generation

Language: Python - Size: 62.8 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 1 - Forks: 0

soham97/sound_ai_progress

Tracking the state of the art and recent results (bibliography) on sound tasks.

Size: 51.8 KB - Last synced: about 2 months ago - Pushed: over 1 year ago - Stars: 27 - Forks: 1

ewdlop/AI-Tools

Various AI Online Tools. AI is taking over.

Size: 34.2 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

researchmm/MM-Diffusion

[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

Language: Python - Size: 4.17 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 286 - Forks: 21

MartianInGreen/BeeBrain

BeeBrain is your personal chatbot. Use tools, generate images, run code and so much more!

Language: Python - Size: 1.37 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 1 - Forks: 0

olaviinha/NeuralTextToAudio

Text prompt steered synthetic audio generators

Language: Jupyter Notebook - Size: 337 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 34 - Forks: 4

sean-e/OptForAudio

Utility to temporarily change Windows system settings for improved real time audio performance

Language: C++ - Size: 18.6 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 1 - Forks: 0

NVIDIA/BigVGAN

Official PyTorch implementation of BigVGAN (ICLR 2023)

Language: Python - Size: 14.1 MB - Last synced: 7 months ago - Pushed: about 1 year ago - Stars: 571 - Forks: 63

heng-hw/V2A-Mapper

V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models

Size: 338 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

zassou65535/WaveGAN

Audio generator using WaveGAN

Language: Python - Size: 41 KB - Last synced: 7 months ago - Pushed: 10 months ago - Stars: 10 - Forks: 0

radoslawregula/VoxG

Singing voice synthesizer using GANs

Language: Python - Size: 145 KB - Last synced: 7 months ago - Pushed: about 1 year ago - Stars: 1 - Forks: 1

devaaravmishra/genius-saas

Genius-SaaS: An AI-powered SaaS application built with Next.js and React for personalized recommendations, dynamic content generation, and user behavior prediction. 🚀

Language: TypeScript - Size: 789 KB - Last synced: 4 months ago - Pushed: 10 months ago - Stars: 5 - Forks: 2

gregogiudici/Knowledge-Distillation_DDSP-Decoder

Knowledge Distillation of different DDSP Decoders for audio signal generation

Language: Python - Size: 191 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

Lomasterrrr/Audio-Generator

Generates a sound given: volume, frequency, duration!

Language: C# - Size: 3 MB - Last synced: 8 months ago - Pushed: over 1 year ago - Stars: 5 - Forks: 1

ThomasRettig/chord-progression-to-midi

MIDI generator for chord progressions.

Language: Python - Size: 5.86 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

Anas436/Image-to-Audio-App

Image Captioning and Text-to-Speech

Language: Python - Size: 2.15 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0

leelabcnbc/predictive-coding-music-prediction

Code implementation for the paper "Relating Human Perception of Musicality to Prediction in a Predictive Coding Model"

Language: Python - Size: 39.1 MB - Last synced: 10 months ago - Pushed: over 1 year ago - Stars: 4 - Forks: 0

LumenPallidium/audio_generation

Experiments in neural networks for audio generation.

Language: Python - Size: 681 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 3 - Forks: 0

saba99/Text-To-Audio-ChatGPT

Text to audio (voice, music), with ChatGPT support

Language: Python - Size: 0 Bytes - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

neeleshpandey/AutomatedNewsChannel

Code that fetches news using an API and converts it into a news video

Language: Python - Size: 96.7 KB - Last synced: 11 months ago - Pushed: almost 3 years ago - Stars: 13 - Forks: 4

carlosholivan/audiolm-google-torch

Implementation of the AudioLM model by Google in PyTorch

Size: 420 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 3 - Forks: 1

carlosholivan/AudioGenerationDiffusion

State-of-the-art of Audio Generation with Diffusion Models

Size: 179 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 3 - Forks: 1

Yuan-ManX/ai-audio-processing-methods

AI audio processing methods

Language: Python - Size: 16.6 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

0x7o/DeepMozart

Audio generation using diffusion models

Language: Python - Size: 6.84 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 1

klae01/ddim-audio

Denoising Diffusion Implicit Models

Language: Python - Size: 143 KB - Last synced: 12 months ago - Pushed: almost 2 years ago - Stars: 6 - Forks: 0