Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: audio-generation
mudler/LocalAI
🤖 The free, open-source OpenAI alternative. Self-hosted, community-driven, and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers, and many more model architectures. It can generate text, audio, video, and images, and also offers voice cloning.
Language: C++ - Size: 6.07 MB - Last synced: about 5 hours ago - Pushed: about 6 hours ago - Stars: 20,476 - Forks: 1,539
rsxdalv/tts-generation-webui
TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)
Language: TypeScript - Size: 27.5 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 1,334 - Forks: 143
metame-ai/awesome-audio-plaza
Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation
Size: 124 KB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 191 - Forks: 5
soham97/awesome-sound_event_detection
Reading list for research topics in Sound AI
Size: 136 KB - Last synced: 4 days ago - Pushed: 22 days ago - Stars: 136 - Forks: 7
mx-mark/SPMNet
Source code for "Visually aligned sound generation via sound-producing motion parsing" (published in Neurocomputing)
Size: 4.88 KB - Last synced: 8 days ago - Pushed: about 2 years ago - Stars: 1 - Forks: 0
lucidrains/soundstorm-pytorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch
Language: Python - Size: 334 KB - Last synced: 15 days ago - Pushed: 16 days ago - Stars: 1,119 - Forks: 77
haoheliu/AudioLDM2
Text-to-Audio/Music Generation
Language: Python - Size: 1.54 MB - Last synced: 18 days ago - Pushed: about 2 months ago - Stars: 2,054 - Forks: 160
haoheliu/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Language: Python - Size: 2.4 MB - Last synced: 18 days ago - Pushed: 6 months ago - Stars: 2,232 - Forks: 212
archinetai/audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
Language: Python - Size: 245 KB - Last synced: 19 days ago - Pushed: 11 months ago - Stars: 1,784 - Forks: 154
archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
Size: 69.3 KB - Last synced: 19 days ago - Pushed: 5 months ago - Stars: 1,865 - Forks: 66
archinetai/audio-data-pytorch
A collection of useful audio datasets and transforms for PyTorch.
Language: Python - Size: 47.9 KB - Last synced: 19 days ago - Pushed: over 1 year ago - Stars: 122 - Forks: 21
alibaba-damo-academy/FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.
Language: Python - Size: 1.46 MB - Last synced: 28 days ago - Pushed: 4 months ago - Stars: 274 - Forks: 21
mesudepolat/generative-ai
Various projects utilizing diverse generative AI techniques to produce audio, code, images, text, and Streamlit applications.
Language: Python - Size: 98.8 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0
declare-lab/tango
Hosts a family of diffusion models for text-to-audio generation.
Language: Python - Size: 17.7 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 880 - Forks: 68
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python - Size: 10.3 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 3,824 - Forks: 304
0417keito/JEN-1-COMPOSER-pytorch
Unofficial implementation of JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation (https://arxiv.org/abs/2310.19180)
Language: Python - Size: 105 MB - Last synced: 7 days ago - Pushed: 4 months ago - Stars: 24 - Forks: 2
archinetai/audio-diffusion-pytorch-trainer
Trainer for audio-diffusion-pytorch
Language: Python - Size: 264 KB - Last synced: 19 days ago - Pushed: over 1 year ago - Stars: 124 - Forks: 22
RoySheffer/im2wav
Official implementation of the pipeline presented in "I Hear Your True Colors: Image Guided Audio Generation"
Language: Python - Size: 23.7 MB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 95 - Forks: 9
Yuan-ManX/audio-development-tools
This is a list of sound, audio, and music development tools covering machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis, and more.
Size: 559 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 203 - Forks: 16
Yuan-ManX/ai-audio-datasets
This is a list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications. It is mainly used for speech recognition, speech synthesis, singing voice synthesis, music information retrieval, music generation, etc.
Size: 354 KB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 267 - Forks: 24
galgreshler/Catch-A-Waveform
Official pytorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)
Language: Python - Size: 255 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 174 - Forks: 32
cabralpinto/modular-diffusion
Python library for designing and training your own Diffusion Models with PyTorch.
Language: Python - Size: 31.1 MB - Last synced: about 1 month ago - Pushed: 7 months ago - Stars: 252 - Forks: 9
rsxdalv/bark-speaker-directory
Site for sharing Bark voices
Language: TypeScript - Size: 21.1 MB - Last synced: 7 days ago - Pushed: 7 months ago - Stars: 45 - Forks: 0
rsxdalv/musicgen-prompts
Site for sharing MusicGen + AudioGen Prompts and Creations
Language: TypeScript - Size: 24.4 MB - Last synced: 7 days ago - Pushed: 9 months ago - Stars: 37 - Forks: 4
happylittlecat2333/Auffusion
Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"
Language: Jupyter Notebook - Size: 23.9 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 106 - Forks: 10
ml-dev-world/sonic-synth
This repository is a comprehensive guide and toolkit for music generation, featuring diverse algorithms, deep learning models, and creative techniques to inspire and assist in the composition of unique musical pieces.
Size: 5.86 KB - Last synced: about 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0
sony/bigvsan
PyTorch implementation of BigVSAN
Language: Python - Size: 14.1 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 164 - Forks: 14
farahhuifanyang/Prelude-to-Listening
HKNME Concert (2023) Sanxian/sanshin, Ambisonics field recording, Generative AI
Size: 68.4 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0
Bai-YT/ConsistencyTTA
ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
Language: Python - Size: 3.83 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 8 - Forks: 0
v-iashin/SpecVQGAN
Source code for "Taming Visually Guided Sound Generation" (Oral at BMVC 2021)
Language: Jupyter Notebook - Size: 163 MB - Last synced: 2 months ago - Pushed: 12 months ago - Stars: 309 - Forks: 36
ilaria-manco/word2wave
Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.
Language: Python - Size: 1.15 MB - Last synced: about 1 month ago - Pushed: over 2 years ago - Stars: 116 - Forks: 16
Consistency-TTA/consistency-tta.github.io
Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
Language: HTML - Size: 144 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 7 - Forks: 0
sesac-google-ai-1st/video_factory
An AI-based all-in-one video production service, covering everything from scripting to dubbing and image generation
Language: Python - Size: 62.8 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 1 - Forks: 0
soham97/sound_ai_progress
Tracking the state of the art and recent results (bibliography) on sound tasks.
Size: 51.8 KB - Last synced: about 2 months ago - Pushed: over 1 year ago - Stars: 27 - Forks: 1
ewdlop/AI-Tools
Various AI Online Tools. AI is taking over.
Size: 34.2 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0
researchmm/MM-Diffusion
[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
Language: Python - Size: 4.17 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 286 - Forks: 21
MartianInGreen/BeeBrain
BeeBrain is your personal chatbot. Use tools, generate images, run code and so much more!
Language: Python - Size: 1.37 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 1 - Forks: 0
olaviinha/NeuralTextToAudio
Text prompt steered synthetic audio generators
Language: Jupyter Notebook - Size: 337 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 34 - Forks: 4
sean-e/OptForAudio
Utility to temporarily change Windows system settings for improved real time audio performance
Language: C++ - Size: 18.6 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 1 - Forks: 0
NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
Language: Python - Size: 14.1 MB - Last synced: 7 months ago - Pushed: about 1 year ago - Stars: 571 - Forks: 63
heng-hw/V2A-Mapper
V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models
Size: 338 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0
zassou65535/WaveGAN
Audio generator using WaveGAN
Language: Python - Size: 41 KB - Last synced: 7 months ago - Pushed: 10 months ago - Stars: 10 - Forks: 0
radoslawregula/VoxG
Singing voice synthesizer using GANs
Language: Python - Size: 145 KB - Last synced: 7 months ago - Pushed: about 1 year ago - Stars: 1 - Forks: 1
devaaravmishra/genius-saas
Genius-SaaS: An AI-powered SaaS application built with Next.js and React for personalized recommendations, dynamic content generation, and user behavior prediction. 🚀
Language: TypeScript - Size: 789 KB - Last synced: 4 months ago - Pushed: 10 months ago - Stars: 5 - Forks: 2
gregogiudici/Knowledge-Distillation_DDSP-Decoder
Knowledge Distillation of different DDSP Decoders for audio signal generation
Language: Python - Size: 191 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0
Lomasterrrr/Audio-Generator
Generates a sound from a given volume, frequency, and duration.
Language: C# - Size: 3 MB - Last synced: 8 months ago - Pushed: over 1 year ago - Stars: 5 - Forks: 1
ThomasRettig/chord-progression-to-midi
MIDI generator for chord progressions.
Language: Python - Size: 5.86 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0
Anas436/Image-to-Audio-App
Image Captioning and Text-to-Speech
Language: Python - Size: 2.15 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 0 - Forks: 0
leelabcnbc/predictive-coding-music-prediction
Code implementation for the paper "Relating Human Perception of Musicality to Prediction in a Predictive Coding Model"
Language: Python - Size: 39.1 MB - Last synced: 10 months ago - Pushed: over 1 year ago - Stars: 4 - Forks: 0
LumenPallidium/audio_generation
Experiments in neural networks for audio generation.
Language: Python - Size: 681 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 3 - Forks: 0
saba99/Text-To-Audio-ChatGPT
Text to Audio (voice, music), with ChatGPT support
Language: Python - Size: 0 Bytes - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
neeleshpandey/AutomatedNewsChannel
This is a piece of code that fetches news using an API and converts it into a news video
Language: Python - Size: 96.7 KB - Last synced: 11 months ago - Pushed: almost 3 years ago - Stars: 13 - Forks: 4
carlosholivan/audiolm-google-torch
Implementation of the AudioLM model by Google in PyTorch
Size: 420 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 3 - Forks: 1
carlosholivan/AudioGenerationDiffusion
State-of-the-art of Audio Generation with Diffusion Models
Size: 179 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 3 - Forks: 1
Yuan-ManX/ai-audio-processing-methods
AI audio processing methods
Language: Python - Size: 16.6 KB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0
0x7o/DeepMozart
Audio generation using diffusion models
Language: Python - Size: 6.84 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 1
klae01/ddim-audio
Denoising Diffusion Implicit Models
Language: Python - Size: 143 KB - Last synced: 12 months ago - Pushed: almost 2 years ago - Stars: 6 - Forks: 0