Topic: "audio-generation"
mudler/LocalAI
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference
Language: Go - Size: 18.2 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 32,451 - Forks: 2,470

FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Language: Python - Size: 1.5 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 13,629 - Forks: 1,381

open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python - Size: 126 MB - Last synced at: 18 days ago - Pushed at: 29 days ago - Stars: 8,975 - Forks: 702

multimodal-art-projection/YuE
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
Language: Python - Size: 32.8 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 4,945 - Forks: 541

haoheliu/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Language: Python - Size: 2.34 MB - Last synced at: 16 days ago - Pushed at: 5 months ago - Stars: 2,627 - Forks: 234

haoheliu/AudioLDM2
Text-to-Audio/Music Generation
Language: Python - Size: 3.73 MB - Last synced at: 1 day ago - Pushed at: 7 months ago - Stars: 2,419 - Forks: 190

rsxdalv/tts-generation-webui
TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5, F5-TTS, ParlerTTS)
Language: TypeScript - Size: 32.6 MB - Last synced at: 2 days ago - Pushed at: 4 days ago - Stars: 2,153 - Forks: 229

archinetai/audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
Language: Python - Size: 245 KB - Last synced at: 28 days ago - Pushed at: almost 2 years ago - Stars: 2,035 - Forks: 173

archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
Size: 69.3 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 1,897 - Forks: 71

lucidrains/soundstorm-pytorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch
Language: Python - Size: 264 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 1,492 - Forks: 92

declare-lab/tango
A family of diffusion models for text-to-audio generation.
Language: Python - Size: 19.5 MB - Last synced at: 6 months ago - Pushed at: 10 months ago - Stars: 1,086 - Forks: 88

FunAudioLLM/InspireMusic
InspireMusic: A Unified Framework for Music, Song, Audio Generation.
Language: Python - Size: 3.64 MB - Last synced at: 3 days ago - Pushed at: 9 days ago - Stars: 1,084 - Forks: 100

NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
Language: Python - Size: 19.9 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 991 - Forks: 127

Yuan-ManX/ai-audio-datasets
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
Size: 1.28 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 665 - Forks: 54

modelscope/FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.
Language: Python - Size: 1.46 MB - Last synced at: 26 days ago - Pushed at: over 1 year ago - Stars: 396 - Forks: 33

metame-ai/awesome-audio-plaza
Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation
Size: 358 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 382 - Forks: 17

v-iashin/SpecVQGAN
Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
Language: Jupyter Notebook - Size: 163 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 360 - Forks: 39

Yuan-ManX/audio-development-tools
This is a list of sound, audio and music development tools which contains machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis and more.
Size: 2.18 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 346 - Forks: 24

researchmm/MM-Diffusion
[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
Language: Python - Size: 4.18 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 346 - Forks: 23

cabralpinto/modular-diffusion
Python library for designing and training your own Diffusion Models with PyTorch.
Language: Python - Size: 31.1 MB - Last synced at: 29 days ago - Pushed at: 10 months ago - Stars: 279 - Forks: 15

sony/bigvsan
Pytorch implementation of BigVSAN
Language: Python - Size: 14.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 203 - Forks: 18

happylittlecat2333/Auffusion
Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"
Language: Jupyter Notebook - Size: 23.9 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 182 - Forks: 13

soham97/awesome-sound_event_detection
Reading list for research topics in Sound AI
Size: 145 KB - Last synced at: about 17 hours ago - Pushed at: 9 months ago - Stars: 180 - Forks: 8

galgreshler/Catch-A-Waveform
Official pytorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)
Language: Python - Size: 255 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 174 - Forks: 32

leopiney/neuralnoise
The AI Podcast Studio: generate podcast scripts and their audio version with a team of AI workers in a Podcast Studio 🎙️📜
Language: Python - Size: 23.7 MB - Last synced at: 10 days ago - Pushed at: 2 months ago - Stars: 168 - Forks: 19

devnen/Dia-TTS-Server
Self-host the powerful Dia TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), support for SafeTensors/BF16, voice cloning, dialogue generation, and GPU/CPU execution.
Language: Python - Size: 31.2 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 147 - Forks: 27

archinetai/audio-data-pytorch
A collection of useful audio datasets and transforms for PyTorch.
Language: Python - Size: 47.9 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 138 - Forks: 22

archinetai/audio-diffusion-pytorch-trainer
Trainer for audio-diffusion-pytorch
Language: Python - Size: 264 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 129 - Forks: 22

ilaria-manco/word2wave
Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.
Language: Python - Size: 1.15 MB - Last synced at: 6 months ago - Pushed at: over 3 years ago - Stars: 119 - Forks: 15

RoySheffer/im2wav
Official implementation of the pipeline presented in I hear your true colors: Image Guided Audio Generation
Language: Python - Size: 23.7 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 95 - Forks: 9

sony/soundctm
Pytorch implementation of SoundCTM
Language: Python - Size: 3.76 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 86 - Forks: 7

rsxdalv/bark-speaker-directory
Site for sharing Bark voices
Language: TypeScript - Size: 21.1 MB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 51 - Forks: 0

olaviinha/NeuralTextToAudio
Text prompt steered synthetic audio generators
Language: Jupyter Notebook - Size: 337 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 46 - Forks: 7

JavisDiT/JavisDiT
Official implementation of "JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization"
Language: Python - Size: 53.4 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 43 - Forks: 2

rsxdalv/musicgen-prompts
Site for sharing MusicGen + AudioGen Prompts and Creations
Language: TypeScript - Size: 24.4 MB - Last synced at: 2 days ago - Pushed at: about 2 months ago - Stars: 42 - Forks: 5

PeiwenSun2000/Both-Ears-Wide-Open
The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
Language: Python - Size: 6.78 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 37 - Forks: 3

pollinations-ai/pollinations.ai
Work with the best generative AI from Pollinations using this Python SDK. 🐝
Language: Python - Size: 11.2 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 37 - Forks: 5

Yuanshi9815/LiteFocus
[Interspeech 2024] LiteFocus is a tool designed to accelerate diffusion-based text-to-audio (TTA) models, now implemented with the base model AudioLDM2.
Language: Python - Size: 805 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 33 - Forks: 0

Bai-YT/ConsistencyTTA
ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
Language: Python - Size: 5.45 MB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 32 - Forks: 0

soham97/sound_ai_progress
Tracking the state of the art and recent results (bibliography) on sound tasks.
Size: 51.8 KB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 28 - Forks: 1

0417keito/JEN-1-COMPOSER-pytorch
Unofficial implementation of JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation (https://arxiv.org/abs/2310.19180)
Language: Python - Size: 105 MB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 24 - Forks: 2

Warma10032/easytts
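Building the simplest collection of TTS front ends and the simplest audiobook production workflow. Splits novels into sentences with regex rules and uses RoBERTa to identify the speaker of each line of dialogue, enabling one-click generation of multi-speaker audiobooks. Multi-speaker speech synthesis and high-quality audiobook production.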
Language: Python - Size: 25.3 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 20 - Forks: 5

JosefAlbers/Aggressor
Ultra-minimal autoregressive diffusion model for image generation
Language: Python - Size: 2.36 MB - Last synced at: 22 days ago - Pushed at: 7 months ago - Stars: 18 - Forks: 2

LJungang/Awesome-Omni-Large-Models-and-Datasets
🔥 Omni large models and datasets for understanding and generating multi-modalities.
Size: 53.7 KB - Last synced at: 4 days ago - Pushed at: 7 months ago - Stars: 15 - Forks: 0

chenjianyi/fastsag
FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation
Language: Python - Size: 580 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 13 - Forks: 4

neeleshpandey/AutomatedNewsChannel
This is a piece of code that fetches news using an API and converts it into a news video
Language: Python - Size: 96.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 13 - Forks: 4

Ceaglex/LoVA
The code and weight for LoVA. LoVA is a novel model for Long-form Video-to-Audio generation. Based on the Diffusion Transformer (DiT) architecture, LoVA proves to be more effective at generating long-form audio compared to existing autoregressive models and UNet-based diffusion models.
Language: Python - Size: 3.19 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 12 - Forks: 2

JinjieNi/MixEval-X
The official github repo for MixEval-X, the first any-to-any, real-world benchmark.
Language: Python - Size: 1.24 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 12 - Forks: 0

zassou65535/WaveGAN
Audio generator using WaveGAN
Language: Python - Size: 41 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 10 - Forks: 0

bean980310/stable-diffusion-docker-project
Stable Diffusion WebUI and KohyaSS, ComfyUI, InvokeAI, Fooocus, and more Generative AI on Docker
Language: Jupyter Notebook - Size: 574 KB - Last synced at: 19 days ago - Pushed at: 29 days ago - Stars: 9 - Forks: 2

merekat/children-stories
OhanashiGPT is an application that generates personalized children's stories based on parameters like age and preferences. It narrates these stories using an AI-generated voice that mimics a parent, trained on their audio samples. The app also creates illustrations to accompany each story, providing a unique and engaging experience for children.
Language: Jupyter Notebook - Size: 122 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 9 - Forks: 0

Consistency-TTA/consistency-tta.github.io
Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
Language: HTML - Size: 151 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 7 - Forks: 0

gianpaj/sexyvoice
Voice cloning and Text to Speech platform. Perfect for content creators, developers, and storytellers.
Language: TypeScript - Size: 2.46 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 6 - Forks: 2

shuklabhay/stereo-sample-gan
StereoSampleGAN: A computationally inexpensive approach to high-fidelity stereo audio sample generation.
Language: Python - Size: 73.1 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 6 - Forks: 0

klae01/ddim-audio
Denoising Diffusion Implicit Models
Language: Python - Size: 143 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 6 - Forks: 0

LumenPallidium/audio_generation
Experiments in neural networks for audio generation.
Language: Python - Size: 682 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 5 - Forks: 0

devaaravmishra/genius-saas
Genius-SaaS: An AI-powered SaaS application built with Next.js and React for personalized recommendations, dynamic content generation, and user behavior prediction. 🚀
Language: TypeScript - Size: 789 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 2

Lomasterrrr/Audio-Generator
Generates a sound given: volume, frequency, duration!
Language: C# - Size: 3 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 1

inferless/bark
Text-to-Speech model that generates realistic, multilingual speech with music, background noise, and sound effects.
Language: Python - Size: 40 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 4 - Forks: 11

Justmalhar/tts-studio
Text to Speech Studio to convert text into natural-sounding speech using advanced AI models from leading providers like Replicate, OpenAI, and ElevenLabs.
Language: TypeScript - Size: 1.31 MB - Last synced at: 18 days ago - Pushed at: 4 months ago - Stars: 4 - Forks: 2

leelabcnbc/predictive-coding-music-prediction
Code implementation for the paper "Relating Human Perception of Musicality to Prediction in a Predictive Coding Model"
Language: Python - Size: 39.1 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

SD-inst/cozyui
A frontend for ComfyUI to generate AI videos comfortably
Language: TypeScript - Size: 1.11 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 3 - Forks: 0

Agentfy-io/Agentfy_API
🤖 This is the API component of Agentify, a FastAPI-based service that provides access to specialized AI agents. Each agent is exposed through its own API endpoints, enabling modular and focused functionality.
Language: Python - Size: 50 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 3 - Forks: 0

Yazdi9/Text-To-Audio-ChatGPT
Text To Audio (Voice, Music) - Supports ChatGPT
Language: Python - Size: 3.54 MB - Last synced at: 9 months ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 1

carlosholivan/AudioGenerationDiffusion
State-of-the-art of Audio Generation with Diffusion Models
Size: 179 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

carlosholivan/audiolm-google-torch
Implementation of the AudioLM model by Google in Pytorch
Size: 420 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

Sreyan88/Synthio
Code for ICLR 2025 Paper: Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
Language: Python - Size: 2.29 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

SocAIty/socaity
SDK for generative AI.
Language: Python - Size: 24.2 MB - Last synced at: 26 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

inferless/parler-tts-streaming
Text-to-speech with Server-Sent Events (SSE) that streams real-time audio for chat-based applications.
Language: Python - Size: 9.77 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 7

inferless/pyannote-speaker-diarization-3.1
A state-of-the-art model that segments and labels audio recordings by accurately distinguishing different speakers.
Language: Python - Size: 23.4 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 2

vaishanth-rmrj/open-podcraft
OpenPodcraft is an open-source project that enables users to create podcasts from their textual content. With OpenPodcraft, you can either clone your own voice or use voices from different individuals to generate professional-sounding podcasts.
Language: Python - Size: 59.3 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

radoslawregula/VoxG
Singing voice synthesizer using GANs
Language: Python - Size: 145 KB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 1

0x7o/DeepMozart
Audio generation using diffusion models
Language: Python - Size: 6.84 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 2

Gmzxdotzz/Dia-TTS-Server
Self-host the powerful Dia TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), support for SafeTensors/BF16, voice cloning, dialogue generation, and GPU/CPU execution.
Language: Python - Size: 572 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1 - Forks: 0

bocaletto-luca/CW-Generator
The CW (Morse) Generator is a versatile and powerful application for Morse communication enthusiasts and anyone interested in learning or practicing this classic language of communication. This software allows you to convert text to Morse code and vice versa, providing a complete suite of tools to create, interpret, and reproduce ...
Language: Python - Size: 21.5 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1 - Forks: 0

inferless/musicgen-stereo-melody-large
A 3.3B parameter text-to-music model by Meta AI, fine-tuned for stereo audio generation.
Language: Python - Size: 29.3 KB - Last synced at: 22 days ago - Pushed at: 23 days ago - Stars: 1 - Forks: 1

lucadellalib/bigvgan
A single-file implementation of BigVGAN generator
Language: Python - Size: 395 KB - Last synced at: 7 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

SimonBernarding/OhanashiGPT-children-story-generation
OhanashiGPT is an application that generates personalized children's stories based on parameters like age and preferences. It narrates these stories using an AI-generated voice that mimics a parent, trained on their audio samples. The app also creates illustrations to accompany each story, providing a unique and engaging experience for children.
Language: Jupyter Notebook - Size: 121 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

ashleykleynhans/stable-audio-tools-docker
Docker image for stable-audio-tools: Generative models for conditional audio generation
Language: Shell - Size: 41 KB - Last synced at: about 4 hours ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

sesac-google-ai-1st/video_factory
An AI-based all-in-one video production service covering everything from scripts to dubbing and image generation
Language: Python - Size: 62.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

MartianInGreen/BeeBrain
BeeBrain is your personal chatbot. Use tools, generate images, run code and so much more!
Language: Python - Size: 1.37 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

sean-e/OptForAudio
Utility to temporarily change Windows system settings for improved real time audio performance
Language: C++ - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Yuan-ManX/ai-audio-processing-methods
AI audio processing methods
Language: Python - Size: 16.6 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

mx-mark/SPMNet
Source code for "Visually aligned sound generation via sound-producing motion parsing" (Published at Neurocomputing)
Size: 4.88 KB - Last synced at: 12 months ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

inferless/hifi-gan-template
A TTS Vocoder capable of generating high fidelity speech efficiently.
Language: Python - Size: 5.98 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

inferless/melo-tts
A high-quality text-to-speech model by MyShell.ai that supports multiple English accents and real-time inference.
Language: Python - Size: 33.2 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 1

mahshid1378/tts-generation-webui
TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5, F5-TTS, ParlerTTS)
Language: TypeScript - Size: 3.96 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

TejasMate/SmartHomeAssistDevice
A modern voice-controlled home assistant device that can play music, generate images, create audio, and more. Supports both cloud-based and local language models for enhanced flexibility and privacy.
Language: Python - Size: 27.3 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

aidayang/InspireMusic-OneClick
InspireMusic text-to-music software: a no-install, one-click-launch bundle
Size: 29.3 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

filipeusz123/aud
The Aud project is a simple but powerful text-to-speech software that allows users to convert written text into spoken words. It uses advanced algorithms to provide natural-sounding speech output, making it ideal for various applications such as accessibility tools, language learning, and audiobook creation.
Size: 1000 Bytes - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

inferless/bark-streaming
Bark text-to-speech with Server-Sent Events (SSE) streams real-time audio, providing interactive and efficient updates for chat-based applications.
Language: Python - Size: 45.9 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 2

ewdlop/AI-Tools
Various AI Online Tools. AI is taking over https://www.fastcompany.com/; very sus
Size: 68.4 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

blaisewf/HiFi-SAN (fork of jik876/hifi-gan)
HiFi-SAN: Slicing Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Language: Python - Size: 619 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

m-faizan-mahmood/infinitnews.ai
This AI Automation Pipeline is a powerful, end-to-end solution designed to automate content creation across multiple formats, including text, images, and audio. This pipeline seamlessly integrates advanced AI technologies for text generation, image synthesis, and audio conversion, enabling efficient and high-quality content production.
Language: Jupyter Notebook - Size: 9.62 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Work-Nobu/OhanashiGPT
OhanashiGPT is an application that generates personalized children's stories based on parameters like age and preferences. It narrates these stories using an AI-generated voice that mimics a parent, trained on their audio samples. The app also creates illustrations to accompany each story, providing a unique and engaging experience for children.
Language: Jupyter Notebook - Size: 108 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

KoppAlexander/Ohanashi-ChildGPT
OhanashiGPT is an application that generates personalized children's stories based on parameters like age and preferences. It narrates these stories using an AI-generated voice that mimics a parent, trained on their audio samples. The app also creates illustrations to accompany each story, providing a unique and engaging experience for children.
Language: Jupyter Notebook - Size: 108 MB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 1

swiss-ai-center/hugging-face-text-to-audio-service
The service is used to query text-to-audio AI models from the Hugging Face inference API.
Language: Python - Size: 538 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

langchain-tech/Musicgen-Text-to-Music
Turn your words into music! Describe a sound (e.g., happy, spooky) and this app generates a short piece based on your text.
Language: Python - Size: 5.86 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

mesudepolat/generative-ai
Various projects utilizing diverse generative AI techniques to produce audio, code, images, text, and Streamlit applications.
Language: Python - Size: 98.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

farahhuifanyang/Prelude-to-Listening
HKNME Concert (2023) Sanxian/sanshin, Ambisonics field recording, Generative AI
Size: 68.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
