An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: multimodal-generation

wzk1015/Awesome-Vision-to-Music-Generation

[ISMIR 2025] A curated list of vision-to-music generation: methods, datasets, evaluation and challenges.

Size: 1.25 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 67 - Forks: 2

YingqingHe/Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).

Language: HTML - Size: 12.7 MB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 479 - Forks: 28

eric-ai-lab/MiniGPT-5

Official implementation of the paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"

Language: Python - Size: 61.9 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 863 - Forks: 51

PanguIR/MRAGSurvey

A Survey of Multimodal Retrieval-Augmented Generation

Size: 4.92 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 11 - Forks: 1

YangLing0818/ContextDiff

[ICLR 2024] Contextualized Diffusion Models for Text-Guided Image and Video Generation

Language: Python - Size: 97 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 66 - Forks: 4

Nithin-GK/UniteandConquer

[CVPR '23] Unite and Conquer: Plug & Play Multi-Modal Synthesis using Diffusion Models

Language: Python - Size: 6.55 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 28 - Forks: 3

chuhaojin/Text2Poster-ICASSP-22

Official implementation of the ICASSP-2022 paper "Text2Poster: Laying Out Stylized Texts on Retrieved Images"

Language: Python - Size: 50.1 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 171 - Forks: 12