An open API service providing repository metadata for many open source software ecosystems.

Topic: "text-to-image"

lucidrains/DALLE2-pytorch

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch

Language: Python - Size: 3.76 MB - Last synced at: 19 days ago - Pushed at: about 1 year ago - Stars: 11,270 - Forks: 1,092

lucidrains/imagen-pytorch

Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch

Language: Python - Size: 1.07 MB - Last synced at: 18 days ago - Pushed at: 8 months ago - Stars: 8,264 - Forks: 783

XavierXiao/Dreambooth-Stable-Diffusion

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion

Language: Jupyter Notebook - Size: 5.71 MB - Last synced at: 10 days ago - Pushed at: over 2 years ago - Stars: 7,725 - Forks: 801

jamez-bondos/awesome-gpt4o-images

Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capabilities.

Language: JavaScript - Size: 140 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 6,083 - Forks: 544

lucidrains/DALLE-pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

Language: Python - Size: 13.5 MB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 5,616 - Forks: 640

promptslab/Awesome-Prompt-Engineering

This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.

Language: Python - Size: 187 KB - Last synced at: 10 days ago - Pushed at: 11 months ago - Stars: 4,518 - Forks: 425

lucidrains/deep-daze

Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun

Language: Python - Size: 6.68 MB - Last synced at: 17 days ago - Pushed at: about 3 years ago - Stars: 4,361 - Forks: 318

kuprel/min-dalle

min(DALL·E) is a fast, minimal port of DALL·E Mini to PyTorch

Language: Python - Size: 46.5 MB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 3,487 - Forks: 252

YangLing0818/Diffusion-Models-Papers-Survey-Taxonomy

Diffusion model papers, survey, and taxonomy

Size: 272 KB - Last synced at: 17 days ago - Pushed at: 3 months ago - Stars: 3,183 - Forks: 265

filipecalegario/awesome-generative-ai

A curated list of Generative AI tools, works, models, and references

Size: 1.16 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2,855 - Forks: 493

ai-forever/Kandinsky-2

Kandinsky 2 — multilingual text2image latent diffusion model

Language: Jupyter Notebook - Size: 37.3 MB - Last synced at: 14 days ago - Pushed at: about 1 year ago - Stars: 2,798 - Forks: 310

saharmor/dalle-playground

A playground to generate images from any text prompt using Stable Diffusion (past: using DALL-E Mini)

Language: JavaScript - Size: 3.01 MB - Last synced at: about 9 hours ago - Pushed at: about 1 year ago - Stars: 2,761 - Forks: 596

nerdyrodent/VQGAN-CLIP

Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Language: Python - Size: 31.7 MB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 2,650 - Forks: 432

lucidrains/big-sleep

A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN. Technique was originally created by https://twitter.com/advadnoun

Language: Python - Size: 6.89 MB - Last synced at: 17 days ago - Pushed at: over 3 years ago - Stars: 2,571 - Forks: 306

FurkanGozukara/Stable-Diffusion

FLUX, Stable Diffusion, SDXL, SD3, LoRA, Fine Tuning, DreamBooth, Training, Automatic1111, Forge WebUI, SwarmUI, DeepFake, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, ComfyUI, Google Colab, RunPod, Kaggle, NoteBooks, ControlNet, Voice Cloning, AI, AI News, ML, ML News, News, Tech, Tech News, Kohya, Midjourney

Language: Jupyter Notebook - Size: 3.33 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2,436 - Forks: 332

Yutong-Zhou-cv/Awesome-Text-to-Image

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

Size: 69.2 MB - Last synced at: 24 days ago - Pushed at: about 1 month ago - Stars: 2,339 - Forks: 200

SamurAIGPT/AI-Youtube-Shorts-Generator

A Python tool that uses GPT-4, FFmpeg, and OpenCV to automatically analyze videos, extract the most interesting sections, and crop them for an improved viewing experience.

Language: Python - Size: 99 MB - Last synced at: 17 days ago - Pushed at: 4 months ago - Stars: 2,219 - Forks: 314

Lightricks/ComfyUI-LTXVideo

LTX-Video Support for ComfyUI

Language: Python - Size: 4.56 MB - Last synced at: 1 day ago - Pushed at: 23 days ago - Stars: 2,027 - Forks: 178

carefree0910/carefree-creator

AI magics meet Infinite draw board.

Language: Jupyter Notebook - Size: 8.06 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 1,945 - Forks: 181

bytedance/InfiniteYou

🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity

Language: Python - Size: 13.5 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1,877 - Forks: 133

YangLing0818/RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Language: Jupyter Notebook - Size: 64.2 MB - Last synced at: 17 days ago - Pushed at: 4 months ago - Stars: 1,802 - Forks: 101

THUDM/CogView

Text-to-Image generation. The repo for NeurIPS 2021 paper "CogView: Mastering Text-to-Image Generation via Transformers".

Language: Python - Size: 12.4 MB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 1,786 - Forks: 179

omerbt/TokenFlow

Official Pytorch Implementation for "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" presenting "TokenFlow" (ICLR 2024)

Language: Python - Size: 27.4 MB - Last synced at: 10 days ago - Pushed at: 4 months ago - Stars: 1,657 - Forks: 139

ai-forever/ru-dalle

Generate images from texts. In Russian

Language: Jupyter Notebook - Size: 26.9 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 1,646 - Forks: 245

TencentARC/BrushNet

[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"

Language: Python - Size: 37 MB - Last synced at: 9 days ago - Pushed at: 6 months ago - Stars: 1,603 - Forks: 134

fofr/cog-face-to-many

Turn any face into a video game character, pixel art, claymation, 3D or toy

Language: Python - Size: 32.2 KB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 1,338 - Forks: 204

FoundationVision/Infinity

[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Language: Python - Size: 10.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,210 - Forks: 55

Capsize-Games/airunner

Offline inference engine for art, real-time voice conversations, LLM powered chatbots and automated workflows

Language: Python - Size: 29.3 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,180 - Forks: 92

lukasHoel/text2room

Text2Room generates textured 3D meshes from a given text prompt using 2D text-to-image models (ICCV2023).

Language: Python - Size: 8.85 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 1,051 - Forks: 73

THUDM/CogView4

CogView4, CogView3-Plus, and CogView3 (ECCV 2024)

Language: Python - Size: 24.1 MB - Last synced at: 9 days ago - Pushed at: 2 months ago - Stars: 1,043 - Forks: 76

PRIV-Creation/Awesome-Controllable-T2I-Diffusion-Models

A collection of resources on controllable generation with text-to-image diffusion models.

Size: 3.04 MB - Last synced at: 12 days ago - Pushed at: 5 months ago - Stars: 1,042 - Forks: 28

omerbt/MultiDiffusion

Official Pytorch Implementation for "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" presenting "MultiDiffusion" (ICML 2023)

Language: Jupyter Notebook - Size: 6.93 MB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 1,032 - Forks: 60

ddPn08/Radiata

Stable diffusion webui based on diffusers.

Language: Python - Size: 15.6 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 981 - Forks: 68

THUDM/CogView2

official code repo for paper "CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers"

Language: Python - Size: 148 KB - Last synced at: 14 days ago - Pushed at: almost 3 years ago - Stars: 951 - Forks: 77

lucidrains/muse-maskgit-pytorch

Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in Pytorch

Language: Python - Size: 285 KB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 894 - Forks: 83

finegrain-ai/refiners

A microframework on top of PyTorch with first-class citizen APIs for foundation model adaptation

Language: Python - Size: 125 MB - Last synced at: 22 days ago - Pushed at: 23 days ago - Stars: 826 - Forks: 62

Shilin-LU/TF-ICON

[ICCV 2023] "TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition" (Official Implementation)

Language: Python - Size: 75.1 MB - Last synced at: 27 days ago - Pushed at: 3 months ago - Stars: 819 - Forks: 102

haofanwang/Lora-for-Diffusers

The most easy-to-understand tutorial for using LoRA (Low-Rank Adaptation) within diffusers framework for AI Generation Researchers🔥

Language: Python - Size: 97.7 KB - Last synced at: 14 days ago - Pushed at: about 1 year ago - Stars: 805 - Forks: 53

eps696/aphantasia

CLIP + FFT/DWT/RGB = text to image/video

Language: Python - Size: 35.2 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 786 - Forks: 103

mfrashad/text2art

AI-powered Text-to-Art Generator - Text2Art.com

Language: Jupyter Notebook - Size: 31.5 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 785 - Forks: 206

fboulnois/stable-diffusion-docker

Run the official Stable Diffusion releases in a Docker container with txt2img, img2img, depth2img, pix2pix, upscale4x, and inpaint.

Language: Python - Size: 666 KB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 746 - Forks: 132

yuval-alaluf/Attend-and-Excite

Official Implementation for "Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models" (SIGGRAPH 2023)

Language: Jupyter Notebook - Size: 103 MB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 736 - Forks: 61

vicgalle/stable-diffusion-aesthetic-gradients

Personalization for Stable Diffusion via Aesthetic Gradients 🎨

Language: Jupyter Notebook - Size: 92.5 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 730 - Forks: 62

jianzhnie/awesome-text-to-video

A Survey on Text-to-Video Generation/Synthesis.

Size: 41 KB - Last synced at: 28 days ago - Pushed at: 11 months ago - Stars: 713 - Forks: 91

PaddlePaddle/PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.

Language: Python - Size: 179 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 648 - Forks: 216

fofr/cog-face-to-sticker

face-to-sticker

Language: Python - Size: 72.3 KB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 641 - Forks: 64

SkyWorkAIGC/SkyPaint-AI-Diffusion

An optimized text-to-image model based on Stable Diffusion. It accepts both Chinese and English text inputs and can generate high-quality images in several modern art styles.

Size: 7.74 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 630 - Forks: 37

ChaofanTao/Autoregressive-Models-in-Vision-Survey

[TMLR 2025🔥] A survey for the autoregressive models in vision.

Size: 7.68 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 627 - Forks: 19

ChenWu98/cycle-diffusion

[ICCV 2023] A latent space for stochastic diffusion models

Language: Python - Size: 52.5 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 617 - Forks: 36

zsdonghao/text-to-image

Generative Adversarial Text to Image Synthesis

Language: Python - Size: 755 KB - Last synced at: 13 days ago - Pushed at: over 4 years ago - Stars: 602 - Forks: 161

limuloo/MIGC

[CVPR 2024 Highlight] MIGC and [TPAMI 2024] MIGC++ (Official Implementation)

Language: Python - Size: 33.4 MB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 595 - Forks: 29

AlonzoLeeeooo/awesome-text-to-image-studies

A collection of awesome text-to-image generation studies.

Language: TeX - Size: 2.09 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 595 - Forks: 32

omriav/blended-latent-diffusion

Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]

Language: Jupyter Notebook - Size: 9.84 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 594 - Forks: 37

omriav/blended-diffusion

Official implementation for "Blended Diffusion for Text-driven Editing of Natural Images" [CVPR 2022]

Language: Jupyter Notebook - Size: 42.4 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 574 - Forks: 43

gojasper/flash-diffusion

⚡ Flash Diffusion ⚡: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation (AAAI 2025 Oral)

Language: Python - Size: 50.4 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 566 - Forks: 40

ironjr/semantic-draw

Official code for the CVPR 2025 paper "SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models."

Language: Jupyter Notebook - Size: 334 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 562 - Forks: 49

akanimax/T2F

T2F: text to face generation using Deep Learning

Language: Python - Size: 498 MB - Last synced at: 7 months ago - Pushed at: about 3 years ago - Stars: 548 - Forks: 100

lucidrains/parti-pytorch

Implementation of Parti, Google's pure attention-based text-to-image neural network, in Pytorch

Language: Python - Size: 339 KB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 532 - Forks: 24

SamurAIGPT/Text-To-Video-AI

Generate video from text using AI

Language: Jupyter Notebook - Size: 15.8 MB - Last synced at: 16 days ago - Pushed at: 4 months ago - Stars: 529 - Forks: 190

jaketae/storyteller

Multimodal AI Story Teller, built with Stable Diffusion, GPT, and neural text-to-speech

Language: Python - Size: 4.59 MB - Last synced at: 13 days ago - Pushed at: almost 2 years ago - Stars: 523 - Forks: 65

google/break-a-scene

Official implementation for "Break-A-Scene: Extracting Multiple Concepts from a Single Image" [SIGGRAPH Asia 2023]

Language: Python - Size: 13.1 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 520 - Forks: 25

AlaaLab/InstructCV

[ ICLR 2024 ] Official Codebase for "InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists"

Language: Python - Size: 76.1 MB - Last synced at: 10 months ago - Pushed at: about 1 year ago - Stars: 516 - Forks: 46

YingqingHe/Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

Language: HTML - Size: 12.7 MB - Last synced at: 28 days ago - Pushed at: 2 months ago - Stars: 472 - Forks: 26

atfortes/Awesome-Controllable-Diffusion

Papers and resources on Controllable Generation using Diffusion Models, including ControlNet, DreamBooth, IP-Adapter.

Size: 37.6 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 467 - Forks: 28

TonyLianLong/LLM-groundedDiffusion

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LLM-grounded Diffusion: LMD, TMLR 2024)

Language: Python - Size: 268 KB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 465 - Forks: 33

afiaka87/clip-guided-diffusion

A CLI tool/python module for generating images from text using guided diffusion and CLIP from OpenAI.

Language: Python - Size: 51.2 MB - Last synced at: 27 days ago - Pushed at: over 3 years ago - Stars: 462 - Forks: 60

EleutherAI/DALLE-mtf

OpenAI's DALL-E for large scale training in mesh-tensorflow.

Language: Python - Size: 272 KB - Last synced at: 12 days ago - Pushed at: over 3 years ago - Stars: 433 - Forks: 46

aelnouby/Text-to-Image-Synthesis

Pytorch implementation of Generative Adversarial Text-to-Image Synthesis paper

Language: Python - Size: 454 KB - Last synced at: 19 days ago - Pushed at: almost 5 years ago - Stars: 410 - Forks: 89

Auto1111SDK/Auto1111SDK

An SDK/Python library for Automatic 1111 to run state-of-the-art diffusion models

Language: Python - Size: 10.6 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 403 - Forks: 28

Shilin-LU/MACE

[CVPR 2024] "MACE: Mass Concept Erasure in Diffusion Models" (Official Implementation)

Language: Jupyter Notebook - Size: 28.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 395 - Forks: 32

open-mmlab/StyleShot

StyleShot: A SnapShot on Any Style. A model that can transfer any style to any content, generating high-quality personalized stylized images without per-image fine-tuning.

Language: Python - Size: 97.1 MB - Last synced at: 15 days ago - Pushed at: 4 months ago - Stars: 389 - Forks: 32

nerdyrodent/CLIP-Guided-Diffusion

Just playing with getting CLIP Guided Diffusion running locally, rather than having to use colab.

Language: Python - Size: 2.09 MB - Last synced at: 12 days ago - Pushed at: almost 3 years ago - Stars: 387 - Forks: 49

garibida/cross-image-attention

Official Implementation for "Cross-Image Attention for Zero-Shot Appearance Transfer"

Language: Python - Size: 64.6 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 359 - Forks: 27

OSU-NLP-Group/MagicBrush

[NeurIPS'23] "MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing".

Language: Python - Size: 112 MB - Last synced at: 15 days ago - Pushed at: 4 months ago - Stars: 356 - Forks: 14

jonathandinu/ai4artists

A list of AI Art courses, tools, libraries, people, and places.

Size: 661 KB - Last synced at: 28 days ago - Pushed at: 12 months ago - Stars: 355 - Forks: 25

sayakpaul/diffusers-torchao

End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 training).

Language: Python - Size: 187 KB - Last synced at: 3 days ago - Pushed at: 8 days ago - Stars: 354 - Forks: 12

MiniMax-AI/MiniMax-MCP

Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video generation APIs.

Language: Python - Size: 113 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 354 - Forks: 30

FoundationVision/Liquid

Liquid: Language Models are Scalable and Unified Multi-modal Generators

Language: Python - Size: 31.6 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 353 - Forks: 24

snap-research/stable-flow

Official implementation for "Stable Flow: Vital Layers for Training-Free Image Editing" [CVPR 2025]

Language: Python - Size: 2.9 MB - Last synced at: 23 days ago - Pushed at: 4 months ago - Stars: 348 - Forks: 22

awekrx/ChatGPT-MidJourney-prompt

This is a ChatGPT-based prompt generation model for Midjourney. The purpose of this model is to simplify the creation of images and increase their creativity. Given a partial hint, ChatGPT creates a follow-up that can be used to stimulate creativity and provide new ideas.

Language: Python - Size: 11.7 MB - Last synced at: 24 days ago - Pushed at: about 2 years ago - Stars: 337 - Forks: 51

jabir-zheng/TCD

Official Repository of the paper "Trajectory Consistency Distillation"

Language: Python - Size: 100 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 327 - Forks: 13

woctezuma/stable-diffusion-colab

Colab notebook for Stable Diffusion Hyper-SDXL.

Language: Jupyter Notebook - Size: 52.7 KB - Last synced at: 13 days ago - Pushed at: about 2 months ago - Stars: 325 - Forks: 81

mkshing/e4t-diffusion

Implementation of Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models

Language: Python - Size: 4.04 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 324 - Forks: 24

tobran/DF-GAN

[CVPR2022 oral] A Simple and Effective Baseline for Text-to-Image Synthesis

Language: Python - Size: 3.26 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 314 - Forks: 69

MohamadZeina/Disco_Diffusion_Local

Getting the latest versions of Disco Diffusion to work locally, instead of colab. Including how I run this on Windows, despite some Linux only dependencies ;)

Language: Jupyter Notebook - Size: 2.16 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 312 - Forks: 36

SamurAIGPT/AI-Faceless-Video-Generator

Generate a video script, voice and a talking face completely with AI

Language: Jupyter Notebook - Size: 16.6 MB - Last synced at: 14 days ago - Pushed at: 4 months ago - Stars: 305 - Forks: 49

AssemblyAI-Community/MinImagen

MinImagen: A minimal implementation of the Imagen text-to-image model

Language: Python - Size: 6.53 MB - Last synced at: 18 days ago - Pushed at: about 2 years ago - Stars: 304 - Forks: 57

FoundationVision/UniTok

A Unified Tokenizer for Visual Generation and Understanding

Language: Python - Size: 30.7 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 294 - Forks: 5

viiika/Meissonic

[ICLR 2025] Official Implementation of Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

Language: Python - Size: 148 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 289 - Forks: 10

wooyeolbaek/attention-map-diffusers

🚀 Cross attention map tools for huggingface/diffusers

Language: Python - Size: 7.89 MB - Last synced at: 15 days ago - Pushed at: 5 months ago - Stars: 286 - Forks: 21

rinnakk/japanese-stable-diffusion

Japanese Stable Diffusion is a Japanese-specific latent text-to-image diffusion model capable of generating photo-realistic images given any text input.

Language: Jupyter Notebook - Size: 3.24 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 274 - Forks: 15

byliutao/1Prompt1Story

🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt

Language: Python - Size: 29.9 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 265 - Forks: 32

HolmesShuan/FireFlow-Fast-Inversion-of-Rectified-Flow-for-Image-Semantic-Editing

[ICML2025] An 8-step inversion and 8-step editing process works effectively with the FLUX-dev model. (3x speedup with results that are comparable or even superior to baseline methods)

Language: Python - Size: 27.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 262 - Forks: 15

Karine-Huang/T2I-CompBench

[NeurIPS 2023 & TPAMI] T2I-CompBench (++) for Compositional Text-to-image Generation Evaluation

Language: Python - Size: 77.4 MB - Last synced at: 27 days ago - Pushed at: about 2 months ago - Stars: 255 - Forks: 10

kfirgoldberg/ConceptLab

Official Implementation for "ConceptLab: Creative Generation using Diffusion Prior Constraints"

Language: Python - Size: 137 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 250 - Forks: 18

ashbuilds/payload-ai

AI Plugin is a powerful extension for the Payload CMS, integrating advanced AI capabilities to enhance content creation and management.

Language: TypeScript - Size: 82.3 MB - Last synced at: about 6 hours ago - Pushed at: 9 days ago - Stars: 238 - Forks: 37

tobran/GALIP

[CVPR2023] A faster, smaller, and better text-to-image model for large-scale training

Language: Python - Size: 1.17 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 238 - Forks: 30

lucidrains/perfusion-pytorch

Implementation of Key-Locked Rank One Editing, from Nvidia AI

Language: Python - Size: 3.14 MB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 235 - Forks: 7

KwokKwok/Silo

Multi-model simultaneous chat and text-to-image generation, all done in a pure front end (API mode, no server side needed).

Language: JavaScript - Size: 2.31 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 231 - Forks: 25

lzyhha/VisualCloze

VisualCloze: A universal image generation framework that can support a wide range of in-domain tasks and generalize to unseen ones. (🔥 🔥 🔥 Merged into official pipelines of diffusers.)

Language: Python - Size: 129 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 230 - Forks: 11
