# Awesome-Conditional-Diffusion-Models

This repository maintains a collection of important papers on conditional image synthesis with diffusion models (survey paper published in TMLR 2025).
https://github.com/zju-pi/Awesome-Conditional-Diffusion-Models
## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=zju-pi/Awesome-Conditional-Diffusion-Models&type=Date)](https://star-history.com/#zju-pi/Awesome-Conditional-Diffusion-Models&Date)
## Condition Integration in the Sampling Process

### Attention Manipulation
- **Dragdiffusion: Harnessing diffusion models for interactive point-based image editing**
- **Stylediffusion: Controllable disentangled style transfer via diffusion models**
- **Custom-edit: Text-guided image editing with customized diffusion models**
- **Tf-icon: Diffusion-based training-free cross-domain image composition**
- **eDiff-I: Text-to-image diffusion models with an ensemble of expert denoisers**
- **Cones 2: Customizable image synthesis with multiple subjects**
- **Prompt-to-prompt image editing with cross attention control**
- **Plug-and-play diffusion features for text-driven image-to-image translation**
- **Masactrl: Tuning-free mutual self-attention control for consistent image synthesis and editing**
- **Face aging via diffusion-based editing**
- **Dynamic prompt learning: Addressing cross-attention leakage for text-based image editing**
- **Focus on your instruction: Fine-grained and multi-instruction image editing by attention modulation**
- **Towards understanding cross and self-attention in stable diffusion for text-guided image editing**
- **Taming Rectified Flow for Inversion and Editing**
- **Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models**
- **Style injection in diffusion: A training-free approach for adapting large-scale diffusion models for style transfer**
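
The papers above steer generation by storing, editing, or re-injecting attention maps inside the denoising network during sampling. Below is a minimal sketch of the shared mechanism, assuming a toy PyTorch cross-attention layer; all class and variable names are illustrative and not taken from any of the listed works.

```python
# Prompt-to-Prompt-style attention injection (toy sketch, not a real codebase).
import torch
import torch.nn.functional as F

class EditableCrossAttention(torch.nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.to_q = torch.nn.Linear(dim, dim, bias=False)
        self.to_k = torch.nn.Linear(dim, dim, bias=False)
        self.to_v = torch.nn.Linear(dim, dim, bias=False)
        self.stored_attn = None  # attention maps recorded from the source pass
        self.inject = False      # when True, reuse stored maps in the edit pass

    def forward(self, x, context):
        q, k, v = self.to_q(x), self.to_k(context), self.to_v(context)
        attn = F.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        if self.inject and self.stored_attn is not None:
            attn = self.stored_attn  # override with the source attention maps
        else:
            self.stored_attn = attn.detach()
        return attn @ v

layer = EditableCrossAttention(dim=64)
x = torch.randn(1, 16, 64)        # image tokens
src_ctx = torch.randn(1, 8, 64)   # source-prompt embeddings
_ = layer(x, src_ctx)             # source pass: record attention
layer.inject = True
edit_ctx = torch.randn(1, 8, 64)  # edited-prompt embeddings
out = layer(x, edit_ctx)          # edit pass: reuse source attention, keep layout
```
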
### Noise Blending
- **Effective real image editing with accelerated iterative diffusion inversion**
- **Sine: Single image editing with text-to-image diffusion models**
- **Compositional visual generation with composable diffusion models**
- **Classifier-free diffusion guidance**
- **Multidiffusion: Fusing diffusion paths for controlled image generation**
- **Pair-diffusion: Object-level image editing with structure-and-appearance paired diffusion models**
- **Magicfusion: Boosting text-to-image generation performance by fusing diffusion models**
- **Noisecollage: A layout-aware text-to-image diffusion model based on noise cropping and merging**
- **Ledits++: Limitless image editing using text-to-image models**
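
These methods blend several noise predictions into a single one before each denoising update. A minimal sketch of the best-known instance, classifier-free guidance, follows; `denoiser` is a toy stand-in for a trained noise-prediction network, and the guidance scale is the usual user-chosen weight.

```python
# Classifier-free guidance as noise blending (toy sketch).
import torch

def denoiser(x, t, cond):
    # stand-in for a trained epsilon-prediction UNet
    return 0.1 * x + (0.0 if cond is None else cond.mean())

def cfg_noise(x, t, cond, guidance_scale=7.5):
    eps_uncond = denoiser(x, t, None)  # prediction with the condition dropped
    eps_cond = denoiser(x, t, cond)    # prediction with the condition
    # blend: extrapolate from the unconditional toward the conditional direction
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

x = torch.randn(1, 4, 8, 8)    # noisy latent
cond = torch.randn(1, 77, 16)  # e.g. text embeddings
eps = cfg_noise(x, t=500, cond=cond)
```

Compositional variants (e.g. composable diffusion, MultiDiffusion) extend the same blend to sums over several conditions or spatial regions.
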
### Guidance
- **Dragondiffusion: Enabling drag-style manipulation on diffusion models**
- **Freedom: Training-free energy-guided conditional diffusion model**
- **Training-free layout control with cross-attention guidance**
- **Generative diffusion prior for unified image restoration and enhancement**
- **Regeneration learning of diffusion models with rich prompts for zero-shot image translation**
- **Diffusion self-guidance for controllable image generation**
- **Energy-based cross attention for bayesian context update in text-to-image diffusion models**
- **Solving linear inverse problems provably via posterior sampling with latent diffusion models**
- **Readout guidance: Learning control from diffusion features**
- **Freecontrol: Training-free spatial control of any text-to-image diffusion model with any condition**
- **Diffeditor: Boosting accuracy and flexibility on diffusion-based image editing**
- **Diffusion posterior sampling for general noisy inverse problems**
- **Diffusion-based image translation using disentangled style and content representation**
- **Sketch-guided text-to-image diffusion models**
- **High-fidelity guided image synthesis with latent diffusion models**
- **Parallel diffusion models of operator and image for blind inverse problems**
- **Zero-shot image-to-image translation**
- **Universal guidance for diffusion models**
- **Pseudoinverse-guided diffusion models for inverse problems**
- **Diffusion models beat gans on image synthesis** (2021.5, NeurIPS 2021)
- **Blended diffusion for text-driven editing of natural images**
- **More control for free! image synthesis with semantic diffusion guidance** (2021.12, WACV 2023)
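
These methods steer a frozen sampler by adding the gradient of an external energy, e.g. a classifier loss, a CLIP similarity, or a measurement-consistency term as in diffusion posterior sampling, to each step. A minimal sketch under toy assumptions (`denoiser` and `energy` are illustrative stand-ins, and the noise schedule is omitted):

```python
# Training-free energy guidance during sampling (toy sketch).
import torch

def denoiser(x, t):
    return 0.1 * x  # stand-in for a trained epsilon-prediction network

def energy(x0_hat, target):
    # e.g. ||y - A(x0_hat)||^2 for an inverse problem, or a classifier loss
    return ((x0_hat - target) ** 2).sum()

def guided_step(x, t, target, step_size=0.1, guidance_weight=0.5):
    x = x.detach().requires_grad_(True)
    eps = denoiser(x, t)
    x0_hat = x - eps  # crude clean-image estimate (schedule omitted for brevity)
    grad = torch.autograd.grad(energy(x0_hat, target), x)[0]
    return (x - step_size * eps - guidance_weight * grad).detach()

x = torch.randn(1, 3, 8, 8)
target = torch.zeros(1, 3, 8, 8)
for t in reversed(range(5)):
    x = guided_step(x, t, target)  # each step is nudged toward low energy
```
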
### Conditional Correction
- **Improving diffusion models for inverse problems using manifold constraints**
- **Score-based generative modeling through stochastic differential equations**
- **ILVR: conditioning method for denoising diffusion probabilistic models**
- **Come-closer-diffuse-faster: Accelerating conditional diffusion models for inverse problems through stochastic contraction**
- **Repaint: Inpainting using denoising diffusion probabilistic models**
- **Diffedit: Diffusion-based semantic image editing with mask guidance**
- **Region-aware diffusion for zero-shot text-driven image editing**
- **Localizing object-level shape variations with text-to-image diffusion models**
- **Instructedit: Improving automatic masks for diffusion-based image editing with user instructions**
- **Text-driven image editing via learnable regions**
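
Conditional correction enforces the condition directly on the intermediate samples: after each denoising step, the known region (or measurement subspace) is overwritten with a suitably noised copy of the reference, so only the unknown part is synthesized. A minimal RePaint-style inpainting sketch with a toy step and schedule:

```python
# Conditional correction for inpainting (RePaint-style toy sketch).
import torch

def denoise_step(x, t):
    return 0.95 * x + 0.05 * torch.randn_like(x)  # toy reverse-diffusion step

def forward_noise(x0, t, T):
    alpha = 1.0 - t / T  # toy schedule: t=0 returns x0 exactly
    return alpha ** 0.5 * x0 + (1 - alpha) ** 0.5 * torch.randn_like(x0)

x0_known = torch.rand(1, 3, 8, 8)  # reference image
mask = torch.zeros(1, 1, 8, 8)
mask[..., 2:6, 2:6] = 1.0          # 1 = region to inpaint

T = 50
x = torch.randn_like(x0_known)
for t in reversed(range(T)):
    x = denoise_step(x, t)
    # correction: clamp the known region back to the (noised) reference
    x = mask * x + (1 - mask) * forward_noise(x0_known, t, T)
```
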
### Inversion
- **Sdedit: Guided image synthesis and editing with stochastic differential equations**
- **Dual diffusion implicit bridges for image-to-image translation**
- **Null-text inversion for editing real images using guided diffusion models**
- **Edict: Exact diffusion inversion via coupled transformations**
- **A latent space of stochastic diffusion models for zero-shot image editing and guidance**
- **Inversion-based style transfer with diffusion models**
- **An edit friendly ddpm noise space: Inversion and manipulations**
- **Prompt tuning inversion for text-driven image editing using diffusion models**
- **Negative-prompt inversion: Fast image inversion for editing with text-guided diffusion models**
- **Kv inversion: Kv embeddings learning for text-conditioned real image action editing**
- **Direct inversion: Boosting diffusion-based editing with 3 lines of code**
- **The blessing of randomness: Sde beats ode in general diffusion-based image editing**
- **Fixed-point inversion for text-to-image diffusion models**
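
Inversion maps a real image to an initial noise (or noise trajectory) that reconstructs it under the sampler, so edits can then be applied during re-generation. A minimal sketch of deterministic DDIM inversion, assuming a toy noise-prediction network and a toy cumulative schedule:

```python
# Deterministic DDIM inversion (toy sketch).
import torch

def eps_model(x, t):
    return 0.1 * x  # stand-in for a trained noise-prediction network

T = 50
alphas_bar = torch.linspace(0.9999, 0.02, T)  # toy cumulative alpha schedule

def ddim_invert(x0):
    x = x0
    for t in range(T - 1):  # walk from clean toward noisy
        a_t, a_next = alphas_bar[t], alphas_bar[t + 1]
        eps = eps_model(x, t)
        x0_pred = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()
        # the usual DDIM update, applied toward the higher-noise level
        x = a_next.sqrt() * x0_pred + (1 - a_next).sqrt() * eps
    return x  # approximate x_T; DDIM sampling from it reconstructs x0

x0 = torch.rand(1, 3, 8, 8)
xT = ddim_invert(x0)
```
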
### Revising Diffusion Process
- **Snips: Solving noisy inverse problems stochastically**
- **Denoising diffusion restoration models**
- **Driftrec: Adapting diffusion models to blind jpeg restoration**
- **Zero-shot image restoration using denoising diffusion null-space model**
- **Image restoration with mean-reverting stochastic differential equations**
- **Inversion by direct iteration: An alternative to denoising diffusion for image restoration**
- **Resshift: Efficient diffusion model for image super-resolution by residual shifting**
- **Sinsr: diffusion-based image super-resolution in a single step**
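
Rather than diffusing between image and pure noise, these works design a process tailored to restoration, e.g. one whose states drift from the clean image toward the degraded observation, so sampling can start from the observation itself. A toy sketch of such a revised forward process (the schedule and degradation operator are illustrative):

```python
# A restoration-oriented forward process (mean-reverting / ResShift spirit; toy).
import torch
import torch.nn.functional as F

def forward_state(x0, y, t, T=50, sigma=0.1):
    w = t / T                    # interpolation weight in [0, 1]
    mean = (1 - w) * x0 + w * y  # drift from clean x0 toward degraded y
    return mean + sigma * (w ** 0.5) * torch.randn_like(x0)

x0 = torch.rand(1, 3, 8, 8)      # clean image (training target)
# toy degradation: 2x downsample then nearest-neighbor upsample
y = F.avg_pool2d(x0, 2).repeat_interleave(2, -1).repeat_interleave(2, -2)
x_mid = forward_state(x0, y, t=25)  # supervision for the reverse model
```
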
## Condition Integration in Denoising Networks

### Condition Integration in the Specialization Stage
- **Encoder-based domain tuning for fast personalization of text-to-image models**
- **Mix-of-show: Decentralized low-rank adaptation for multi-concept customization of diffusion models**
- **Preditor: Text guided image editing with diffusion prior**
- **Forgedit: Text guided image editing via learning and forgetting**
- **Prompting hard or hardly prompting: Prompt inversion for text-to-image diffusion models**
- **Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation**
- **Unitune: Text-driven image editing by fine tuning a diffusion model on a single image**
- **Multi-concept customization of text-to-image diffusion**
- **iedit: Localised text-guided image editing with weak supervision**
- **Svdiff: Compact parameter space for diffusion fine-tuning**
- **Cones: concept neurons in diffusion models for customized generation**
- **Layerdiffusion: Layered controlled image editing with diffusion models**
- **An image is worth one word: Personalizing text-to-image generation using textual inversion**
- **Imagic: Text-based real image editing with diffusion models**
- **Uncovering the disentanglement capability in text-to-image diffusion models**
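
Specialization-stage methods adapt a pretrained model to a particular image or concept by optimizing a small set of parameters, from a single token embedding (textual inversion) up to the full backbone (DreamBooth, Imagic). A minimal textual-inversion-style sketch with a toy denoiser; only the new embedding is trained:

```python
# Textual-inversion-style specialization (toy sketch).
import torch

class ToyDenoiser(torch.nn.Module):
    def __init__(self, dim: int = 16):
        super().__init__()
        self.net = torch.nn.Linear(dim, dim)
    def forward(self, x, cond):
        return self.net(x + cond)

model = ToyDenoiser()
for p in model.parameters():
    p.requires_grad_(False)  # the pretrained backbone stays frozen

new_token = torch.nn.Parameter(torch.randn(16))  # the only trainable weights
opt = torch.optim.Adam([new_token], lr=1e-2)

user_images = torch.randn(8, 16)  # toy stand-in for the user's concept images
for step in range(100):
    noise = torch.randn_like(user_images)
    pred = model(user_images + noise, new_token)
    loss = ((pred - noise) ** 2).mean()  # standard denoising objective
    opt.zero_grad()
    loss.backward()
    opt.step()
```
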
### Condition Integration in the Training Stage
- **Humandiffusion: a coarse-to-fine alignment diffusion framework for controllable text-driven person image generation**
- **Diffusion-based scene graph to image generation with masked contrastive pre-training**
- **Dolce: A model-based probabilistic diffusion framework for limited-angle ct reconstruction**
- **Zero-shot medical image translation via frequency-guided diffusion models**
- **Denoising diffusion probabilistic models for robust image super-resolution in the wild**
- **Resdiff: Combining cnn and diffusion model for image super-resolution**
- **Learned representation-guided diffusion models for large-image generation**
- **Low-light image enhancement with wavelet-based diffusion models**
- **Wavelet-based fourier information interaction with frequency diffusion adjustment for underwater image restoration**
- **Diffusion-based blind text image super-resolution**
- **Low-light image enhancement via clip-fourier guided wavelet diffusion**
- **Diffusion autoencoders: Toward a meaningful and decodable representation**
- **Semantic image synthesis via diffusion models**
- **A novel unified conditional score-based generative framework for multi-modal medical image completion**
- **A morphology focused diffusion probabilistic model for synthesis of histopathology images**
- **Vector quantized diffusion model for text-to-image synthesis** (text-to-image, 2021.11, CVPR 2022)
- **High-resolution image synthesis with latent diffusion models** (text-to-image, 2021.12, CVPR 2022)
- **GLIDE: towards photorealistic image generation and editing with text-guided diffusion models** (text-to-image, 2021.12, ICML 2022)
- **Hierarchical text-conditional image generation with CLIP latents** (text-to-image, 2022.4, arXiv 2022)
- **Photorealistic text-to-image diffusion models with deep language understanding** (text-to-image, 2022.5, NeurIPS 2022)
- **PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis** (text-to-image, 2023.10, ICLR 2024)
- **Scaling Rectified Flow Transformers for High-Resolution Image Synthesis** (text-to-image, 2024.03, ICML 2024)
- **Srdiff: Single image super-resolution with diffusion probabilistic models**
- **Image super-resolution via iterative refinement**
- **Cascaded diffusion models for high fidelity image generation**
- **Palette: Image-to-image diffusion models**
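
Training-stage methods build the condition into the denoising network from scratch, most simply by concatenating the condition image to the noisy input along the channel axis, as in SR3/Palette-style models. A minimal sketch with toy shapes and a simplified objective (the noise schedule is omitted):

```python
# Channel-concatenation conditioning trained from scratch (toy sketch).
import torch

class CondDenoiser(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # 3 noisy-image channels + 3 condition channels in, 3 channels out
        self.net = torch.nn.Conv2d(6, 3, kernel_size=3, padding=1)
    def forward(self, x_noisy, cond_img):
        return self.net(torch.cat([x_noisy, cond_img], dim=1))

model = CondDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x0 = torch.rand(4, 3, 16, 16)    # targets, e.g. high-resolution images
cond = torch.rand(4, 3, 16, 16)  # conditions, e.g. upsampled low-res inputs
for step in range(10):
    noise = torch.randn_like(x0)
    pred = model(x0 + noise, cond)       # schedule omitted for brevity
    loss = ((pred - noise) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```
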
### Condition Integration in the Re-purposing Stage
- **Pretraining is all you need for image-to-image translation**
- **T2i-adapter: Learning adapters to dig out more controllable ability for text-to-image diffusion models**
- **Adding conditional control to text-to-image diffusion models**
- **Pair-diffusion: Object-level image editing with structure-and-appearance paired diffusion models**
- **Taming encoder for zero fine-tuning image customization with text-to-image diffusion models**
- **Instantbooth: Personalized text-to-image generation without test-time finetuning**
- **Blip-diffusion: pre-trained subject representation for controllable text-to-image generation and editing**
- **Fastcomposer: Tuning-free multi-subject image generation with localized attention**
- **Prompt-free diffusion: Taking "text" out of text-to-image diffusion models**
- **Paste, inpaint and harmonize via denoising: Subject-driven image editing with pre-trained diffusion model**
- **Subject-diffusion: Open domain personalized text-to-image generation without test-time fine-tuning**
- **Imagebrush: Learning visual in-context instructions for exemplar-based image manipulation**
- **Guiding instruction-based image editing via multimodal large language models**
- **Ranni: Taming text-to-image diffusion for accurate instruction following**
- **Smartedit: Exploring complex instruction-based image editing with multimodal large language models**
- **Instructany2pix: Flexible visual editing via multimodal instruction following**
- **Warpdiffusion: Efficient diffusion model for high-fidelity virtual try-on**
- **Coarse-to-fine latent diffusion for pose-guided person image synthesis**
- **Lightit: Illumination modeling and control for diffusion models**
- **Face2diffusion for fast and editable face personalization**
- **GLIGEN: open-set grounded text-to-image generation**
- **Elite: Encoding visual concepts into textual embeddings for customized text-to-image generation**
- **Ip-adapter: Text compatible image prompt adapter for text-to-image diffusion models**
- **Interactdiffusion: Interaction control in text-to-image diffusion models**
- **Instancediffusion: Instance-level control for image generation**
- **Deadiff: An efficient stylization diffusion model with disentangled representations**
- **Instructpix2pix: Learning to follow image editing instructions**
- **Paint by example: Exemplar-based image editing with diffusion models**
- **Objectstitch: Object compositing with diffusion model**
- **Smartbrush: Text and shape guided object inpainting with diffusion model**
- **Imagen editor and editbench: Advancing and evaluating text-guided image inpainting**
- **Reference-based image composition with sketch via structure-aware diffusion model**
- **Dialogpaint: A dialog-based image editing model**
- **Hive: Harnessing human feedback for instructional visual editing**
- **Inst-inpaint: Instructing to remove objects with diffusion models**
- **Text-to-image editing by image information removal**
- **Magicbrush: A manually annotated dataset for instruction-guided image editing**
- **Anydoor: Zero-shot object-level image customization**
- **Instructdiffusion: A generalist modeling interface for vision tasks**
- **Emu edit: Precise image editing via recognition and generation tasks**
- **Dreaminpainter: Text-guided subject-driven image inpainting with diffusion models**
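
Re-purposing-stage methods keep the pretrained denoiser frozen and train a light branch that injects the new condition, in the spirit of ControlNet and T2I-Adapter. A minimal sketch; the zero-initialized adapter starts as a no-op, so training departs smoothly from the unmodified backbone:

```python
# Frozen backbone plus trainable condition adapter (toy sketch).
import torch

class FrozenBackbone(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Conv2d(3, 3, 3, padding=1)
    def forward(self, x, residual):
        return self.net(x) + residual  # adapter features enter additively

class Adapter(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Conv2d(3, 3, 3, padding=1)
        torch.nn.init.zeros_(self.net.weight)  # zero-init: starts as a no-op
        torch.nn.init.zeros_(self.net.bias)
    def forward(self, cond):
        return self.net(cond)

backbone, adapter = FrozenBackbone(), Adapter()
for p in backbone.parameters():
    p.requires_grad_(False)  # only the adapter is trained

opt = torch.optim.Adam(adapter.parameters(), lr=1e-3)
x = torch.randn(2, 3, 16, 16)    # noisy latents
cond = torch.rand(2, 3, 16, 16)  # e.g. an edge or depth map
noise = torch.randn_like(x)
pred = backbone(x, adapter(cond))
loss = ((pred - noise) ** 2).mean()
loss.backward()
opt.step()
```
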