Awesome-Controllable-Diffusion
Papers and resources on Controllable Generation using Diffusion Models, including ControlNet, DreamBooth, T2I-Adapter, IP-Adapter.
https://github.com/atfortes/Awesome-Controllable-Diffusion
Diffusion Models
- Cross-Image Attention for Zero-Shot Appearance Transfer.
- The Chosen One: Consistent Characters in Text-to-Image Diffusion Models.
- MagicDance: Realistic Human Dance Video Generation with Motions & Facial Expressions Transfer.
- PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding.
- Multi-LoRA Composition for Image Generation
- InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning.
- ControlVideo: Conditional Control for One-shot Text-driven Video Editing and Beyond.
- ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs.
- StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter.
- Style Aligned Image Generation via Shared Attention.
- Context Diffusion: In-Context Aware Image Generation.
- PALP: Prompt Aligned Personalization of Text-to-Image Models.
- Training-Free Consistent Text-to-Image Generation
- InstanceDiffusion: Instance-level Control for Image Generation
- Text2Street: Controllable Text-to-image Generation for Street Views
- ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback
- Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
- MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation
- Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding
- StyleBooth: Image Style Editing with Multimodal Instruction
- FaceStudio: Put Your Face Everywhere in Seconds.
- Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs.
- DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation.
- Subject-driven Text-to-Image Generation via Apprenticeship Learning.
- StyleDrop: Text-to-Image Generation in Any Style.
- DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models.
- An Image is Worth Multiple Words: Learning Object Level Concepts using Multi-Concept Prompt Learning.
- CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models.
- BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing
- Kosmos-G: Generating Images in Context with Multimodal Large Language Models
- Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding
- Adding Conditional Control to Text-to-Image Diffusion Models.
- UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion.
- T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models.
- Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition
- Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions
- ComFusion: Personalized Subject Generation in Multiple Specific Scenes From Single Image
- Visual Style Prompting with Swapping Self-Attention
- Direct Consistency Optimization for Compositional Text-to-Image Personalization
- MuLan: Multimodal-LLM Agent for Progressive Multi-Object Diffusion
- RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models
- MultiBooth: Towards Generating All Your Concepts in an Image from Text
- InstantID: Zero-shot Identity-Preserving Generation in Seconds.
- λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
- IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models
- Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
- FlashFace: Human Image Personalization with High-fidelity Identity Preservation
- Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models
- Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models
- MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models
- Face0: Instantaneously Conditioning a Text-to-Image Model on a Face.
- Controlling Text-to-Image Diffusion by Orthogonal Finetuning.
- Zero-shot spatial layout conditioning for text-to-image diffusion models.
- IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models.
- StyleAdapter: A Single-Pass LoRA-Free Model for Stylized Image Generation.
- DreamTuner: Single Image is Enough for Subject-Driven Generation.
- InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation
- StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
- Customizing Text-to-Image Models with a Single Image Pair
- MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
- ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning
- PuLID: Pure and Lightning ID Customization via Contrastive Alignment
- Compositional Text-to-Image Generation with Dense Blob Representations
- FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition
- RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control
Consistency Models
- CCM: Adding Conditional Controls to Text-to-Image Consistency Models
- PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models
Benchmark
Technique
- [Project - liu/LLaVA)], 2023.10
- [Project - CAIR/MiniGPT-4)], 2023.4
- [Paper - gpt.github.io/)], 2023.7
- [Paper - stanford/med-flamingo)], 2023.7
- [Paper - VL)], 2023.8
- [Project - research/google-research/tree/master/socraticmodels)], 2022.4
- [Paper - science/mm-cot)], 2023.2
- [Paper - chatgpt)], 2023.3
- [Project - REACT)] [[Demo](https://huggingface.co/spaces/microsoft-cognitive-service/mm-react)], 2023.3
- [Paper - portal/Link-Context-Learning)], 2023.8
- [Project - iep)], 2017.5
- [Project - vqa)], 2018.10
- [Project - columbia/viper)], 2023.3
- [Project - llm)], 2023.4
- [Project - vid.github.io/#video-demos)], 2023.10
Star History
- [Star History Chart](https://star-history.com/#atfortes/Awesome-Controllable-Generation&Timeline)