Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/Nihukat/Concept-Conductor
https://github.com/Nihukat/Concept-Conductor
Last synced: about 2 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/Nihukat/Concept-Conductor
- Owner: Nihukat
- Created: 2024-08-02T09:53:26.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-11-16T12:48:41.000Z (2 months ago)
- Last Synced: 2024-11-16T13:33:32.682Z (2 months ago)
- Language: Python
- Size: 129 MB
- Stars: 8
- Watchers: 1
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-diffusion-categorized - [Code
README
Concept Conductor
Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis
Zebin Yao, Fangxiang Feng, Ruifan Li, Xiaojie Wang
Beijing University of Posts and Telecommunications
[![Project Website](https://img.shields.io/badge/Project-Website-orange)](https://nihukat.github.io/Concept-Conductor/)
[![arXiv](https://img.shields.io/badge/arXiv-<2408.03632>-.svg)](https://arxiv.org/abs/2408.03632)## 🔍 Results
### Combination of 2 Concepts:
### Combination of More Than 2 Concepts:
## 🛠️ Installation
```bash
git clone https://github.com/Nihukat/Concept-Conductor.git
cd Concept-Conductor
pip install -r requirements.txt
```## 📝 Preparation
### 1. Download Pretrained Text-to-Image Models.
We implemented our method on both Stable Diffusion 1.5 and SDXL 1.0 respectively.
For Stable Diffusion 1.5, we adopt [ChilloutMix](https://civitai.com/models/6424/chilloutmix) for real-world concepts and [Anything-v4](https://huggingface.co/xyn-ai/anything-v4.0) for anime concepts.
```bash
cd experiments/pretrained_models# Diffusers-version ChilloutMix
git-lfs clone https://huggingface.co/windwhinny/chilloutmix.git# Diffusers-version Anything-v4
git-lfs clone https://huggingface.co/xyn-ai/anything-v4.0.git
```For SDXL 1.0, we adopt [RealVisXL V5.0](https://civitai.com/models/139562?modelVersionId=789646) for real-world concepts and [Anything-XL](https://civitai.com/models/9409/or-anything-xl) for anime concepts.
```bash
cd experiments/pretrained_models# Diffusers-version RealVisXL V5.0
git-lfs clone https://huggingface.co/SG161222/RealVisXL_V5.0.git# Diffusers-version Anything-XL
git-lfs clone https://huggingface.co/eienmojiki/Anything-XL.git
```### 2. (Optional) Train ED-LoRAs.
We adopt ED-LoRAs (proposed in [Mix-of-Show](https://github.com/TencentARC/Mix-of-Show)) as single-concept customization models.
If you want to train ED-LoRAs yourself, you can download the training data we used in our paper on [Google Drive](https://drive.google.com/drive/folders/1roYyOL7e5Ivx3lvLAXz8XKY00sDLC377?usp=drive_link).You can also construct personalized concept datasets with your own custom images and corresponding text captions, referring to the structure of our dataset directory.
We provide training scripts for both Stable Diffusion 1.5 and SDXL 1.0.
**For Stable Diffusion 1.5 :**
```bash
# Train ED-LoRAs for real-world concepts
python train_edlora.py -opt configs/edlora/train/chow_dog.yml# Train ED-LoRAs for anime concepts
python train_edlora.py -opt configs/edlora/train/mitsuha_girl.yml
```**For SDXL 1.0 :**
```bash
# Train ED-LoRAs for real-world concepts
python train_edlora_sdxl.py -opt configs/edlora/train_sdxl/chow_dog.yml# Train ED-LoRAs for anime concepts
python train_edlora_sdxl.py -opt configs/edlora/train_sdxl/mitsuha_girl.yml
```### 3. (Optional) Download our trained ED-LoRAs.
To quickly reimplement our method, you can download our trained ED-LoRAs from [Google Drive](https://drive.google.com/drive/folders/1roYyOL7e5Ivx3lvLAXz8XKY00sDLC377?usp=drive_link).
## 🚀 Usage
### Generate multiple personalized concepts in an image
**For Stable Diffusion 1.5 :**
```bash
python sample.py \
--ref_prompt "A dog and a cat in the street." \
--base_prompt "A dog and a cat on the beach." \
--custom_prompts "A on the beach." "A on the beach."\
--ref_image_path "examples/a dog and a cat in the street.png" \
--ref_mask_paths "examples/a dog and a cat in the street_mask1.png" "examples/a dog and a cat in the street_mask2.png" \
--edlora_paths "experiments/ED-LoRAs/real/chow_dog.pth" "experiments/ED-LoRAs/real/siberian_cat.pth" \
--start_seed 0 \
--batch_size 4 \
--n_batches 1```
You can also pass parameters using a configuration file (like ./configs/sample_config.yaml) :
```bash
python sample.py --config_file "path/to/your/config.yaml"
```**For SDXL 1.0 :**
```bash
python sample_sdxl.py \
--ref_prompt "A cat on a stool and a dog on the floor." \
--base_prompt "A cat on a stool and a dog on the floor." \
--custom_prompts "A on a stool and a on the floor." "A on a stool and a on the floor."\
--ref_image_path "examples/a cat on a stool and a dog on the floor.png" \
--ref_mask_paths "examples/a cat on a stool and a dog on the floor_mask1.png" "examples/a cat on a stool and a dog on the floor_mask2.png" \
--edlora_paths "experiments/SDXL_ED-LoRAs/real/siberian_cat.pth" "experiments/SDXL_ED-LoRAs/real/chow_dog.pth" \
--start_seed 0 \
--batch_size 1 \
--n_batches 4```
You can also pass parameters using a configuration file (like ./configs/sample_config_sdxl.yaml) :
```bash
python sample_sdxl.py --config_file "path/to/your/config.yaml"
```## ✅ To-Do List
- [ ] Create a gradio demo.
- [ ] Add more usage and applications.
- [x] Add support for SDXL.
- [x] Release the training data and trained models.
- [x] Release the source code.## 📚 Citation
If you find this code useful for your research, please consider citing:
```
@article{yao2024concept,
title={Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis},
author={Yao, Zebin and Feng, Fangxiang and Li, Ruifan and Wang, Xiaojie},
journal={arXiv preprint arXiv:2408.03632},
year={2024}
}
```