https://github.com/Lingzhi-Pan/PILOT
Official implementation of the work "Coherent and Multi-modality Image Inpainting via Latent Space Optimization"
- Host: GitHub
- URL: https://github.com/Lingzhi-Pan/PILOT
- Owner: Lingzhi-Pan
- License: mit
- Created: 2024-06-08T17:31:46.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-21T02:08:17.000Z (10 months ago)
- Last Synced: 2024-08-21T03:24:28.168Z (10 months ago)
- Language: Python
- Homepage:
- Size: 14.7 MB
- Stars: 33
- Watchers: 2
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-diffusion-categorized
README
# PILOT: Coherent and Multi-modality Image Inpainting via Latent Space Optimization
Official implementation of PILOT.
[Lingzhi Pan](https://github.com/Lingzhi-Pan), [Tong Zhang](https://people.epfl.ch/tong.zhang?lang=en), [Bingyuan Chen](https://github.com/Alex-Lord), [Qi Zhou](https://github.com/zaqai), [Wei Ke](https://gr.xjtu.edu.cn/en/web/wei.ke), [Sabine Susstrunk](https://people.epfl.ch/sabine.susstrunk), [Mathieu Salzmann](https://people.epfl.ch/mathieu.salzmann)

## Method Overview

## Getting Started
We recommend creating a dedicated Python virtual environment, for example with conda. Then install a PyTorch build that matches your CUDA version and the packages listed in `requirements.txt`:
```
git clone https://github.com/Lingzhi-Pan/PILOT.git
cd PILOT
conda create -n pilot python==3.9
conda activate pilot
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
```
You can download the `stable-diffusion-v1-5` model from https://huggingface.co/runwayml/stable-diffusion-v1-5 and save it to a local path.
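As a convenience, the checkpoint can be fetched with `huggingface_hub`; this is only a sketch, and the target directory `./checkpoints/stable-diffusion-v1-5` is an arbitrary choice, not a path expected by PILOT.
```python
# Hedged sketch: download stable-diffusion-v1-5 to a local directory of your choice.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="runwayml/stable-diffusion-v1-5",
    local_dir="./checkpoints/stable-diffusion-v1-5",  # arbitrary local path
)
```
Whatever directory you choose here is what the configs' `model_path` should point to.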
## Run Examples
We provide three types of conditions to guide the inpainting process: text, spatial controls, and reference images. Each condition corresponds to a different configuration file in the `configs/` directory.

### Text-guided
Modify the `model_path` parameter in the config file so that it points to the directory where you saved the SD model, then run:
```
python run_example.py --config_file configs/t2i_step50.yaml
```
### Text + Spatial Controls
To introduce spatial controls, PILOT supports both ControlNet and T2I-Adapter; we recommend ControlNet. First, download a ControlNet checkpoint published by Lvmin Zhang, for example the one conditioned on scribble images: https://huggingface.co/lllyasviel/sd-controlnet-scribble. Then run:
```
python run_example.py --config_file configs/controlnet_step30.yaml
```
You can also download other ControlNet models published by Lvmin Zhang to enable inpainting with other conditions such as Canny edge maps, segmentation maps, and normal maps.
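If you want to verify that a ControlNet checkpoint is available locally before running PILOT, a diffusers call such as the one below will download and cache it. This is only a hedged illustration; PILOT itself selects and loads the model through the YAML config, not through this snippet.
```python
# Hedged sketch: pre-fetch and sanity-check a ControlNet checkpoint with diffusers.
import torch
from diffusers import ControlNetModel

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble",  # swap in sd-controlnet-canny, -seg, -normal, ...
    torch_dtype=torch.float16,
)
```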
### Text + Reference Image
Download the IP-Adapter checkpoint from https://huggingface.co/h94/IP-Adapter (a hedged download sketch is given after the command below), then run:
```
python run_example.py --config_file configs/ipa_step50.yaml
```
### Text + Spatial Controls + Reference Image
You can also use ControlNet and IP-Adapter together to achieve multi-condition controls:
```
python run_example.py --config_file configs/ipa_controlnet_step30.yaml
```
### Personalized Image Inpainting
You can also integrate LoRA into the base model or replace the base model with another personalized text-to-image (T2I) model to achieve personalized image inpainting. For example, replacing the base model with a T2I model fine-tuned with DreamBooth on several photos of a cute dog lets PILOT generate that dog inside the masked region while preserving its identity.
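As a rough, hedged illustration only (the paths, LoRA scale, and the idea of pre-fusing the weights are placeholders, not part of the PILOT repo): with diffusers you could merge a LoRA into the base model once, and then point the config's `model_path` at the result.
```python
# Hedged sketch: fuse a personalized LoRA into a Stable Diffusion base model.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("/path/to/stable-diffusion-v1-5")
pipe.load_lora_weights("/path/to/my-dog-lora")    # hypothetical LoRA checkpoint
pipe.fuse_lora(lora_scale=0.8)                    # bake the LoRA into the base weights
pipe.save_pretrained("/path/to/personalized-sd")  # then set model_path to this directory
```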
**See our [Paper](https://arxiv.org/abs/2407.08019) for more information!**
## BibTeX
If you find this work helpful, please consider citing:
```bibtex
@article{pan2024coherent,
title={Coherent and Multi-modality Image Inpainting via Latent Space Optimization},
author={Pan, Lingzhi and Zhang, Tong and Chen, Bingyuan and Zhou, Qi and Ke, Wei and S{\"u}sstrunk, Sabine and Salzmann, Mathieu},
journal={arXiv preprint arXiv:2407.08019},
year={2024}
}
```