https://github.com/kangyeolk/Paint-by-Sketch

Stable Diffusion-based image manipulation method with a sketch and reference image
- Host: GitHub
- URL: https://github.com/kangyeolk/Paint-by-Sketch
- Owner: kangyeolk
- License: MIT
- Created: 2022-12-13T12:57:10.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-04-23T12:45:25.000Z (about 2 years ago)
- Last Synced: 2024-08-01T18:40:04.526Z (9 months ago)
- Topics: computer-vision, diffusion-models, image-editing, image-generation, image-manipulation, pytorch, stable-diffusion
- Language: Python
- Size: 30.1 MB
- Stars: 172
- Watchers: 8
- Forks: 9
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project:
- awesome-diffusion-categorized

README
# Paint-by-Sketch
### [Paper](https://arxiv.org/abs/2304.09748)
[Kangyeol Kim](https://kangyeolk.github.io/), [Sunghyun Park](https://psh01087.github.io/), [Junsoo Lee](https://ssuhan.github.io/) and [Jaegul Choo](https://sites.google.com/site/jaegulchoo/?pli=1).
### Teaser
### Multi-backgrounds
### Multi-references
## Abstract
>Recent remarkable improvements in large-scale text-to-image generative models have shown promising results in generating high-fidelity images. To further enhance editability and enable fine-grained generation, we introduce a multi-input-conditioned image composition model that incorporates a sketch as a novel modality, alongside a reference image. Thanks to the edge-level controllability of sketches, our method enables a user to edit or complete an image sub-part with a desired structure (i.e., sketch) and content (i.e., reference image). Our framework fine-tunes a pre-trained diffusion model to complete missing regions using the reference image while maintaining sketch guidance. Albeit simple, this opens wide opportunities to fulfill user needs for obtaining desired images. Through extensive experiments, we demonstrate that our proposed method offers unique use cases for image manipulation, enabling user-driven modifications of arbitrary scenes.
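To make the multi-input conditioning concrete, here is a minimal sketch of how the UNet input could be assembled from the mask, the masked image, and the sketch. The channel layout is an assumption inferred from the Paint-by-Example lineage and the `model-modified-12channel.ckpt` checkpoint name below, not the repository's exact code:

```python
# Hypothetical assembly of the multi-condition UNet input. Assumed layout:
# 4 noisy-latent + 4 masked-image-latent + 1 mask + 3 sketch channels = 12,
# matching the "12channel" checkpoint name; the real code may differ.
import torch

B, H, W = 1, 64, 64                        # latent-space resolution (illustrative)
noisy_latent  = torch.randn(B, 4, H, W)    # x_t, the latent being denoised
masked_latent = torch.randn(B, 4, H, W)    # VAE latent of the masked source image
mask   = torch.rand(B, 1, H, W)            # inpainting mask, resized to latent size
sketch = torch.rand(B, 3, H, W)            # sketch condition, resized likewise

# Channel-wise concatenation; the reference image would additionally condition
# the model through cross-attention, as in Paint-by-Example.
unet_input = torch.cat([noisy_latent, masked_latent, mask, sketch], dim=1)
print(unet_input.shape)  # torch.Size([1, 12, 64, 64])
```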
## Environment & Pre-trained models
### Dependencies
```
$ conda env create -f environment.yaml
$ conda activate paint_sketch
$ pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
$ pip install opencv-python==4.6.0.66 opencv-python-headless==4.6.0.66 matplotlib==3.2.2 streamlit==1.14.1 streamlit-drawable-canvas==0.9.2
$ pip install git+https://github.com/openai/CLIP.git
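# Optional sanity check: should print "1.11.0+cu113 True" on a CUDA machine
$ python -c "import torch; print(torch.__version__, torch.cuda.is_available())"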
```

### Download checkpoints
* [Google Drive](https://drive.google.com/file/d/1jCDamfsknQj-A28RMA6Blws8iQn0y1ez/view?usp=share_link)
* Place the downloaded file as shown below:
```
Paint-by-Sketch
pretrained_models/
model-modified-12channel.ckpt
models/
Cartoon_v1_aesthetic/
...
...
```

## Data preparation
* Sketch extraction
```bash
bash preprocess_dataset/run_preprocess.sh
# e.g.,
# bash preprocess_dataset/run_preprocess.sh /home/nas2_userF/kangyeol/Project/webtoon2022/Paint-by-Sketch/samples 7
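# Given an IMAGE_ROOT folder, the script produces the images/ and sketch_bin/
# layout shown under "Result" below (an inference from that tree; the role of
# the trailing numeric argument is not documented here).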
```

* Result
```
IMAGE_ROOT
images/
000000.png
000001.png
...
sketch_bin/
000000.png
000001.png
...
sketch(Not used)/
...
...
```

## Training
```bash
bash cartoon_train.sh
# e.g.,
# bash cartoon_train.sh 0,1 models/test configs/v1_aesthetic_sketch_image.yaml
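# Argument order, inferred from the example above:
#   <gpu_ids> <output/model dir> <config yaml>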
```

* The number of `gpu_ids` passed to `cartoon_train.sh` must match the number of GPUs set for the Lightning trainer in the config yaml (e.g., passing `0,1` requires a two-GPU trainer setting; `gpu_ids=2,3` likewise pairs with a two-GPU config); a pre-flight check is sketched below.
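A small helper like the following (hypothetical, not part of the repo) can catch a mismatch before launching. The `lightning.trainer.gpus` key path inside the yaml is an assumption and may need adjusting to the config's actual structure:

```python
# Hypothetical pre-flight check: compare the GPU ids you intend to pass to
# cartoon_train.sh against the trainer setting in the config yaml.
import sys
import yaml

config_path, gpu_ids = sys.argv[1], sys.argv[2]  # e.g. configs/v1_aesthetic_sketch_image.yaml 0,1
with open(config_path) as f:
    cfg = yaml.safe_load(f)

n_requested = len([g for g in gpu_ids.split(",") if g])

# Key path is an assumption; adapt it to the config's actual structure.
gpus = cfg["lightning"]["trainer"]["gpus"]
if isinstance(gpus, int):                 # e.g. gpus: 2 means "use 2 GPUs"
    n_config = gpus
else:                                     # e.g. gpus: "0,1" lists device ids
    n_config = len([g for g in str(gpus).strip(",").split(",") if g])

assert n_requested == n_config, f"{n_requested} GPU ids passed vs {n_config} in {config_path}"
print(f"OK: {n_requested} GPUs in both places")
```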
## Demo
0. Run the `streamlit` server
```bash
streamlit run demo/app.py --server.port=8507 --server.fileWatcherType none
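# The demo is then served at http://localhost:8507; --server.fileWatcherType none
# disables Streamlit's file watcher to avoid spurious reloads.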
```

1. Upload the source image
2. Draw mask and sketch separately
* The 1st and 2nd canvases are panels where you can draw masks and sketches.
* In the 3rd canvas, you can view the drawn mask and sketch overlaid together.
3. Upload a reference image.
* Select an image in the left panel.
* Click the `Read Exemplar` button.
* Crop the image partially with a bounding box.
4. Inference and export
* Perform inference with the drawn mask, sketch, and the cropped image as conditions.
* You can adjust the `scale` and `sketch strength` in the left panel.
* You can save images in grid format through the `Export` button.

## Issues
* If the screen is not large enough and the canvas is resized, the drawn mask and sketch can become misaligned.
## Citation
```
@misc{kim2023referencebased,
title={Reference-based Image Composition with Sketch via Structure-aware Diffusion Model},
author={Kangyeol Kim and Sunghyun Park and Junsoo Lee and Jaegul Choo},
year={2023},
eprint={2304.09748},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```

## License
The code in this repository is released under the MIT License.

## Acknowledgements
This code borrows heavily from [Stable Diffusion](https://github.com/CompVis/stable-diffusion) and [Paint-by-Example](https://github.com/Fantasy-Studio/Paint-by-Example).