https://github.com/YSerin/ZeCon
- Host: GitHub
- URL: https://github.com/YSerin/ZeCon
- Owner: YSerin
- License: mit
- Created: 2023-04-12T02:08:25.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-01-04T11:16:48.000Z (over 1 year ago)
- Last Synced: 2024-08-01T18:37:57.969Z (about 1 year ago)
- Language: Jupyter Notebook
- Size: 4.44 MB
- Stars: 57
- Watchers: 1
- Forks: 8
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-diffusion-categorized
README
# Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer (ICCV 2023 Accepted)
[arXiv:2303.08622](https://arxiv.org/abs/2303.08622)
## Abstract
> Diffusion models have shown great promise in text-guided image style transfer, but there is a trade-off between style transformation and content preservation due to their stochastic nature. Existing methods require computationally expensive fine-tuning of diffusion models or an additional neural network. To address this, here we propose a zero-shot contrastive loss for diffusion models that doesn't require additional fine-tuning or auxiliary networks. By leveraging patch-wise contrastive loss between generated samples and original image embeddings in the pre-trained diffusion model, our method can generate images with the same semantic content as the source image in a zero-shot manner. Our approach outperforms existing methods while preserving content and requiring no additional training, not only for image style transfer but also for image-to-image translation and manipulation. Our experimental results validate the effectiveness of our proposed method.

## How to Use
### Environment setting
**Python** 3.8.5 \
**Torch** 1.11.0
```
$ conda env create -f environment.yml
$ conda activate zecon
```
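If conda is not available, a roughly equivalent setup can be sketched with pip; the package list below is an assumption (the repository's ```environment.yml``` is authoritative):
```
$ python -m venv zecon
$ source zecon/bin/activate
$ pip install torch==1.11.0 torchvision==0.12.0 ftfy regex tqdm
$ pip install git+https://github.com/openai/CLIP.git
```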
Our source code relies on [blended diffusion](https://github.com/omriav/blended-diffusion).

### Pre-trained model
Download the model weights trained on the [ImageNet](https://github.com/openai/guided-diffusion) and [FFHQ](https://github.com/jychoi118/ilvr_adm) datasets, respectively. Create a folder ```'./ckpt/'``` and place the downloaded weights into it.
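For example, the checkpoint folder can be prepared as follows; the filenames shown are assumptions based on the linked repositories, so keep whatever names the downloaded weights actually ship with:
```
$ mkdir -p ./ckpt
# Place the downloaded weights here, e.g. (assumed filenames):
#   ./ckpt/256x256_diffusion_uncond.pt   # ImageNet model from guided-diffusion
#   ./ckpt/ffhq_10m.pt                   # FFHQ model from ilvr_adm
```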
### Image manipulation
In order to manipulate an image, run:
```
python main.py --output_path './results' --init_image './src_image/imagenet3.JPEG' --data 'imagenet' --prompt_tgt 'a sketch with crayon' --prompt_src 'Photo' \
--skip_timesteps 25 --timestep_respacing 50 --diffusion_type 'ddim_ddpm' --l_clip_global 0 --l_clip_global_patch 10000 --l_clip_dir 0 --l_clip_dir_patch 20000 \
--l_zecon 500 --l_mse 5000 --l_vgg 100 --patch_min 0.01 --patch_max 0.3
```
+ The path to the source image is given to the flag ```--init_image```
+ The flag ```--data``` indicates the pretrained diffusion model. If you manipulate face data, choose 'ffhq'.
+ The text prompt for the target style is given to the flag ```--prompt_tgt```
+ The text prompt for the style of the source image is given to the flag ```--prompt_src```
+ The flag ```--skip_timesteps``` indicates how many of the respaced diffusion steps to skip: sampling starts from a partially noised version of the source image rather than from pure noise, so a larger value runs fewer denoising steps and preserves more of the source content.
+ The flag ```--timestep_respacing``` indicates the total number of diffusion steps after respacing the original sampling schedule (e.g., 50 steps instead of 1000).
+ Diffusion sampling types are given to the flag ```--diffusion_type```. The first part is for the forward (noising) step and the second part is for the reverse (denoising) step; e.g., 'ddim_ddpm' uses DDIM for the forward step and DDPM for the reverse step.
+ To further modulate the style, you can increase the four loss weights below.
+ The flag ```--l_clip_global``` indicates the weight for CLIP global loss.
+ The flag ```--l_clip_global_patch``` indicates the weight for patch-based CLIP global loss.
+ The flag ```--l_clip_dir``` indicates the weight for CLIP directional loss.
+ The flag ```--l_clip_dir_patch``` indicates the weight for patch-based CLIP directional loss.
+ To further preserve the content, you can increase the three loss weights below.
+ The flag ```--l_zecon``` indicates the weight for ZeCon loss.
+ The flag ```--l_mse``` indicates the weight for MSE loss.
+ The flag ```--l_vgg``` indicates the weight for VGG loss.
+ ***Tips!*** You can refer to Table 5 in the [paper](https://arxiv.org/pdf/2303.08622.pdf) for the weights of the losses.
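For face images, a corresponding call with the FFHQ model might look like the following; the source path, prompts, and loss weights here are illustrative assumptions rather than values from the paper:
```
python main.py --output_path './results' --init_image './src_image/face.png' --data 'ffhq' --prompt_tgt 'a watercolor painting' --prompt_src 'Photo' \
--skip_timesteps 25 --timestep_respacing 50 --diffusion_type 'ddim_ddpm' --l_clip_global 0 --l_clip_global_patch 10000 --l_clip_dir 0 --l_clip_dir_patch 20000 \
--l_zecon 500 --l_mse 5000 --l_vgg 100 --patch_min 0.01 --patch_max 0.3
```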
## BibTeX
```
@article{yang2023zero,
  title={Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer},
  author={Yang, Serin and Hwang, Hyunmin and Ye, Jong Chul},
  journal={arXiv preprint arXiv:2303.08622},
  year={2023}
}
```