https://github.com/open-mmlab/styleshot

StyleShot: A SnapShot on Any Style. A model that can transfer any style onto any content, producing high-quality, personalized stylized images without per-image fine-tuning!
- Host: GitHub
- URL: https://github.com/open-mmlab/styleshot
- Owner: open-mmlab
- License: mit
- Created: 2024-07-01T05:08:12.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-02-11T11:00:48.000Z (4 months ago)
- Last Synced: 2025-04-12T04:45:27.754Z (2 months ago)
- Topics: controllable-generation, style-transfer, text-to-image
- Language: Python
- Homepage: https://styleshot.github.io/
- Size: 97.1 MB
- Stars: 372
- Watchers: 4
- Forks: 23
- Open Issues: 26
Metadata Files:
- Readme: README.md
- License: LICENSE
# ___***StyleShot: A SnapShot on Any Style***___
_**[Junyao Gao](https://jeoyal.github.io/home/), Yanchen Liu, [Yanan Sun](https://scholar.google.com/citations?hl=zh-CN&user=6TA1oPkAAAAJ)‡, Yinhao Tang, [Yanhong Zeng](https://zengyh1900.github.io/), [Kai Chen*](https://chenkai.site/), [Cairong Zhao*](https://vill-lab.github.io/)**_
(* corresponding authors, ‡ project leader). From Tongji University and Shanghai AI Lab.
## Abstract
In this paper, we show that a good style representation is crucial and sufficient for generalized style transfer without test-time tuning.
We achieve this by constructing a style-aware encoder and a well-organized style dataset called StyleGallery.
With a dedicated design for style learning, the style-aware encoder is trained with a decoupling training strategy to extract expressive style representations, and StyleGallery enables its generalization ability.
We further employ a content-fusion encoder to enhance image-driven style transfer.
We highlight that our approach, named StyleShot, is simple yet effective in mimicking various desired styles, e.g., 3D, flat, abstract, or even fine-grained styles, without test-time tuning. Rigorous experiments validate that StyleShot achieves superior performance across a wide range of styles compared to existing state-of-the-art methods.
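For intuition only, the sketch below shows what a style-aware encoder of this kind can look like: a frozen CLIP vision backbone extracts image features, and a small trainable projection maps them into a fixed number of style tokens that a diffusion UNet could attend to via cross-attention. This is a conceptual illustration, not the repository's implementation; the backbone name, token count, and dimensions are assumptions.

```python
# Conceptual sketch only -- NOT StyleShot's actual implementation.
# A frozen vision backbone plus a trainable projection yields style tokens
# that a diffusion UNet could consume through cross-attention.
import torch.nn as nn
from transformers import CLIPVisionModel

class StyleAwareEncoderSketch(nn.Module):
    def __init__(self, clip_name="openai/clip-vit-base-patch32",
                 num_style_tokens=8, cross_attention_dim=768):
        super().__init__()
        self.backbone = CLIPVisionModel.from_pretrained(clip_name)
        self.backbone.requires_grad_(False)  # keep the backbone frozen
        hidden = self.backbone.config.hidden_size
        # trainable head: map the pooled image feature to a fixed set of style tokens
        self.proj = nn.Linear(hidden, num_style_tokens * cross_attention_dim)
        self.num_style_tokens = num_style_tokens
        self.cross_attention_dim = cross_attention_dim

    def forward(self, pixel_values):
        # pixel_values: (B, 3, 224, 224) preprocessed style images
        feats = self.backbone(pixel_values).pooler_output            # (B, hidden)
        tokens = self.proj(feats)                                     # (B, T*D)
        return tokens.view(-1, self.num_style_tokens, self.cross_attention_dim)

# style_tokens = StyleAwareEncoderSketch()(style_batch) would then be injected
# into the UNet's cross-attention layers alongside the text embeddings.
```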
## News
- [2024/8/29] 🔥 Thanks to @neverbiasu's contribution, StyleShot is now available in [ComfyUI](https://github.com/neverbiasu/ComfyUI-StyleShot).
- [2024/7/5] 🔥 We release an [online demo](https://huggingface.co/spaces/nowsyn/StyleShot) on HuggingFace.
- [2024/7/3] 🔥 We release [StyleShot_lineart](https://huggingface.co/Gaojunyao/StyleShot_lineart), a version that takes the lineart of the content image as control.
- [2024/7/2] 🔥 We release the [paper](https://arxiv.org/abs/2407.01414).
- [2024/7/1] 🔥 We release the code, [checkpoint](https://huggingface.co/Gaojunyao/StyleShot), [project page](https://styleshot.github.io/) and [online demo](https://openxlab.org.cn/apps/detail/lianchen/StyleShot).

## Start
```
# install styleshot
git clone https://github.com/Jeoyal/StyleShot.git
cd StyleShot

# create conda env
conda create -n styleshot python==3.8
conda activate styleshot
pip install -r requirements.txt

# download the models
git lfs install
git clone https://huggingface.co/Gaojunyao/StyleShot
git clone https://huggingface.co/Gaojunyao/StyleShot_lineart
```
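Because the checkpoints come through Git LFS, a failed `git lfs` setup can leave small pointer stubs instead of the actual weights. Below is a small optional check, assuming the two Hugging Face repositories were cloned into `./StyleShot` and `./StyleShot_lineart` as in the commands above.

```python
# Optional sanity check: verify that git-lfs actually pulled the weight files
# instead of leaving pointer stubs. Run from the repository root.
from pathlib import Path

def check_weights(folder, min_bytes=1_000_000):
    for f in Path(folder).rglob("*"):
        if f.is_file() and f.suffix in {".bin", ".safetensors", ".ckpt"}:
            size = f.stat().st_size
            status = "OK" if size >= min_bytes else "looks like an LFS pointer, re-run `git lfs pull`"
            print(f"{f} ({size / 1e6:.1f} MB): {status}")

for folder in ["StyleShot", "StyleShot_lineart"]:
    check_weights(folder)
```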
## Models

You can download our pretrained weights from [here](https://huggingface.co/Gaojunyao/StyleShot). To run the demos, you should also download the following models (see the download sketch after this list):
- [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5)
- [T2I-Adapter Models](https://huggingface.co/TencentARC)
- [ControlNet models](https://huggingface.co/lllyasviel)
- [CLIP Model](https://huggingface.co/laion/CLIP-ViT-H-14-laion2B-s32B-b79K)
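If you prefer scripting the downloads, `huggingface_hub` can fetch these repositories programmatically. This is a minimal sketch using the repo IDs linked above; the specific T2I-Adapter and ControlNet checkpoints you need depend on which conditions you plan to use, and some repositories may be gated or mirrored elsewhere.

```python
# Minimal download sketch using huggingface_hub (pip install huggingface_hub).
# Repo IDs are the ones linked in this README; add the specific T2I-Adapter /
# ControlNet variants you actually need. If a repo is gated or has moved,
# swap in an accessible mirror.
from huggingface_hub import snapshot_download

for repo_id in [
    "Gaojunyao/StyleShot",                    # StyleShot weights
    "Gaojunyao/StyleShot_lineart",            # lineart-conditioned variant
    "runwayml/stable-diffusion-v1-5",         # base diffusion model
    "laion/CLIP-ViT-H-14-laion2B-s32B-b79K",  # CLIP image encoder
]:
    path = snapshot_download(repo_id=repo_id)
    print(f"{repo_id} -> {path}")
```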
## Inference

For inference, download the pretrained weights and prepare your own reference style image or content image.

```
# run text-driven style transfer demo
python styleshot_text_driven_demo.py --style "{style_image_path}" --prompt "{prompt}" --output "{save_path}"

# run image-driven style transfer demo
python styleshot_image_driven_demo.py --style "{style_image_path}" --content "{content_image_path}" --preprocessor "Contour" --prompt "{prompt}" --output "{save_path}"

# integrate styleshot with controlnet and t2i-adapter
python styleshot_t2i-adapter_demo.py --style "{style_image_path}" --condition "{condition_image_path}" --prompt "{prompt}" --output "{save_path}"
python styleshot_controlnet_demo.py --style "{style_image_path}" --condition "{condition_image_path}" --prompt "{prompt}" --output "{save_path}"
```

- [**styleshot_text_driven_demo**](styleshot_text_driven_demo.py): text-driven style transfer with reference style image and text prompt.
  *(Figure: text-driven style transfer visualization)*
- [**styleshot_image_driven_demo**](styleshot_image_driven_demo.py): image-driven style transfer with reference style image and content image.
  *(Figure: image-driven style transfer visualization)*
- [**styleshot_controlnet_demo**](styleshot_controlnet_demo.py), [**styleshot_t2i-adapter_demo**](styleshot_t2i-adapter_demo.py): integration with ControlNet and T2I-Adapter (a batch wrapper sketch follows below).
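To stylize many reference images at once, the demo scripts can be driven from a small wrapper. The sketch below loops over a folder of style images and calls the text-driven demo with the flags documented above; the directory layout and prompt are placeholders for your own data.

```python
# Hypothetical batch wrapper around styleshot_text_driven_demo.py.
# It only uses the CLI flags shown above (--style, --prompt, --output);
# the folder names and prompt are placeholders.
import subprocess
from pathlib import Path

style_dir = Path("assets/styles")   # folder of reference style images (assumed)
out_dir = Path("results")
out_dir.mkdir(exist_ok=True)
prompt = "a cat sitting on a windowsill"

for style_image in sorted(style_dir.glob("*.png")):
    output = out_dir / f"{style_image.stem}.png"
    subprocess.run(
        [
            "python", "styleshot_text_driven_demo.py",
            "--style", str(style_image),
            "--prompt", prompt,
            "--output", str(output),
        ],
        check=True,
    )
    print(f"wrote {output}")
```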
## Train
We employ a two-stage training strategy to train StyleShot for better integration of content and style. For training data, you can use our training dataset [StyleGallery](#style_gallery) or organize your own dataset into a json file (a sketch for building such a file follows the training commands below).

```
# training stage-1, only training the style component.
accelerate launch --num_processes 8 --multi_gpu --mixed_precision "fp16" \
tutorial_train_styleshot_stage_1.py \
--pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5/" \
--image_encoder_path="{image_encoder_path}" \
--image_json_file="{data.json}" \
--image_root_path="{image_path}" \
--mixed_precision="fp16" \
--resolution=512 \
--train_batch_size=16 \
--dataloader_num_workers=4 \
--learning_rate=1e-04 \
--weight_decay=0.01 \
--output_dir="{output_dir}" \
--save_steps=10000

# training stage-2, only training the content component.
accelerate launch --num_processes 8 --multi_gpu --mixed_precision "fp16" \
tutorial_train_styleshot_stage_2.py \
--pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5/" \
--pretrained_ip_adapter_path="./pretrained_weight/ip.bin" \
--pretrained_style_encoder_path="./pretrained_weight/style_aware_encoder.bin" \
--image_encoder_path="{image_encoder_path}" \
--image_json_file="{data.json}" \
--image_root_path="{image_path}" \
--mixed_precision="fp16" \
--resolution=512 \
--train_batch_size=16 \
--dataloader_num_workers=4 \
--learning_rate=1e-04 \
--weight_decay=0.01 \
--output_dir="{output_dir}" \
--save_steps=10000
```
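Regarding the json training file mentioned above, the sketch below assembles one from a folder of images with caption sidecar files. The key names `"image_file"` and `"text"` follow the IP-Adapter-style convention this codebase builds on, but they are an assumption here; verify the exact schema against the dataset class in the training scripts.

```python
# Hypothetical helper that assembles a training json file from a folder of
# images with matching .txt caption files. The keys "image_file" and "text"
# are an ASSUMPTION (IP-Adapter-style); check the dataset class in
# tutorial_train_styleshot_stage_1.py before training.
import json
from pathlib import Path

def build_data_json(image_root, out_file="data.json"):
    root = Path(image_root)
    records = []
    for img in sorted(root.rglob("*.jpg")):
        caption_file = img.with_suffix(".txt")
        if not caption_file.exists():
            continue
        records.append({
            "image_file": str(img.relative_to(root)),  # path relative to --image_root_path
            "text": caption_file.read_text().strip(),  # caption for this image
        })
    Path(out_file).write_text(json.dumps(records, indent=2))
    print(f"wrote {len(records)} records to {out_file}")

# build_data_json("{image_path}")  # pass the same root you give --image_root_path
```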
## StyleGallery

We have carefully curated a style-balanced dataset, called **StyleGallery**, with diverse image styles drawn from publicly available datasets for training StyleShot.
To prepare our dataset StyleGallery, please refer to the [tutorial](DATASET.md), or download the json file from [here](https://drive.google.com/drive/folders/10T3t58rQKDmYOLschUYj0tzm6zuOngMd?usp=drive_link).

## StyleBench
To address the lack of a benchmark for reference-based stylized generation, we establish a style evaluation benchmark containing 40 content images and 73 distinct styles across 490 reference images.

## Disclaimer
This project strives to positively impact the domain of AI-driven image generation. Users are granted the freedom to create images using this tool, but they are expected to comply with local laws and utilize it in a responsible manner. **The developers do not assume any responsibility for potential misuse by users.**
## Citation
If you find StyleShot useful for your research and applications, please cite using this BibTeX:
```bibtex
@article{gao2024styleshot,
title={Styleshot: A snapshot on any style},
author={Gao, Junyao and Liu, Yanchen and Sun, Yanan and Tang, Yinhao and Zeng, Yanhong and Chen, Kai and Zhao, Cairong},
journal={arXiv preprint arXiv:2407.01414},
year={2024}
}
```

## Acknowledgements
The code is built upon IP-Adapter.