Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mountchicken/structured_dreambooth_lora
Dreambooth (LoRA) with well-organized code structure. Naive adaptation from 🤗Diffusers.
diffusers dreambooth lora stable-diffusion
- Host: GitHub
- URL: https://github.com/mountchicken/structured_dreambooth_lora
- Owner: Mountchicken
- License: mit
- Created: 2023-05-15T08:39:24.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2023-05-18T09:28:09.000Z (over 1 year ago)
- Last Synced: 2024-10-06T11:20:52.247Z (about 1 month ago)
- Topics: diffusers, dreambooth, lora, stable-diffusion
- Language: Python
- Homepage:
- Size: 9.89 MB
- Stars: 12
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Structured DreamBooth with LoRA
## 1. Introduction
- This is a naive adaptation of [DreamBooth_LoRA by Hugging Face🤗](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora.py) with the following modifications:
  - Structured code: We re-structured the original code into separate modules, including `models`, `datasets`, `engines`, `tools`, and `utils`, to make it more readable, maintainable, and easy to extend to other tasks (a sketch of the implied layout is shown below).
  - Detailed comments: We added detailed comments to the code to make it easier to understand.
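A rough sketch of the layout implied by the module names above; the per-folder notes are guesses, and the actual repository tree may differ:
```text
structured_dreambooth_lora/
├── main.py      # training / inference entry point used in the commands below
├── models/      # model wrappers and LoRA injection (assumed role)
├── datasets/    # instance (and class) image datasets and prompts (assumed role)
├── engines/     # training and validation loops (assumed role)
├── tools/       # helper scripts (assumed role)
└── utils/       # logging, checkpointing, misc utilities (assumed role)
```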
![imgs](github/image1.png)

## 2. Installation
- Install dependencies
```bash
conda create -n dreambooth python=3.8
conda activate dreambooth
# install pytorch
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
# install diffusers from source
pip install git+https://github.com/huggingface/diffusers
pip install -r requirements.txt
```

## 3. Training
### 3.1. Train with default settings (Recommended)
- This trains the model with the default settings: 512x512 resolution, roughly 8 GB of GPU memory used, a batch size of 1 image, 1 gradient accumulation step, a learning rate of 2e-4, 150 training steps, and validation every 4 epochs. We find these settings are enough to generate high-quality images.
- `Step1`: Prepare your custom images and put them in a folder. Normally, 5 to 10 images are enough. We recommend manually cropping the images to the same size, e.g., 512x512, to avoid unwanted artifacts (see the sketch below).
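A minimal Pillow sketch of that kind of preprocessing; the helper and folder names are placeholders, not part of this repo:
```python
# Hypothetical preprocessing helper (not part of this repo): center-crop
# every image in a folder to a square and resize it to 512x512.
from pathlib import Path

from PIL import Image

def crop_and_resize(src_dir: str, dst_dir: str, size: int = 512) -> None:
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).iterdir()):
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        img = Image.open(path).convert("RGB")
        side = min(img.size)  # shortest edge defines the square crop
        left = (img.width - side) // 2
        top = (img.height - side) // 2
        img = img.crop((left, top, left + side, top + side))
        img.resize((size, size), Image.LANCZOS).save(dst / f"{path.stem}.png")

if __name__ == "__main__":
    crop_and_resize("raw_imgs", "imgs/dogs")  # placeholder paths
```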
- `Step2`: Initialize an Accelerate environment. [Accelerate](https://huggingface.co/docs/accelerate) is a PyTorch library that simplifies the process of launching multi-GPU training and evaluation jobs. It is developed by Hugging Face.
```bash
accelerate config
```
- `Step3`: Run the training script. Both checkpoints and samples will be saved in the `work_dirs` folder. **Normally, it only takes 1-2 minutes to fine-tune the model, with only 8GB of GPU memory occupied**. ***150 training steps are enough for an object; however, when training on a human face, we recommend training for 800 steps. The hyper-parameters of DreamBooth are quite sensitive; you can refer to the [original blog](https://huggingface.co/blog/dreambooth) for some insights.***
```bash
accelerate launch main.py \
--pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
--instance_data_dir="imgs/dogs" \
--instance_prompt="a photo of sks dog" \
--validation_prompt="a photo of sks dog is swimming" \
--with_prior_preservation \
--class_prompt="a photo of dog" \
--resolution=512 \
--train_batch_size=1 \
--gradient_accumulation_steps=1 \
--learning_rate=2e-4 \
--max_train_steps=150 \
--validation_epochs 4
```

### 3.2. Training with prior-preserving loss
Prior preservation is used to avoid overfitting and language-drift (check out the [paper](https://arxiv.org/abs/2208.12242) to learn more if you’re interested). For prior preservation, you use other images of the same class as part of the training process. The nice thing is that you can generate those images using the Stable Diffusion model itself! The training script will save the generated images to a local path you specify.
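For intuition, here is a minimal PyTorch sketch of how that combined loss is typically computed, following the upstream Diffusers script this repo adapts; the names are illustrative, not this repo's exact code:
```python
# PyTorch sketch of the prior-preservation loss used by the upstream
# Diffusers DreamBooth script; this repo's training engine may organize
# the same computation differently.
import torch
import torch.nn.functional as F

def dreambooth_loss(model_pred: torch.Tensor,
                    target: torch.Tensor,
                    prior_loss_weight: float = 1.0) -> torch.Tensor:
    # With prior preservation, each batch stacks instance images (your "sks dog")
    # on top of class images (generic dogs), so predictions and targets split in two.
    instance_pred, prior_pred = torch.chunk(model_pred, 2, dim=0)
    instance_target, prior_target = torch.chunk(target, 2, dim=0)

    instance_loss = F.mse_loss(instance_pred.float(), instance_target.float())
    prior_loss = F.mse_loss(prior_pred.float(), prior_target.float())

    # The class-image term keeps the model from forgetting what a generic
    # "dog" looks like while it learns the specific instance.
    return instance_loss + prior_loss_weight * prior_loss
```
To launch training with this setup: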
```bash
accelerate launch main.py \
--pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
--instance_data_dir="imgs/dogs" \
--instance_prompt="a photo of sks dog" \
--validation_prompt="a photo of sks dog is swimming" \
--with_prior_preservation \
--class_prompt="a photo of dog" \
--resolution=512 \
--train_batch_size=1 \
--gradient_accumulation_steps=1 \
--learning_rate=2e-4 \
--max_train_steps=150 \
--validation_epochs 10
```

### 3.3. Training with the text encoder (Not Recommended)
You can also fine-tune the text encoder (CLIP) with LoRA. However, we find this leads to results that do not converge, which is the opposite of the behavior reported in the [Original Implementation](https://huggingface.co/blog/dreambooth).
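For reference, a sketch of one common way to attach LoRA adapters to the CLIP text encoder, using the PEFT library; this is not necessarily how this repo wires it up:
```python
# Illustrative only: attach LoRA adapters to the CLIP text encoder with PEFT.
# This repo adapts the Diffusers training script and may inject LoRA differently.
from peft import LoraConfig, get_peft_model
from transformers import CLIPTextModel

text_encoder = CLIPTextModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="text_encoder"
)

lora_config = LoraConfig(
    r=4,             # low-rank dimension (assumed value)
    lora_alpha=4,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj"],  # attention projections
)
text_encoder = get_peft_model(text_encoder, lora_config)
text_encoder.print_trainable_parameters()  # only the LoRA matrices are trainable
```
The corresponding training command is: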
```bash
accelerate launch main.py \
--pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
--instance_data_dir="imgs/dogs" \
--instance_prompt="a photo of sks dog" \
--validation_prompt="a photo of sks dog is swimming" \
--with_prior_preservation \
--train_text_encoder \
--class_prompt="a photo of dog" \
--resolution=512 \
--train_batch_size=1 \
--gradient_accumulation_steps=1 \
--learning_rate=2e-4 \
--max_train_steps=150 \
--validation_epochs 4
```

## 4. Inference
After training, you can use the following commands to generate images from a prompt. We also provide a pretrained checkpoint for the dog example:
```bash
wget https://github.com/Mountchicken/Structured_Dreambooth_LoRA/releases/download/checkpoint_dog/checkpoint-200.zip
unzip -q checkpoint-200.zip
```
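Alternatively, you can load the LoRA weights directly with 🤗 Diffusers in Python. A minimal sketch follows; it assumes the unzipped folder stores LoRA weights in the standard Diffusers format, which may not match this repo's checkpoint layout exactly:
```python
# Sketch of loading the LoRA weights directly with Diffusers instead of the
# provided script; checkpoint format is assumed, not confirmed by this repo.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("checkpoint-200")  # folder produced by the unzip step above

image = pipe("A photo of sks dog is swimming", num_inference_steps=50).images[0]
image.save("sks_dog_swimming.png")
```
The script-based command shipped with this repo is: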
```bash
accelerate launch main.py \
--pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
--checkpoint_dir="checkpoint-200" \
--prompt="A photo of sks dog is swimming" \
--output_dir=$OUTPUT_DIR
```

## Acknowledgements
- [Training Stable Diffusion with Dreambooth using 🧨 Diffusers](https://huggingface.co/blog/dreambooth)
- [diffusers/examples/dreambooth/train_dreambooth_lora.py](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora.py)