https://github.com/saladtechnologies/dreambooth

A docker container for training dreambooth LoRAs, with automatic checkpointing and resuming to s3-compatible storage

# Dreambooth LoRA Training

## Train on Salad

This repo includes a script, `train_on_salad.py`, which you can customize to run SDXL Dreambooth LoRA training jobs on Salad.
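A minimal session might look like the following sketch; the script's exact interface and required configuration live in the file itself, so consult `train_on_salad.py` before running it:

```shell
# Sketch only: clone the repo and launch the helper script.
# Check train_on_salad.py for the Salad credentials and settings it expects.
git clone https://github.com/saladtechnologies/dreambooth
cd dreambooth
python train_on_salad.py
```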

## Environment Variables

| Variable Name | Default Value | Description |
| ---------------------------- | ---------------------------------------- | ----------------------------------------- |
| LOG_LEVEL | INFO | Log level configuration |
| MODEL_NAME | stabilityai/stable-diffusion-xl-base-1.0 | Huggingface Hub Model Name or Path |
| INSTANCE_DIR | /images | Directory where training data is stored |
| OUTPUT_DIR | /output | Directory where training output is stored |
| VAE_PATH | madebyollin/sdxl-vae-fp16-fix | VAE model name or path |
| PROMPT | photo of timberdog | Prompt for training |
| DREAMBOOTH_SCRIPT | train_dreambooth_lora_sdxl.py | Dreambooth training script path |
| RESOLUTION | 1024 | Resolution of the images |
| MAX_TRAIN_STEPS | 500 | Total number of training steps |
| CHECKPOINTING_STEPS | 50 | Save a checkpoint after every N steps |
| LEARNING_RATE | 1e-4 | Learning rate |
| GRADIENT_ACCUMULATION_STEPS | 4 | Gradient accumulation steps |
| LR_WARMUP_STEPS | 0 | LR warmup steps |
| MIXED_PRECISION | fp16 | Mixed precision training |
| TRAIN_BATCH_SIZE | 1 | Train batch size |
| LR_SCHEDULER | constant | Learning rate scheduler |
| USE_8BIT_ADAM | None | Use the 8-bit Adam optimizer |
| TRAIN_TEXT_ENCODER | None | Also train the text encoder |
| GRADIENT_CHECKPOINTING | None | Enable gradient checkpointing to reduce VRAM use |
| WITH_PRIOR_PRESERVATION | None | Enable prior-preservation loss |
| PRIOR_LOSS_WEIGHT | 1.0 | Prior loss weight |
| CHECKPOINT_BUCKET_NAME | None | S3 bucket name for storing checkpoints |
| CHECKPOINT_BUCKET_PREFIX | None | Prefix for storing checkpoints in S3 |
| DATA_BUCKET_NAME | None | S3 bucket name for storing training data |
| DATA_BUCKET_PREFIX | None | Prefix for storing training data in S3 |
| WEBHOOK_URL | None | Webhook URL |
| PROGRESS_WEBHOOK_URL | None | Webhook URL for progress |
| COMPLETE_WEBHOOK_URL | None | Webhook URL for completion |
| WEBHOOK_AUTH_HEADER | None | Authentication header for webhook |
| PROGRESS_WEBHOOK_AUTH_HEADER | None | Auth header for progress webhook |
| COMPLETE_WEBHOOK_AUTH_HEADER | None | Auth header for completion webhook |
| WEBHOOK_AUTH_VALUE | None | Authentication value for webhook |
| PROGRESS_WEBHOOK_AUTH_VALUE | None | Auth value for progress webhook |
| COMPLETE_WEBHOOK_AUTH_VALUE | None | Auth value for completion webhook |
| SALAD_MACHINE_ID | None | Salad Machine ID |
| SALAD_CONTAINER_GROUP_ID | None | Container Group ID for Salad |
| SALAD_CONTAINER_GROUP_NAME | None | Container Group name for Salad |
| SALAD_ORGANIZATION_NAME | None | Organization name for Salad |
| SALAD_PROJECT_NAME | None | Project name for Salad |

Additionally, if you use s3-compatible storage for checkpointing, you must also provide the standard AWS configuration environment variables.
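Putting the table together with the AWS variables, a `docker run` invocation might look like the sketch below. The image tag, bucket names, prefixes, and credentials are all placeholders; the `AWS_*` variables follow the standard AWS SDK conventions, with `AWS_ENDPOINT_URL` pointing at your s3-compatible provider:

```shell
# Hypothetical invocation; image name and all values are placeholders.
# The AWS_* variables are the standard SDK credential/config variables,
# which most s3-compatible providers also honor.
docker run --gpus all \
  -e MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0" \
  -e PROMPT="photo of timberdog" \
  -e MAX_TRAIN_STEPS=500 \
  -e CHECKPOINTING_STEPS=50 \
  -e CHECKPOINT_BUCKET_NAME="my-checkpoint-bucket" \
  -e CHECKPOINT_BUCKET_PREFIX="runs/timberdog/" \
  -e DATA_BUCKET_NAME="my-data-bucket" \
  -e DATA_BUCKET_PREFIX="datasets/timberdog/" \
  -e AWS_ACCESS_KEY_ID="..." \
  -e AWS_SECRET_ACCESS_KEY="..." \
  -e AWS_REGION="us-east-1" \
  -e AWS_ENDPOINT_URL="https://s3.example.com" \
  saladtechnologies/dreambooth:latest
```

Unset boolean-style variables (e.g. `USE_8BIT_ADAM`) keep their `None` defaults; set them only if you want the corresponding flag enabled.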