Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/saladtechnologies/dreambooth
A docker container for training dreambooth LoRAs, with automatic checkpointing and resuming to s3-compatible storage
- Host: GitHub
- URL: https://github.com/saladtechnologies/dreambooth
- Owner: SaladTechnologies
- License: MIT
- Created: 2024-02-09T13:57:57.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-05-02T16:21:52.000Z (7 months ago)
- Last Synced: 2024-05-03T02:44:54.719Z (7 months ago)
- Language: Python
- Size: 57.6 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: LICENSE
README
# Dreambooth LoRA Training
## Train on Salad
This repo includes a script, `train_on_salad.py`, which you can customize to run SDXL Dreambooth LoRA training jobs on Salad.
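As a starting point for that customization, here is a minimal sketch of the kind of per-job overrides you might collect before submitting a job. The variable names come from the Environment Variables table in the next section; the values are placeholders, and the exact way `train_on_salad.py` attaches these to a Salad container group is an assumption rather than something documented here.

```python
# Hypothetical per-job overrides for train_on_salad.py. Variable names are from
# the Environment Variables table below; all values are placeholders, and how
# the script consumes this dict is an assumption about this repo.
job_env = {
    "PROMPT": "photo of my subject",                  # placeholder subject prompt
    "MODEL_NAME": "stabilityai/stable-diffusion-xl-base-1.0",
    "MAX_TRAIN_STEPS": "500",
    "CHECKPOINTING_STEPS": "50",
    "LEARNING_RATE": "1e-4",
    # Optional webhook so you are notified when training finishes
    "COMPLETE_WEBHOOK_URL": "https://example.com/hooks/complete",  # placeholder
    "COMPLETE_WEBHOOK_AUTH_HEADER": "Authorization",               # placeholder
    "COMPLETE_WEBHOOK_AUTH_VALUE": "Bearer YOUR_TOKEN",            # placeholder
}
```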
## Environment Variables
| Variable Name | Default Value | Description |
| ---------------------------- | ---------------------------------------- | ----------------------------------------- |
| LOG_LEVEL | INFO | Log level configuration |
| MODEL_NAME | stabilityai/stable-diffusion-xl-base-1.0 | Huggingface Hub Model Name or Path |
| INSTANCE_DIR | /images | Directory where training data is stored |
| OUTPUT_DIR | /output | Directory where training output is stored |
| VAE_PATH | madebyollin/sdxl-vae-fp16-fix | VAE model name or path |
| PROMPT | photo of timberdog | Prompt for training |
| DREAMBOOTH_SCRIPT | train_dreambooth_lora_sdxl.py | Dreambooth training script path |
| RESOLUTION | 1024 | Resolution of the images |
| MAX_TRAIN_STEPS | 500 | Total number of training steps |
| CHECKPOINTING_STEPS | 50 | Save a checkpoint after every N steps |
| LEARNING_RATE | 1e-4 | Learning rate |
| GRADIENT_ACCUMULATION_STEPS | 4 | Gradient accumulation steps |
| LR_WARMUP_STEPS | 0 | LR warmup steps |
| MIXED_PRECISION | fp16 | Mixed precision training |
| TRAIN_BATCH_SIZE | 1 | Train batch size |
| LR_SCHEDULER | constant | Learning rate scheduler |
| USE_8BIT_ADAM                | None                                     | Use the 8-bit Adam optimizer               |
| TRAIN_TEXT_ENCODER | None | Train text encoder |
| GRADIENT_CHECKPOINTING | None | Gradient checkpointing |
| WITH_PRIOR_PRESERVATION      | None                                     | Enable prior preservation loss             |
| PRIOR_LOSS_WEIGHT | 1.0 | Prior loss weight |
| CHECKPOINT_BUCKET_NAME | None | S3 bucket name for storing checkpoints |
| CHECKPOINT_BUCKET_PREFIX | None | Prefix for storing checkpoints in S3 |
| DATA_BUCKET_NAME | None | S3 bucket name for storing training data |
| DATA_BUCKET_PREFIX | None | Prefix for storing training data in S3 |
| WEBHOOK_URL | None | Webhook URL |
| PROGRESS_WEBHOOK_URL | None | Webhook URL for progress |
| COMPLETE_WEBHOOK_URL | None | Webhook URL for completion |
| WEBHOOK_AUTH_HEADER | None | Authentication header for webhook |
| PROGRESS_WEBHOOK_AUTH_HEADER | None | Auth header for progress webhook |
| COMPLETE_WEBHOOK_AUTH_HEADER | None | Auth header for completion webhook |
| WEBHOOK_AUTH_VALUE | None | Authentication value for webhook |
| PROGRESS_WEBHOOK_AUTH_VALUE | None | Auth value for progress webhook |
| COMPLETE_WEBHOOK_AUTH_VALUE | None | Auth value for completion webhook |
| SALAD_MACHINE_ID | None | Salad Machine ID |
| SALAD_CONTAINER_GROUP_ID | None | Container Group ID for Salad |
| SALAD_CONTAINER_GROUP_NAME | None | Container Group name for Salad |
| SALAD_ORGANIZATION_NAME | None | Organization name for Salad |
| SALAD_PROJECT_NAME           | None                                     | Project name for Salad                     |

Additionally, if you are using S3-compatible storage for checkpointing, you will also need to provide AWS configuration environment variables.
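For local testing outside of Salad, the same variables can be passed directly to the container. The sketch below uses the Docker SDK for Python; the image tag, bucket names, prefixes, and endpoint URL are placeholders (a published image name is not documented here), and `AWS_ENDPOINT_URL` is an assumption about how an S3-compatible endpoint would be supplied. `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_DEFAULT_REGION` are the standard AWS SDK credential variables.

```python
# Minimal local-run sketch, not taken from this repo: image tag, bucket names,
# and credentials are placeholders; AWS_ENDPOINT_URL is an assumption about how
# the container is pointed at a non-AWS S3-compatible provider.
import docker

client = docker.from_env()

container = client.containers.run(
    "saladtechnologies/dreambooth:latest",  # placeholder image tag
    detach=True,
    environment={
        # Training configuration (see the table above)
        "PROMPT": "photo of my subject",
        "MAX_TRAIN_STEPS": "500",
        "CHECKPOINTING_STEPS": "50",
        # Checkpoint and training-data locations in S3-compatible storage
        "CHECKPOINT_BUCKET_NAME": "my-checkpoint-bucket",
        "CHECKPOINT_BUCKET_PREFIX": "dreambooth/run-001/",
        "DATA_BUCKET_NAME": "my-training-data",
        "DATA_BUCKET_PREFIX": "dreambooth/images/",
        # Standard AWS SDK credentials; endpoint URL assumed for S3-compatible providers
        "AWS_ACCESS_KEY_ID": "YOUR_ACCESS_KEY",
        "AWS_SECRET_ACCESS_KEY": "YOUR_SECRET_KEY",
        "AWS_DEFAULT_REGION": "us-east-1",
        "AWS_ENDPOINT_URL": "https://s3.example.com",
    },
    # Expose all host GPUs to the container
    device_requests=[docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])],
)
print(container.id)
```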