https://github.com/cai991108/sd3m-simpletuner-lora
This repository contains the code and resources for the project "Fine-Tuning Stable Diffusion 3 Medium with SimpleTuner", which focuses on enhancing generative art by fine-tuning the Stable Diffusion 3 Medium (SD3M) model using the SimpleTuner toolkit.
- Host: GitHub
- URL: https://github.com/cai991108/sd3m-simpletuner-lora
- Owner: CAI991108
- License: mit
- Created: 2025-01-03T13:57:27.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-01-04T05:57:06.000Z (4 months ago)
- Last Synced: 2025-02-23T02:44:49.440Z (about 2 months ago)
- Topics: fine-tuning, generative-art, lora, sd3, stability-ai, stable-diffusion, stable-diffusion-3, text2image
- Language: Python
- Homepage: https://huggingface.co/jimchoi/simpletuner-lora
- Size: 17.2 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Fine-Tuning Stable Diffusion 3 Medium with SimpleTuner
## Project Overview
This project explores the potential of fine-tuning the **Stable Diffusion 3 Medium (SD3M)** model using the **SimpleTuner** toolkit to generate higher quality and more diverse 2D concept art. The research focuses on understanding the technical differences between fine-tuning SD1.5/SDXL and SD3M, and leverages these insights to develop a robust fine-tuning methodology that can be applied across various artistic domains.
## Key Features
- **Fine-Tuning SD3M**: We fine-tune the SD3M model using the SimpleTuner toolkit, which simplifies the process of customizing and fine-tuning pre-trained models.
- **Low-Rank Adaptation (LoRA)**: We employ LoRA, a parameter-efficient fine-tuning technique, to adapt the model with limited computational resources (illustrated by the sketch after this list).
- **Evaluation of Generative Art**: We evaluate the fine-tuned model's ability to generate high-quality and diverse concept art, comparing it with the baseline SD3M model.
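The LoRA mechanism behind this setup can be illustrated in a few lines of PyTorch. The sketch below is generic and is not SimpleTuner's implementation; the `LoRALinear` class name and the hyperparameters are illustrative assumptions.
```python
# Generic LoRA sketch (not SimpleTuner's code): freeze the pretrained weight W
# and learn a low-rank update B @ A, so only rank * (in + out) parameters train.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W x plus the scaled low-rank path, which is the only trainable part
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(1024, 1024), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 16384 trainable parameters vs. ~1.05M in the frozen base layer
```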
## Installation
### Prerequisites
- Python 3.10 or 3.11
- Poetry (for dependency management)
### Setup
The [SimpleTuner SD3 Quickstart Guide](https://github.com/bghira/SimpleTuner/blob/main/documentation/quickstart/SD3.md)
provides a detailed walkthrough of setting up the environment and fine-tuning the SD3M model with SimpleTuner. Additionally, you might find the following instructions helpful:
1. Built-in scripts install the dependencies and set up the environment (see `poetry.lock` and `pyproject.toml`). A `tool.poetry.source` entry such as the following will accelerate installation in most cases, **or simply use the `pyproject.toml` provided in this repository**:
```toml
[[tool.poetry.source]]
name = "mirrors"
url = "https://to/your/mirror"
priority = "primary"
```
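With the source block in place, the install flow is just the standard Poetry workflow. This is a minimal sketch assuming Poetry is not yet installed; adapt it to your environment.
```bash
pip install poetry   # skip if Poetry is already available
poetry install       # resolves and installs the dependencies pinned in poetry.lock
```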
2. Pick endpoints that suit your network connection. The suggestion is to `export` the following environment variables:
```bash
export HF_ENDPOINT=https://huggingface.co   # or a Hugging Face mirror endpoint
export WANDB_API_KEY=your_wandb_api_key_here
```
3. The experimental script `configure.py` provides an interactive, step-by-step configuration. However, you should still manually edit `multibackend.json`, in particular the following fields:
```json
"cache_dir_vae": "cache/vae/sd3/xxx",
"instance_data_dir": "path/to/your/data",
"cache_dir": "/cache/text/sd3/xxx",
```
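For context, a complete `multibackend.json` typically pairs an image-data backend with a text-embedding cache backend. The example below is an illustrative sketch following the field names in the SimpleTuner SD3 quickstart; the dataset id `sargent-portraits` and the path values are placeholders, and the exact field set may differ between SimpleTuner versions.
```json
[
  {
    "id": "sargent-portraits",
    "type": "local",
    "instance_data_dir": "path/to/your/data",
    "cache_dir_vae": "cache/vae/sd3/sargent-portraits",
    "caption_strategy": "textfile",
    "resolution": 1024,
    "resolution_type": "pixel"
  },
  {
    "id": "text-embeds",
    "type": "local",
    "dataset_type": "text_embeds",
    "default": true,
    "cache_dir": "cache/text/sd3"
  }
]
```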
## Example Usage of Generative Concept Art
### John Singer Sargent Style Portrait
An example fine-tuned model mimicking the art style of John Singer Sargent. The model was trained on a dataset of John Singer Sargent's portraits and can generate images in a similar style.
Please refer to the Hugging Face model card [jimchoi/simpletuner-lora](https://huggingface.co/jimchoi/simpletuner-lora) for implementation details.
1. **Run the inference script** (a library-level alternative is sketched after this list):
```bash
python image_finetune.py --prompt "your_prompt_here"
```
2. **Adjust the prompt**:
- For better results, adjust the prompt content to avoid overfitting or generating overly similar images.
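If you prefer to bypass `image_finetune.py`, the LoRA can also be loaded directly with `diffusers`. This is a hedged sketch: the weight file layout inside jimchoi/simpletuner-lora is assumed to follow the default `diffusers` LoRA naming, and the prompt and sampler settings are only examples.
```python
# Sketch: load the SD3M base model and apply the published LoRA with diffusers.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
)
pipe.load_lora_weights("jimchoi/simpletuner-lora")  # assumes default LoRA weight naming
pipe.to("cuda")

image = pipe(
    prompt="a portrait of a woman in the style of John Singer Sargent",
    num_inference_steps=28,
    guidance_scale=5.0,
).images[0]
image.save("sargent_portrait.png")
```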
## Research Limitations
- **Dataset Size and Diversity**: The dataset used for fine-tuning is relatively small, which may limit the model's generalizability.
- **Computational Resources**: Limited GPU resources constrained the batch size and training speed.
- **Evaluation Techniques**: The evaluation is primarily based on visual inspection, and more rigorous quantitative metrics could be applied.
## Future Work
- **Expand Dataset**: Use a larger and more diverse dataset to improve the model's generalization capabilities.
- **Multi-GPU Training**: Utilize multi-GPU environments to speed up the training process and handle larger datasets.
- **Quantitative Evaluation**: Implement more rigorous quantitative evaluation metrics for a comprehensive assessment of the model's performance; one possible metric is sketched below.
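As one concrete option for the quantitative-evaluation item above, prompt/image alignment could be scored with CLIP via the `transformers` library. This is a hedged sketch and not part of the current pipeline; the checkpoint name and file paths are illustrative.
```python
# Sketch: CLIP-based prompt/image alignment score (not part of this repository).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_score(image_path: str, prompt: str) -> float:
    """Cosine similarity between CLIP image and text embeddings, scaled to [0, 100]."""
    inputs = processor(text=[prompt], images=Image.open(image_path),
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return float((img * txt).sum(dim=-1).clamp(min=0).item() * 100)

print(clip_score("sargent_portrait.png", "a portrait in the style of John Singer Sargent"))
```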
## License
The project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.
## Acknowledgments
- **Stability AI** for open-sourcing the SD3M model. The pre-trained SD3M model is available on Hugging Face: [stabilityai/stable-diffusion-3-medium-diffusers](https://huggingface.co/stabilityai/stable-diffusion-3-medium-diffusers).
- **SimpleTuner** for providing an efficient fine-tuning toolkit. For more details on using SimpleTuner with SD3M, refer to the [SimpleTuner SD3 Quickstart Guide](https://github.com/bghira/SimpleTuner/blob/main/documentation/quickstart/SD3.md).
---
For more details, please refer to the [report](report.pdf) and the Hugging Face model card [jimchoi/simpletuner-lora](https://huggingface.co/jimchoi/simpletuner-lora).