https://github.com/deep-diver/lora-deployment
LoRA fine-tuned Stable Diffusion Deployment
generative-ai huggingface-inference-endpoint serving stable-diffusion
- Host: GitHub
- URL: https://github.com/deep-diver/lora-deployment
- Owner: deep-diver
- License: apache-2.0
- Created: 2023-02-14T04:25:22.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-02-15T23:38:33.000Z (over 2 years ago)
- Last Synced: 2025-03-31T00:51:14.811Z (6 months ago)
- Topics: generative-ai, huggingface-inference-endpoint, serving, stable-diffusion
- Language: Jupyter Notebook
- Homepage: https://deep-diver.github.io/LoRA-deployment/
- Size: 7.51 MB
- Stars: 31
- Watchers: 2
- Forks: 6
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# LoRA-deployment
This repository demonstrates how to serve multiple [LoRA fine-tuned Stable Diffusion](https://huggingface.co/blog/lora) models from the 🧨 Diffusers library on a Hugging Face Inference Endpoint. Because LoRA fine-tuning produces a checkpoint of only a few MB, we can switch between the checkpoints of different fine-tuned Stable Diffusion models quickly, with little extra memory or disk space.
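The switching idea can be sketched as follows. This is a minimal illustration, not the code of the handlers in this repository: `LoRASwitcher` and `build_pipeline` are hypothetical names, the base model is left as a parameter because it depends on which model each LoRA checkpoint was fine-tuned from, and the default loader assumes a Diffusers version whose UNet exposes `load_attn_procs()`.

```python
from typing import Callable, Optional


class LoRASwitcher:
    """Keeps one base pipeline in memory and swaps only the small LoRA weights."""

    def __init__(self, pipe, load_lora: Optional[Callable] = None):
        self.pipe = pipe
        # Assumption: the pipeline's UNet exposes load_attn_procs(), as in the
        # Diffusers LoRA blog post. Pass another loader if your version differs.
        self._load_lora = load_lora or (
            lambda p, repo_id: p.unet.load_attn_procs(repo_id)
        )
        self.current: Optional[str] = None

    def use(self, repo_id: str):
        # Reload only when a different checkpoint is requested.
        if repo_id != self.current:
            self._load_lora(self.pipe, repo_id)
            self.current = repo_id
        return self.pipe


def build_pipeline(base_model: str, device: str = "cuda"):
    # Heavy dependencies are imported lazily so the switcher itself can be
    # used and tested without a GPU or Diffusers installed.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        base_model, torch_dtype=torch.float16
    )
    return pipe.to(device)
```

With a pipeline built once, something like `LoRASwitcher(pipe).use("pcuenq/pokemon-lora")` would load only the LoRA weights for that repository, and a second call with the same repository ID would reuse what is already loaded.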
For demonstration purposes, I have tested the following Hugging Face Model repositories, each of which contains a LoRA fine-tuned checkpoint (`pytorch_lora_weights.bin`):
- [ethan_ai](https://huggingface.co/taesiri/ethan_ai_lora)
- [noto-emoji](https://huggingface.co/kuotient/noto-emoji-finetuned-lora)
- [pokemon](https://huggingface.co/pcuenq/pokemon-lora)
## Notebook
- [Pilot notebook](https://github.com/deep-diver/LoRA-deployment/blob/main/notebooks/pilot.ipynb): shows how to write and test a custom handler for Hugging Face Inference Endpoint in local or Colab environments
- [Inference notebook](https://github.com/deep-diver/LoRA-deployment/blob/main/notebooks/inference.ipynb): shows how to send inference requests to the custom handler deployed on Hugging Face Inference Endpoint
- [Multi-workers inference notebook](https://github.com/deep-diver/LoRA-deployment/blob/main/notebooks/multiworker_inference.ipynb): shows how to send simultaneous requests to the custom handler deployed on Hugging Face Inference Endpoint from a Colab environment
## Custom Handler
- [handler.py](https://github.com/deep-diver/LoRA-deployment/blob/main/custom_handler/handler.py): basic handler. This custom handler has been verified to work with [this Hugging Face Model repo](https://huggingface.co/chansung/LoRA-deployment)
- [multiworker_handler.py](https://github.com/deep-diver/LoRA-deployment/blob/main/custom_handler/multiworker_handler.py): advanced handler with a pool of multiple workers (Stable Diffusion). This custom handler has been verified to work with [this Hugging Face Model repo](https://huggingface.co/chansung/LoRA-deployment-multiworkers)
## Script
- [inference.py](https://github.com/deep-diver/LoRA-deployment/blob/main/scripts/inference.py): standalone Python script to send requests to the custom handler deployed on Hugging Face Inference Endpoint
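
As a rough idea of what such a request could look like, here is a minimal sketch using only the Python standard library. It is not a copy of `inference.py`: `build_request` and `send` are hypothetical helpers, and the `"lora"` payload key is an assumption about how a handler might receive the LoRA repository ID, so check the handler you deploy for the fields it actually expects. The bearer token header and the `"inputs"` key follow the usual Inference Endpoint request convention.

```python
import json
import urllib.request


def build_request(
    endpoint_url: str, token: str, prompt: str, lora_repo: str
) -> urllib.request.Request:
    # "inputs" is the conventional payload key for Inference Endpoints.
    # "lora" is a hypothetical key naming the LoRA checkpoint repository.
    payload = {"inputs": prompt, "lora": lora_repo}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> bytes:
    # Returns the raw response body; how to decode it (for example JSON or
    # image bytes) depends on what the deployed handler returns.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Separating request construction from sending keeps the payload easy to inspect before anything is sent to a billed endpoint.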
## Reference
- https://huggingface.co/blog/lora