https://github.com/redaigc/target-driven-distillation
Consistency Distillation with Target Timestep Selection and Decoupled Guidance
- Host: GitHub
- URL: https://github.com/redaigc/target-driven-distillation
- Owner: RedAIGC
- Created: 2024-08-21T09:27:03.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-01-04T06:57:05.000Z (over 1 year ago)
- Last Synced: 2025-04-07T11:04:50.473Z (about 1 year ago)
- Topics: consistency-models, diffusion-models, flux, lcm-lora, lora, sdxl-lightning, stable-diffusion, stable-video-diffusion
- Language: Python
- Homepage: https://redaigc.github.io/TDD/
- Size: 13.9 MB
- Stars: 76
- Watchers: 3
- Forks: 11
- Open Issues: 12
Metadata Files:
- Readme: README.md
README
# ✨Target-Driven Distillation✨
[Paper (arXiv:2409.01347)](https://arxiv.org/abs/2409.01347)
[Project page](https://redaigc.github.io/TDD)
[Models on Hugging Face](https://huggingface.co/RED-AIGC/TDD)
[FLUX-TDD-BETA demo](https://huggingface.co/spaces/RED-AIGC/FLUX-TDD-BETA)
[TDD-SDXL demo](https://huggingface.co/spaces/RED-AIGC/TDD)
[TDD-SVD demo](https://huggingface.co/spaces/RED-AIGC/SVD-TDD)
Target-Driven Distillation (TDD) is a state-of-the-art consistency distillation method that substantially accelerates the inference of diffusion models. Using two strategies, *target timestep selection* and *decoupled guidance*, models distilled by TDD can generate highly detailed images in only a few steps.
Samples generated by TDD-distilled SDXL, with only 4--8 steps.
## News!
- **Jan. 4, 2025**: We have updated the code with the adv config for training on [FLUX](https://github.com/RedAIGC/Target-Driven-Distillation/tree/main/train/FLUX).
- **Jan. 1, 2025**: A demo of FLUX-TDD (trained with adv) is now available on [Hugging Face](https://huggingface.co/RED-AIGC/TDD).
- **Dec. 10, 2024**: Our paper has been accepted by **AAAI25**!
- **Sept. 21, 2024**: A demo of FLUX-TDD-BETA (4--8 steps) is now available on [Hugging Face](https://huggingface.co/spaces/RED-AIGC/FLUX-TDD-BETA).
- **Sept. 20, 2024**: We have released the code for training on [FLUX](https://github.com/RedAIGC/Target-Driven-Distillation/tree/main/train/FLUX). Our 4--8-step FLUX.1-dev LoRAs are coming soon!
- **Sept. 12, 2024**: Demos of [TDD-SDXL](https://huggingface.co/spaces/RED-AIGC/TDD) and [TDD-SVD](https://huggingface.co/spaces/RED-AIGC/SVD-TDD) are now available on Hugging Face. Give them a try!
- **Sept. 4, 2024**: Our detailed research paper is now on [arXiv](https://arxiv.org/abs/2409.01347).
- **Aug. 29, 2024**: We have released the code for training and inference, as well as pretrained SDXL models both with and without adv.
- **Aug. 22, 2024**: Project launched.
## Demos
### Comparison with previous works (LCM, PCM, TCD). From the same seeds, our method (TDD) demonstrates advantages in both image complexity and clarity.
### Video samples generated by AnimateLCM-distilled (top) and TDD-distilled (bottom) SVD-xt 1.1, also with 4--8 steps.
https://github.com/user-attachments/assets/09fcfc83-fbb8-45da-8ecf-18fa11a6bf82
### Samples generated by different TDD-distilled base models, and by SDXL with different LoRA adapters or ControlNets.
## Usage
### Inference
- Clone this repository.
```shell
git clone https://github.com/RedAIGC/Target-Driven-Distillation.git
cd Target-Driven-Distillation
```
- FLUX: Download the pretrained models and generate images with the script below, or get the weights from [Hugging Face](https://huggingface.co/RED-AIGC/TDD).
```python
import torch
from huggingface_hub import hf_hub_download
from diffusers import FluxPipeline

# Load the FLUX.1-dev base model, then fuse the TDD LoRA into it.
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.load_lora_weights(hf_hub_download("RED-AIGC/TDD", "FLUX.1-dev_tdd_lora_weights.safetensors"))
pipe.fuse_lora(lora_scale=0.125)
pipe.to("cuda")

prompt = "A photo of a cat made of water."  # example prompt; replace with your own
image_flux = pipe(
    prompt=[prompt],
    generator=torch.Generator().manual_seed(3413),
    num_inference_steps=8,
    guidance_scale=2.0,
    height=1024,
    width=1024,
    max_sequence_length=256,
).images[0]
```
- SDXL: Download the pretrained models with the script below or from [Hugging Face](https://huggingface.co/RED-AIGC/TDD).
```python
from huggingface_hub import hf_hub_download
# Repo ID matches the Hugging Face model page linked above.
hf_hub_download(repo_id="RED-AIGC/TDD", filename="sdxl_tdd_lora_weights.safetensors", local_dir="./tdd_lora")
```
- Generate images.
```python
# !pip install opencv-python transformers accelerate
import torch
from diffusers import StableDiffusionXLPipeline
from tdd_scheduler import TDDScheduler  # run from the cloned repository root

device = "cuda"
tdd_lora_path = "tdd_lora/sdxl_tdd_lora_weights.safetensors"

# Load SDXL, swap in the TDD scheduler, then fuse the TDD LoRA.
pipe = StableDiffusionXLPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16").to(device)
pipe.scheduler = TDDScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights(tdd_lora_path, adapter_name="accelerate")
pipe.fuse_lora()

prompt = "A photo of a cat made of water."
image = pipe(
    prompt=prompt,
    num_inference_steps=4,
    guidance_scale=1.7,
    eta=0.2,
    generator=torch.Generator(device=device).manual_seed(546237),
).images[0]
image.save("tdd.png")
```
### Training
See scripts under [train](https://github.com/RedAIGC/Target-Driven-Distillation/tree/main/train).
## Introduction
Target-Driven Distillation (TDD) features three key designs that differ from previous consistency distillation methods.
1. **TDD adopts a careful selection strategy for target timesteps, which increases training efficiency.** Specifically, it first chooses from a predefined set of equidistant denoising schedules (*e.g.* 4--8 steps), then adds a stochastic offset to accommodate non-deterministic sampling (*e.g.* $\gamma$-sampling).
2. **TDD uses decoupled guidance during training, so the guidance scale can still be tuned at inference time.** Specifically, it replaces a portion of the text conditions with unconditional (*i.e.* empty) prompts, in order to align with the standard training process using CFG.
3. **TDD can optionally be equipped with non-equidistant sampling and x0 clipping, enabling more flexible and accurate image sampling.**
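As a rough illustration of the first design, the sketch below picks a target timestep for a given training timestep. It is not the repository's implementation: the function name, the 1000-step training range, the uniform choice among schedules, and the `eta` offset ratio are all assumptions made for this example.

```python
import random

def select_target_timestep(t_n, num_train_timesteps=1000, k_min=4, k_max=8,
                           eta=0.3, rng=random):
    """Illustrative target timestep selection (hypothetical helper, not repo code).

    t_n must satisfy 1 <= t_n < num_train_timesteps.
    """
    # 1) Pick one of the predefined equidistant schedules (K denoising steps).
    k = rng.randint(k_min, k_max)
    # Boundaries of that schedule: 0, T/K, 2T/K, ..., (K-1)T/K.
    boundaries = [i * num_train_timesteps // k for i in range(k)]
    # 2) Take the nearest schedule boundary strictly below the current timestep.
    tau = max(b for b in boundaries if b < t_n)
    # 3) Stochastic offset (assumed form): shrink the target toward 0
    #    by a random fraction of at most eta.
    offset = rng.uniform(0.0, eta)
    return int(round((1.0 - offset) * tau))

# Example: sample a few targets for the training timestep 600.
rng = random.Random(0)
targets = [select_target_timestep(600, rng=rng) for _ in range(5)]
```

With `eta=0.0` the target falls exactly on a schedule boundary; larger values spread targets below it, which is the intuition behind accommodating non-deterministic samplers.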
An overview of TDD. (a) The training process features target timestep selection and decoupled guidance. (b) The inference process can optionally adopt non-equidistant denoising schedules.
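The decoupled guidance described above replaces part of the text conditions with empty prompts during training. A minimal sketch of that replacement step follows; the helper name and the drop probability of 0.2 are assumptions chosen for illustration, not values taken from the repository.

```python
import random

def drop_text_conditions(prompts, drop_prob=0.2, rng=random):
    """Replace each prompt with the empty (unconditional) prompt with
    probability drop_prob. Hypothetical helper; drop_prob is an assumed value."""
    return ["" if rng.random() < drop_prob else p for p in prompts]

# Example: apply the dropout to a small batch of training captions.
batch = ["a cat made of water", "a city at night", "a bowl of fruit"]
conditioned = drop_text_conditions(batch, rng=random.Random(0))
```

Because the student sees both conditional and unconditional prompts, classifier-free guidance remains meaningful at inference, which is why `guidance_scale` can be adjusted in the usage examples.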
For further details of TDD, please refer to our paper: [arXiv:2409.01347](https://arxiv.org/abs/2409.01347).
## Acknowledgements
- Thanks to [sdbds](https://github.com/sdbds) for helping us with FLUX training, which allowed us to distill FLUX with a larger batch size.
- Thanks to [PSNbst](https://huggingface.co/PSNbst/PAseer-TDD-Accelerator) for providing a compressed version of TDD that is less than 20 MB. Truly impressive.
- Thanks to the [PCM](https://github.com/G-U-N/Phased-Consistency-Model) team for their ADV_loss support!
- Thanks to the [HuggingFace](https://github.com/huggingface) gradio team for their free GPU support!
## Contact, Collaboration, and Citation
If you have any questions about the code, please do not hesitate to contact me!
Email: polu@xiaohongshu.com
Email: wangcunzheng2000@163.com