https://github.com/redaigc/target-driven-distillation

Consistency Distillation with Target Timestep Selection and Decoupled Guidance

# ✨Target-Driven Distillation✨

[![Arxiv](https://img.shields.io/badge/arXiv-2402.19159-b31b1b)](https://arxiv.org/abs/2409.01347)
[![Project page](https://img.shields.io/badge/Web-Project%20Page-green)](https://redaigc.github.io/TDD)
[![Hugging Face Model](https://img.shields.io/badge/%F0%9F%A4%97HuggingFace-Model-purple)](https://huggingface.co/RED-AIGC/TDD)
[![Hugging Face Space FLUX](https://img.shields.io/badge/%F0%9F%A4%97HF%20Space-FLUX_BETA-blue)](https://huggingface.co/spaces/RED-AIGC/FLUX-TDD-BETA)
[![Hugging Face Space SDXL](https://img.shields.io/badge/%F0%9F%A4%97HF%20Space-SDXL-blue)](https://huggingface.co/spaces/RED-AIGC/TDD)
[![Hugging Face Space SVD](https://img.shields.io/badge/%F0%9F%A4%97HF%20Space-SVD-blue)](https://huggingface.co/spaces/RED-AIGC/SVD-TDD)

Target-Driven Distillation (TDD) is a state-of-the-art consistency distillation method that substantially accelerates inference for diffusion models. Using its strategies of *target timestep selection* and *decoupled guidance*, models distilled by TDD can generate highly detailed images in only a few steps.


teaser

Samples generated by TDD-distilled SDXL, with only 4--8 steps.

## News!

- **Jan. 4, 2025**: We have updated the code with an adv config for training on [FLUX](https://github.com/RedAIGC/Target-Driven-Distillation/tree/main/train/FLUX).
- **Jan. 1, 2025**: A demo of FLUX-TDD (trained with adv) is now available on Hugging Face [![Hugging Face Model](https://img.shields.io/badge/%F0%9F%A4%97HuggingFace-Model-purple)](https://huggingface.co/RED-AIGC/TDD)
- **Dec. 10, 2024**: Our paper has been accepted by **AAAI25**!
- **Sept. 21, 2024**: A demo of FLUX-TDD-BETA (4-8 steps) is now available on Hugging Face [![Hugging Face Space FLUX](https://img.shields.io/badge/%F0%9F%A4%97HF%20Space-FLUX-blue)](https://huggingface.co/spaces/RED-AIGC/FLUX-TDD-BETA)
- **Sept. 20, 2024**: We have released the code for training on [FLUX](https://github.com/RedAIGC/Target-Driven-Distillation/tree/main/train/FLUX). Our 4-8-step FLUX.1-dev-related LoRAs are coming soon!
- **Sept. 12, 2024**: Demos of TDD-SDXL and TDD-SVD are now available on Hugging Face [![Hugging Face Space SDXL](https://img.shields.io/badge/%F0%9F%A4%97HF%20Space-SDXL-blue)](https://huggingface.co/spaces/RED-AIGC/TDD)
[![Hugging Face Space SVD](https://img.shields.io/badge/%F0%9F%A4%97HF%20Space-SVD-blue)](https://huggingface.co/spaces/RED-AIGC/SVD-TDD). Give them a try!
- **Sept. 4, 2024**: Our detailed research paper is now on arXiv [![Arxiv](https://img.shields.io/badge/arXiv-2402.19159-b31b1b)](https://arxiv.org/abs/2409.01347).
- **Aug. 29, 2024**: We have released the code for training and inference on SDXL, as well as the pretrained models both w/ and w/o adv.
- **Aug. 22, 2024**: Project launched.

## Demos

### Comparison with previous works (LCM, PCM, TCD). From the same seeds, our method (TDD) demonstrates advantages in both image complexity and clarity.


comparison

### Video samples generated by AnimateLCM-distilled (top) and TDD-distilled (bottom) SVD-xt 1.1, also with 4--8 steps.
https://github.com/user-attachments/assets/09fcfc83-fbb8-45da-8ecf-18fa11a6bf82

### Samples generated by different TDD-distilled base models, and by SDXL with different LoRA adapters or ControlNets.


other

## Usage

### Inference
- Clone this repository.
```shell
git clone https://github.com/RedAIGC/Target-Driven-Distillation.git
cd Target-Driven-Distillation
```

- FLUX: download the pretrained models with the script below or from [![Hugging Face Models](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-blue)](https://huggingface.co/RED-AIGC/TDD).
```python
import torch
from huggingface_hub import hf_hub_download
from diffusers import FluxPipeline

# Load the FLUX.1-dev base model, then fuse the TDD LoRA weights into it.
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.load_lora_weights(hf_hub_download("RED-AIGC/TDD", "FLUX.1-dev_tdd_lora_weights.safetensors"))
pipe.fuse_lora(lora_scale=0.125)
pipe.to("cuda")

# Example prompt (a placeholder; replace it with your own).
prompt = "A photo of a cat made of water."

image_flux = pipe(
    prompt=[prompt],
    generator=torch.Generator().manual_seed(3413),  # fixed seed for reproducibility
    num_inference_steps=8,
    guidance_scale=2.0,
    height=1024,
    width=1024,
    max_sequence_length=256,
).images[0]
```

- SDXL: download the pretrained models with the script below or from [![Hugging Face Models](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-blue)](https://huggingface.co/RED-AIGC/TDD).
```python
from huggingface_hub import hf_hub_download
# The repo id matches the Hugging Face model page linked above (RED-AIGC/TDD).
hf_hub_download(repo_id="RED-AIGC/TDD", filename="sdxl_tdd_lora_weights.safetensors", local_dir="./tdd_lora")
```

- Generate images.
```python
# !pip install opencv-python transformers accelerate
import torch
import diffusers
from diffusers import StableDiffusionXLPipeline
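# TDDScheduler is imported from the local tdd_scheduler module (assumed to come from the cloned repository).
from tdd_scheduler import TDDScheduler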

device = "cuda"
tdd_lora_path = "tdd_lora/sdxl_tdd_lora_weights.safetensors"

pipe = StableDiffusionXLPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16").to(device)

pipe.scheduler = TDDScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights(tdd_lora_path, adapter_name="accelerate")
pipe.fuse_lora()

prompt = "A photo of a cat made of water."

image = pipe(
    prompt=prompt,
    num_inference_steps=4,
    guidance_scale=1.7,
    eta=0.2,
    generator=torch.Generator(device=device).manual_seed(546237),
).images[0]

image.save("tdd.png")
```

### Training

See scripts under [train](https://github.com/RedAIGC/Target-Driven-Distillation/tree/main/train).

## Introduction

Target-Driven Distillation (TDD) features three key designs that differ from previous consistency distillation methods.
1. **TDD adopts a careful selection strategy for target timesteps, increasing training efficiency.** Specifically, it first chooses from a predefined set of equidistant denoising schedules (*e.g.* 4--8 steps), then adds a stochastic offset to accommodate non-deterministic sampling (*e.g.* $\gamma$-sampling).
2. **TDD uses decoupled guidance during training, which leaves the guidance scale open to tuning at inference time.** Specifically, it replaces a portion of the text conditions with unconditional (*i.e.* empty) prompts, in order to align with the standard training process using classifier-free guidance (CFG).
3. **TDD can be optionally equipped with non-equidistant sampling and x0 clipping, enabling a more flexible and accurate way for image sampling.**
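
The target timestep selection in item 1 can be illustrated with a minimal sketch. It is not the repository's training code: the function names, the `eta` offset parameter, the 1000-timestep training range, and the exact way the offset is drawn (uniformly below the chosen schedule point) are all assumptions made for illustration.

```python
import random


def equidistant_schedule(num_steps, num_train_timesteps=1000):
    """Timesteps of an equidistant schedule with `num_steps` steps, ascending from 0."""
    stride = num_train_timesteps // num_steps
    return [i * stride for i in range(num_steps)]


def select_target_timestep(t, step_counts=range(4, 9), eta=0.3,
                           num_train_timesteps=1000, rng=random):
    """Pick a target timestep below the current timestep `t` (t >= 1).

    Assumed procedure, for illustration only:
    1. choose one of the predefined equidistant schedules (4 to 8 steps here);
    2. take the closest schedule point strictly below `t` as the base target;
    3. apply a stochastic offset by sampling uniformly from
       [(1 - eta) * base, base].
    """
    num_steps = rng.choice(list(step_counts))
    schedule = equidistant_schedule(num_steps, num_train_timesteps)
    base = max(s for s in schedule if s < t)  # always exists: the schedule contains 0
    low = int((1 - eta) * base)
    return rng.randint(low, base)  # inclusive on both ends
```

For example, `select_target_timestep(800)` returns some timestep below 800 that lies at, or slightly under, a point on one of the 4--8-step schedules.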


overview

An overview of TDD. (a) The training process features target timestep selection and decoupled guidance. (b) The inference process can optionally adopt non-equidistant denoising schedules.
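
The decoupled guidance step, replacing a portion of the text conditions with empty prompts during training, can be sketched as below. The helper name `drop_text_conditions` and the `drop_prob` value are hypothetical; the actual ratio and where this happens in the training scripts are not stated here.

```python
import random


def drop_text_conditions(prompts, drop_prob=0.1, rng=random):
    """Replace each prompt with the empty (unconditional) prompt with probability `drop_prob`.

    `drop_prob` is an illustrative value, not the setting used by the authors.
    """
    return ["" if rng.random() < drop_prob else p for p in prompts]


# Usage: apply to each batch of captions before text encoding.
batch = ["a cat made of water", "a city at night", "a bowl of fruit"]
mixed = drop_text_conditions(batch, drop_prob=0.5)
```

Because some training samples see an empty prompt, the distilled model keeps an unconditional branch, which is what classifier-free guidance needs at inference time.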

For further details of TDD, please refer to our paper: [![Arxiv](https://img.shields.io/badge/arXiv-2402.19159-b31b1b)](https://arxiv.org/abs/2409.01347).

## Acknowledgements
- Thanks to [sdbds](https://github.com/sdbds) for helping us with FLUX training, which allowed us to distill FLUX with a larger batch size.
- Thanks to [PSNbst](https://huggingface.co/PSNbst/PAseer-TDD-Accelerator) for providing a compressed version of TDD, which is less than 20MB. Truly impressive.
- Thanks to the [PCM](https://github.com/G-U-N/Phased-Consistency-Model) team for their ADV_loss support!
- Thanks to the [HuggingFace](https://github.com/huggingface) gradio team for their free GPU support!

## Contact, Collaboration, and Citation![visitors](https://visitor-badge.laobi.icu/badge?page_id=RedAIGC.Target-Driven-Distillation)

If you have any questions about the code, please do not hesitate to contact me!

Email: polu@xiaohongshu.com
Email: wangcunzheng2000@163.com