{"id":17226594,"url":"https://github.com/xiuyu-li/q-diffusion","last_synced_at":"2025-04-06T17:13:49.570Z","repository":{"id":153876765,"uuid":"618509061","full_name":"Xiuyu-Li/q-diffusion","owner":"Xiuyu-Li","description":"[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.","archived":false,"fork":false,"pushed_at":"2024-03-21T05:50:44.000Z","size":6256,"stargazers_count":347,"open_issues_count":23,"forks_count":24,"subscribers_count":16,"default_branch":"master","last_synced_at":"2025-03-30T15:11:10.305Z","etag":null,"topics":["ddim","diffusion-models","model-compression","post-training-quantization","pytorch","quantization","stable-diffusion"],"latest_commit_sha":null,"homepage":"https://xiuyuli.com/qdiffusion/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Xiuyu-Li.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-24T16:11:01.000Z","updated_at":"2025-03-21T15:53:04.000Z","dependencies_parsed_at":null,"dependency_job_id":"5975bb9c-5370-4a67-8c2d-1d44094c15cf","html_url":"https://github.com/Xiuyu-Li/q-diffusion","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xiuyu-Li%2Fq-diffusion","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xiuyu-Li%2Fq-diffusion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xiuyu-Li%2Fq-diffusion/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xiuyu-Li%2Fq-
diffusion/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Xiuyu-Li","download_url":"https://codeload.github.com/Xiuyu-Li/q-diffusion/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247517922,"owners_count":20951719,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ddim","diffusion-models","model-compression","post-training-quantization","pytorch","quantization","stable-diffusion"],"created_at":"2024-10-15T04:16:41.420Z","updated_at":"2025-04-06T17:13:49.551Z","avatar_url":"https://github.com/Xiuyu-Li.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Q-Diffusion: Quantizing Diffusion Models [[website](https://xiuyuli.com/qdiffusion/)] [[paper](http://arxiv.org/abs/2302.04304)]\n**[NEW!]** Q-Diffusion is featured by NVIDIA [TensorRT](https://developer.nvidia.com/blog/tensorrt-accelerates-stable-diffusion-nearly-2x-faster-with-8-bit-post-training-quantization/)! Check out the official [example](https://github.com/NVIDIA/TensorRT/tree/release/9.3/demo/Diffusion#faster-text-to-image-using-sdxl--int8-quantization-using-ammo). 
\n\nQ-diffusion is able to quantize full-precision unconditional diffusion models into 4-bit while maintaining comparable performance (small FID change of at most 2.34 compared to \u003e100 for traditional PTQ) in a training-free manner.\n![example_lsun](assets/example_lsun.png)\n\nOur approach can also be plugged into text-guided image generation, where we run stable diffusion in 4-bit weights and achieve high generation quality for the first time.\n![example_sd](assets/example_sd.png)\n\n*This repository provides the official implementation for Q-Diffusion with calibrated (simulated) quantized checkpoints.*\n\n## Overview\n\n![teaser](assets/teaser.png)  \nDiffusion models have achieved significant success in image synthesis by iteratively estimating noise using deep neural networks. However, the slow inference and the memory and computational intensity of the noise estimation model hinder the efficient implementation of diffusion models. Although post-training quantization (PTQ) is considered a go-to compression method for other tasks, it does not work seamlessly with diffusion models. We propose a novel PTQ method specifically designed for the unique multi-timestep pipeline and model architecture of diffusion models, which compresses the noise estimation network to accelerate the generation process. We identify the primary challenge of diffusion model quantization as the changing output distributions of noise estimation networks over multiple time steps and the bimodal activation distribution of the shortcut layers within the noise estimation network. 
We address these challenges with timestep-aware calibration and split shortcut quantization in this work.\n\n## Getting Started\n\n### Installation\n\nClone this repository, and then create and activate a suitable conda environment named `qdiff` by using the following commands:\n\n```bash\ngit clone https://github.com/Xiuyu-Li/q-diffusion.git\ncd q-diffusion\nconda env create -f environment.yml\nconda activate qdiff\n```\n\n### Usage\n\n1. For Latent Diffusion and Stable Diffusion experiments, first download relevant checkpoints following the instructions in the [latent-diffusion](https://github.com/CompVis/latent-diffusion#model-zoo) and [stable-diffusion](https://github.com/CompVis/stable-diffusion#weights) repos from CompVis. We currently use `sd-v1-4.ckpt` for Stable Diffusion. \n\n2. Download quantized checkpoints from the Google Drive [[link](https://drive.google.com/drive/folders/1ImRbmAvzCsU6AOaXbIeI7-4Gu2_Scc-X?usp=share_link)]. The checkpoints quantized with 4/8-bit weights-only quantization are the same as the ones with 4/8-bit weights and 8-bit activations quantization. \n\n3. 
Then use the following commands to run inference scripts with quantized checkpoints:\n\n```bash\n# CIFAR-10 (DDIM)\n# 4/8-bit weights-only\npython scripts/sample_diffusion_ddim.py --config configs/cifar10.yml --use_pretrained --timesteps 100 --eta 0 --skip_type quad --ptq --weight_bit \u003c4 or 8\u003e --quant_mode qdiff --split --resume -l \u003coutput_path\u003e --cali_ckpt \u003cquantized_ckpt_path\u003e\n# 4/8-bit weights, 8-bit activations\npython scripts/sample_diffusion_ddim.py --config configs/cifar10.yml --use_pretrained --timesteps 100 --eta 0 --skip_type quad --ptq --weight_bit \u003c4 or 8\u003e --quant_mode qdiff --quant_act --act_bit 8 --a_sym --split --resume -l \u003coutput_path\u003e --cali_ckpt \u003cquantized_ckpt_path\u003e\n\n# LSUN Bedroom (LDM-4)\n# 4/8-bit weights-only\npython scripts/sample_diffusion_ldm.py -r models/ldm/lsun_beds256/model.ckpt -n 20 --batch_size 10 -c 200 -e 1.0 --seed 41 --ptq --weight_bit \u003c4 or 8\u003e --resume -l \u003coutput_path\u003e --cali_ckpt \u003cquantized_ckpt_path\u003e\n# 4/8-bit weights, 8-bit activations\npython scripts/sample_diffusion_ldm.py -r models/ldm/lsun_beds256/model.ckpt -n 20 --batch_size 10 -c 200 -e 1.0 --seed 41 --ptq --weight_bit \u003c4 or 8\u003e --quant_act --act_bit 8 --a_sym --resume -l \u003coutput_path\u003e --cali_ckpt \u003cquantized_ckpt_path\u003e\n\n# LSUN Church (LDM-8)\n# 4/8-bit weights-only\npython scripts/sample_diffusion_ldm.py -r models/ldm/lsun_churches256/model.ckpt -n 20 --batch_size 10 -c 400 -e 0.0 --seed 41 --ptq --weight_bit \u003c4 or 8\u003e --resume -l \u003coutput_path\u003e --cali_ckpt \u003cquantized_ckpt_path\u003e\n# 4/8-bit weights, 8-bit activations\npython scripts/sample_diffusion_ldm.py -r models/ldm/lsun_churches256/model.ckpt -n 20 --batch_size 10 -c 400 -e 0.0 --seed 41 --ptq --weight_bit \u003c4 or 8\u003e --quant_act --act_bit 8 --resume -l \u003coutput_path\u003e --cali_ckpt \u003cquantized_ckpt_path\u003e\n\n# Stable Diffusion\n# 4/8-bit 
weights-only\npython scripts/txt2img.py --prompt \u003cprompt. e.g. \"a puppy wearing a hat\"\u003e --plms --cond --ptq --weight_bit \u003c4 or 8\u003e --quant_mode qdiff --no_grad_ckpt --split --n_samples 5 --resume --outdir \u003coutput_path\u003e --cali_ckpt \u003cquantized_ckpt_path\u003e\n# 4/8-bit weights, 8-bit activations (with 16-bit for attention matrices after softmax)\npython scripts/txt2img.py --prompt \u003cprompt. e.g. \"a puppy wearing a hat\"\u003e --plms --cond --ptq --weight_bit \u003c4 or 8\u003e --quant_mode qdiff --no_grad_ckpt --split --n_samples 5 --resume --quant_act --act_bit 8 --sm_abit 16 --outdir \u003coutput_path\u003e --cali_ckpt \u003cquantized_ckpt_path\u003e\n```\n\n### Calibration\nTo conduct the calibration process, you must first generate the corresponding calibration datasets. We provide some example calibration datasets [here](https://drive.google.com/drive/folders/12TVeziKWNz_HmTAIxQLDZlHE33PKdpb1?usp=sharing). These datasets contain around 1000-2000 samples of intermediate outputs at each time step, which are much more than sufficient for calibration purposes. We will soon upload smaller subsets that meet the minimum requirements for calibration. 
In the meantime, you may consider generating the calibration datasets yourself by following the procedures described in the paper.\n\nTo reproduce the calibrated checkpoints, you can use the following commands:\n\n```bash\n# CIFAR-10 (DDIM)\npython scripts/sample_diffusion_ddim.py --config configs/cifar10.yml --use_pretrained --timesteps 100 --eta 0 --skip_type quad --ptq --weight_bit \u003c4 or 8\u003e --quant_mode qdiff --cali_st 20 --cali_batch_size 32 --cali_n 256 --quant_act --act_bit 8 --a_sym --split --cali_data_path \u003ccali_data_path\u003e -l \u003coutput_path\u003e\n\n# LSUN Bedroom (LDM-4)\npython scripts/sample_diffusion_ldm.py -r models/ldm/lsun_beds256/model.ckpt -n 50000 --batch_size 10 -c 200 -e 1.0  --seed 40 --ptq  --weight_bit \u003c4 or 8\u003e --quant_mode qdiff --cali_st 20 --cali_batch_size 32 --cali_n 256 --quant_act --act_bit 8 --a_sym --a_min_max --running_stat --cali_data_path \u003ccali_data_path\u003e -l \u003coutput_path\u003e\n\n# LSUN Church (LDM-8)\npython scripts/sample_diffusion_ldm.py -r models/ldm/lsun_churches256/model.ckpt -n 50000 --batch_size 10 -c 400 -e 0.0 --seed 40 --ptq --weight_bit \u003c4 or 8\u003e --quant_mode qdiff --cali_st 20 --cali_batch_size 32 --cali_n 256 --quant_act --act_bit 8 --cali_data_path \u003ccali_data_path\u003e -l \u003coutput_path\u003e\n\n# Stable Diffusion\npython scripts/txt2img.py --prompt \"a photograph of an astronaut riding a horse\" --plms --cond --ptq --weight_bit \u003c4 or 8\u003e --quant_mode qdiff --quant_act --act_bit 8 --cali_st 25 --cali_batch_size 8 --cali_n 128 --no_grad_ckpt --split --running_stat --sm_abit 16 --cali_data_path \u003ccali_data_path\u003e --outdir \u003coutput_path\u003e\n```\nNote that using different hyperparameters for calibration may result in slightly different performance.\n\n## Citation\n\nIf you find this work useful in your research, please consider citing our paper:\n\n```bibtex\n@InProceedings{li2023qdiffusion,\n  author={Li, Xiuyu and Liu, Yijiang 
and Lian, Long and Yang, Huanrui and Dong, Zhen and Kang, Daniel and Zhang, Shanghang and Keutzer, Kurt},\n  title={Q-Diffusion: Quantizing Diffusion Models},\n  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},\n  month={October},\n  year={2023},\n  pages={17535-17545}\n}\n```\n\n## Acknowledgments\n\nOur code was developed based on [ddim](https://github.com/ermongroup/ddim), [latent-diffusion](https://github.com/CompVis/latent-diffusion) and [stable-diffusion](https://github.com/CompVis/stable-diffusion). We referred to [BRECQ](https://github.com/yhhhli/BRECQ) for the blockwise calibration implementation.\n\nWe thank [DeepSpeed](https://github.com/microsoft/DeepSpeed) for model sizes and BOPS measurement and [torch-fidelity](https://github.com/toshas/torch-fidelity) for IS and FID computation.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxiuyu-li%2Fq-diffusion","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxiuyu-li%2Fq-diffusion","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxiuyu-li%2Fq-diffusion/lists"}