{"id":89349,"url":"https://github.com/xlite-dev/awesome-dit-inference","name":"awesome-dit-inference","description":"📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉","projects_count":71,"last_synced_at":"2026-06-09T13:00:22.494Z","repository":{"id":246610044,"uuid":"743007730","full_name":"xlite-dev/Awesome-DiT-Inference","owner":"xlite-dev","description":"📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉","archived":false,"fork":false,"pushed_at":"2026-03-19T03:00:14.000Z","size":274,"stargazers_count":554,"open_issues_count":0,"forks_count":26,"subscribers_count":16,"default_branch":"main","last_synced_at":"2026-05-23T22:03:23.097Z","etag":null,"topics":["cogvideox","deepcache","diffusion","dit","flux","open-sora","open-sora-plan","sd15","sdxl","sora","stable-diffusion","wan"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xlite-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-01-14T02:49:21.000Z","updated_at":"2026-05-22T05:13:59.000Z","dependencies_parsed_at":"2026-01-20T12:00:55.761Z","dependency_job_id":null,"html_url":"https://github.com/xlite-dev/Awesome-DiT-Inference","commit_stats":null,"previous_names":["deftruth/awesome-sd-distributed-inference","deftruth/awesome-sd-inference","deftruth/awesome-diffusion-inference","xlite-dev/awesome-diffusion-inference","xlite-dev/awesome-dit-inference","xlite-dev/dit-infra"],"tags_count":8,"template":false,"template_full_name":null,"purl":"pkg:github/xlite-dev/Awesome-DiT-Inference","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xlite-dev%2FAwesome-DiT-Inference","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xlite-dev%2FAwesome-DiT-Inference/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xlite-dev%2FAwesome-DiT-Inference/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xlite-dev%2FAwesome-DiT-Inference/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xlite-dev","download_url":"https://codeload.github.com/xlite-dev/Awesome-DiT-Inference/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xlite-dev%2FAwesome-DiT-Inference/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34107866,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-09T02:00:06.510Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"created_at":"2025-06-12T05:31:33.295Z","updated_at":"2026-06-09T13:00:22.494Z","primary_language":null,"list_of_lists":false,"displayable":true,"categories":["📙 Sampling","📙 Parallelism","📙 Caching","📙 Quantization","📙 Attention","📖 News 🔥🔥"],"sub_categories":[],"readme":"\n\u003cdiv align='center'\u003e\n  \u003cimg src=https://github.com/user-attachments/assets/decc861e-fe77-4eb7-aacf-7d27b6430e54 width=250px\u003e\n\u003c/div\u003e   \n \n\u003cdiv align='center'\u003e\n  \u003cimg src=https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg \u003e\n  \u003cimg src=https://img.shields.io/github/forks/xlite-dev/Awesome-DiT-Inference.svg?style=social \u003e\n  \u003cimg src=https://img.shields.io/github/stars/xlite-dev/Awesome-DiT-Inference.svg?style=social \u003e\n  \u003cimg src=https://img.shields.io/badge/Release-v0.5-brightgreen.svg \u003e\n  \u003cimg src=https://img.shields.io/badge/License-GPLv3.0-turquoise.svg \u003e\n \u003c/div\u003e   \n\n📒A curated list of Awesome **Diffusion** Inference Papers with codes. For Awesome LLM Inference, please check 📖[Awesome-LLM-Inference](https://github.com/xlite-dev/Awesome-LLM-Inference)  ![](https://img.shields.io/github/stars/xlite-dev/Awesome-LLM-Inference.svg?style=social) for more details.\n\n## 📖 News 🔥🔥\n\u003cdiv id=\"news\"\u003e\u003c/div\u003e\n\n- [2026/03] Cache-DiT **[🎉v1.3.0](https://github.com/vipshop/cache-dit)** release is ready, the major updates including: [Ring](https://cache-dit.readthedocs.io/en/latest/user_guide/CONTEXT_PARALLEL) Attention w/ [batched P2P](https://cache-dit.readthedocs.io/en/latest/user_guide/CONTEXT_PARALLEL), [USP](https://cache-dit.readthedocs.io/en/latest/user_guide/CONTEXT_PARALLEL/) (Hybrid Ring and Ulysses), Hybrid 2D and 3D Parallelism (💥[USP + TP](https://cache-dit.readthedocs.io/en/latest/user_guide/HYBRID_PARALLEL/)),  VAE-P Comm overhead reduce.\n\n![arch](https://github.com/vipshop/cache-dit/raw/main/assets/arch_v2.png)\n\n## 🤖Contents\n\n- [📙Sampling](#Sampling)\n- [📙Caching](#Caching)\n- [📙Parallelism](#Parallelism)\n- [📙Quantization](#Quantization)\n- [📙Attention](#Attention)\n\n\n## ©️Citations \n\n```BibTeX\n@misc{Awesome-DiT-Inference@2024,\n  title={Awesome-DiT-Inference: A small curated list of Awesome Diffusion Inference.},\n  url={https://github.com/xlite-dev/Awesome-DiT-Inference},\n  note={Open-source software available at https://github.com/xlite-dev/Awesome-DiT-Inference},\n  author={xlite-dev},\n  year={2024}\n}\n```\n\n\n## 📙 Sampling  \n\n\u003cdiv id=\"Sampling\"\u003e\u003c/div\u003e  \n\n|Date|Title|Paper|Code|Recom|\n|:---:|:---:|:---:|:---:|:---:|   \n|2020.06| 🔥[**DDPM**] Denoising Diffusion Probabilistic Models(@UC Berkeley)  | [[pdf]](https://arxiv.org/abs/2006.11239) |[[diffusion]](https://github.com/hojonathanho/diffusion) ![](https://img.shields.io/github/stars/hojonathanho/diffusion.svg?style=social) |⭐️⭐️ | \n|2020.10| 🔥[**DDIM**] DENOISING DIFFUSION IMPLICIT MODELS(@cs.stanford.edu) | [[pdf]](https://arxiv.org/pdf/2010.02502) |⚠️|⭐️⭐️ |\n|2022.02| 🔥[**PNDM**] PSEUDO NUMERICAL METHODS FOR DIFFUSION MODELS ON MANIFOLDS(@) | [[pdf]](https://arxiv.org/pdf/2202.09778) |[[PNDM]](https://github.com/luping-liu/PNDM) ![](https://img.shields.io/github/stars/luping-liu/PNDM.svg?style=social) |⭐️⭐️ | \n|2022.02| 🔥[**DPM-Solver**] DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps(@Cheng Lu) | [[pdf]](https://arxiv.org/pdf/2206.00927) |[[dpm-solver]](https://github.com/LuChengTHU/dpm-solver) ![](https://img.shields.io/github/stars/LuChengTHU/dpm-solver.svg?style=social) |⭐️⭐️ | \n|2022.11| 🔥[**DPM-Solver++**] DPM-SOLVER++: FAST SOLVER FOR GUIDED SAMPLING OF DIFFUSION PROBABILISTIC MODELS(@Cheng Lu) | [[pdf]](https://arxiv.org/pdf/2211.01095) |[[dpm-solver]](https://github.com/LuChengTHU/dpm-solver) ![](https://img.shields.io/github/stars/LuChengTHU/dpm-solver.svg?style=social) |⭐️⭐️ | \n|2023.10| 🔥[**DPM-Solver-v3**] DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics(@Kaiwen Zheng) | [[pdf]](https://arxiv.org/pdf/2310.13268) |[[DPM-Solver-v3]](https://github.com/thu-ml/DPM-Solver-v3) ![](https://img.shields.io/github/stars/thu-ml/DPM-Solver-v3.svg?style=social) |⭐️⭐️ | \n|2023.11| 🔥[**Parallel Sampling**] Parallel Sampling of Diffusion Models(@Stanford University) | [[pdf]](https://papers.nips.cc/paper_files/paper/2023/file/0d1986a61e30e5fa408c81216a616e20-Paper-Conference.pdf) | [[paradigms]](https://github.com/AndyShih12/paradigms) ![](https://img.shields.io/github/stars/AndyShih12/paradigms.svg?style=social) |⭐️⭐️ |\n|2023.11| 🔥[**SAMPLER SCHEDULER**] SAMPLER SCHEDULER FOR DIFFUSION MODELS(@sysu)| [[pdf]](https://arxiv.org/pdf/2311.06845) |⚠️|⭐️⭐️ |\n|2024.02| 🔥[**Parallel Sampling**] Accelerating Parallel Sampling of Diffusion Models(@Zhiwei Tang) | [[pdf]](https://arxiv.org/pdf/2402.09970) | [[ParaTAA-Diffusion]](https://github.com/TZW1998/ParaTAA-Diffusion) ![](https://img.shields.io/github/stars/TZW1998/ParaTAA-Diffusion.svg?style=social) |⭐️⭐️ | \n|2024.01| 🔥[**YONOS**] You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation(@Samsung AI) | [[pdf]](https://arxiv.org/pdf/2401.17258) |⚠️|⭐️⭐️ |\n|2024.01| 🔥[**S^2-DM**] S^2-DMs: Skip-Step Diffusion Models(@Yixuan Wang) | [[pdf]](https://arxiv.org/pdf/2401.01520) |⚠️|⭐️⭐️ |\n|2024.08| 🔥[**StepSaver**] StepSaver: Predicting Minimum Denoising Steps for Diffusion Model Image Generation(@intel) | [[pdf]](https://arxiv.org/pdf/2408.02054) |⚠️|⭐️⭐️ |\n|2024.09| 🔥[**DC-Solver**] DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation(@Tsinghua University)| [[pdf]](https://arxiv.org/pdf/2409.03755v1) | [[DC-Solver]](https://github.com/wl-zhao/DC-Solver) ![](https://img.shields.io/github/stars/wl-zhao/DC-Solver.svg?style=social) |⭐️⭐️ | \n\n## 📙 Caching  \n\n\u003cdiv id=\"Caching\"\u003e\u003c/div\u003e  \n\n- **UNet Based (DeepCache)**\n\n\u003cimg width=\"1645\" alt=\"image\" src=\"https://github.com/user-attachments/assets/a7257462-80d3-40af-a4ce-3550508fabe7\"\u003e\n\n\n- **DiT Based (Fast-Forward Caching)**\n\u003cimg width=\"1119\" alt=\"image\" src=\"https://github.com/user-attachments/assets/fad8f187-d4ac-4290-9943-7b34116fed05\"\u003e\n\n|Date|Title|Paper|Code|Recom|\n|:---:|:---:|:---:|:---:|:---:|   \n|2023.05|🔥🔥[**Cache-Enabled Sparse Diffusion**] Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference(@pku.edu.cn etc)|[[pdf]](https://arxiv.org/pdf/2305.17423) |⚠️|⭐️⭐️ | \n|2023.12|🔥🔥[**DeepCache**] DeepCache: Accelerating Diffusion Models for Free(@nus.edu)|[[pdf]](https://arxiv.org/pdf/2312.00858) | [[DeepCache]](https://github.com/horseee/DeepCache) ![](https://img.shields.io/github/stars/horseee/DeepCache.svg?style=social)| ⭐️⭐️ |   \n|2023.12|🔥🔥[**Block Caching**] Cache Me if You Can: Accelerating Diffusion Models through Block Caching(@Meta GenAI etc)|[[pdf]](https://arxiv.org/pdf/2312.03209) |⚠️|⭐️⭐️ | \n|2023.12|🔥🔥[**Approximate Caching**] Approximate Caching for Efficiently Serving Diffusion Models(@Adobe)|[[pdf]](https://arxiv.org/pdf/2312.04429) |⚠️|⭐️⭐️ |\n|2024.06| 🔥🔥[**Layer Caching**] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching(@nus.edu)  | [[pdf]](https://arxiv.org/pdf/2406.01733) | [[learning-to-cache]](https://github.com/horseee/learning-to-cache/) ![](https://img.shields.io/github/stars/horseee/learning-to-cache.svg?style=social)| ⭐️⭐️ | \n|2024.07|🔥[**ElasticCache-LVLM**] Efficient Inference of Vision Instruction-Following Models with Elastic Cache(@Tsinghua University etc)|[[pdf]](https://arxiv.org/pdf/2407.18121)|[[ElasticCache]](https://github.com/liuzuyan/ElasticCache) ![](https://img.shields.io/github/stars/liuzuyan/ElasticCache.svg?style=social)|⭐️ |\n|2024.07| 🔥🔥[**Fast-Forward Caching(DiT)**] FORA: Fast-Forward Caching in Diffusion Transformer Acceleration(@microsoft.com etc) | [[pdf]](https://arxiv.org/pdf/2407.01425) | [[FORA]](https://github.com/prathebaselva/FORA) ![](https://img.shields.io/github/stars/prathebaselva/FORA.svg?style=social)|⭐️⭐️ |\n|2024.07| 🔥🔥[**Faster I2V Generation**] Faster Image2Video Generation: A Closer Look at CLIP Image Embedding’s Impact on Spatio-Temporal Cross-Attentions(@Ashkan Taghipour etc)|[[pdf]](https://arxiv.org/pdf/2407.19205) |⚠️|⭐️⭐️ |\n|2024.04| 🔥🔥[**T-GATE V1**] Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models(@Wentian Zhang etc)|[[pdf]](https://arxiv.org/pdf/2404.02747v1) | [[T-GATE]](https://github.com/HaozheLiu-ST/T-GATE) ![](https://img.shields.io/github/stars/HaozheLiu-ST/T-GATE.svg?style=social)|⭐️⭐️ |\n|2024.04| 🔥🔥[**T-GATE V2**] Faster Diffusion via Temporal Attention Decomposition(@Haozhe Liu etc)|[[pdf]](https://arxiv.org/pdf/2404.02747v2) | [[T-GATE]](https://github.com/HaozheLiu-ST/T-GATE) ![](https://img.shields.io/github/stars/HaozheLiu-ST/T-GATE.svg?style=social)|⭐️⭐️ |\n|2024.06| 🔥🔥[**DiTFastAttn**] DiTFastAttn: Attention Compression for Diffusion Transformer Models(@Zhihang Yuan etc)|[[pdf]](https://arxiv.org/pdf/2406.08552) | [[DiTFastAttn]](https://github.com/thu-nics/DiTFastAttn) ![](https://img.shields.io/github/stars/thu-nics/DiTFastAttn.svg?style=social)|⭐️⭐️ |\n|2024.06| 🔥🔥[**∆-DiT**] ∆-DiT: A Training-Free Acceleration Method Tailored for Diffusion Transformers(@Fudan University)|[[pdf]](https://arxiv.org/pdf/2406.01125) | ⚠️|⭐️⭐️ |  \n|2024.09| 🔥🔥[**TokenCache**] Token Caching for Diffusion Transformer Acceleration(@Institute of Automation, Chinese Academy of Sciences)|[[pdf]](https://arxiv.org/pdf/2409.18523) | ⚠️|⭐️⭐️ |  \n|2024.11| 🔥🔥[**AdaCache**] Adaptive Caching for Faster Video Generation with Diffusion Transformers(@Meta)|[[pdf]](https://adacache-dit.github.io/clarity/adacache_meta.pdf) | [[AdaCache]](https://github.com/AdaCache-DiT/AdaCache) ![](https://img.shields.io/github/stars/AdaCache-DiT/AdaCache.svg?style=social)|⭐️⭐️ |\n|2024.11| 🔥🔥[**TeaCache**] Timestep Embedding Tells: It’s Time to Cache for Video Diffusion Model(@Alibaba)| [[pdf]](https://arxiv.org/pdf/2411.19108) | [[TeaCache]](https://github.com/LiewFeng/TeaCache) ![](https://img.shields.io/github/stars/LiewFeng/TeaCache.svg?style=social)|⭐️⭐️ |\n|2024.11| 🔥🔥[**LazyDiT**] LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers(@Adobe Research)|[[pdf]](https://arxiv.org/pdf/2412.12444)| ⚠️|⭐️⭐️ |  \n|2024.11| 🔥🔥[**Ca2-VDM**] Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing(@ZJU)|[[pdf]](https://arxiv.org/pdf/2411.16375) | [[CausalCache-VDM]](https://github.com/Dawn-LX/CausalCache-VDM/) ![](https://img.shields.io/github/stars/Dawn-LX/CausalCache-VDM.svg?style=social)|⭐️⭐️ |\n|2024.11| 🔥🔥[**SmoothCache**] SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers(@Roblox) | [[pdf]](https://arxiv.org/pdf/2411.10510) | [[SmoothCache]](https://github.com/Roblox/SmoothCache) ![](https://img.shields.io/github/stars/Roblox/SmoothCache.svg?style=social)|⭐️⭐️ |\n|2024.10| 🔥🔥[**FasterCache**] FASTERCACHE: TRAINING-FREE VIDEO DIFFUSION MODEL ACCELERATION WITH HIGH QUALITY(@S-Lab)|[[pdf]](https://arxiv.org/pdf/2410.19355) | [[FasterCache]](https://github.com/Vchitect/FasterCache) ![](https://img.shields.io/github/stars/Vchitect/FasterCache.svg?style=social)|⭐️⭐️ |\n|2024.10| 🔥🔥[**ToCa**] ToCa: Accelerating Diffusion Transformers with Token-wise Feature Caching(@SJTU)|[[pdf]](https://arxiv.org/pdf/2410.05317) | [[ToCa]](https://github.com/Shenyi-Z/ToCa) ![](https://img.shields.io/github/stars/Shenyi-Z/ToCa.svg?style=social)|⭐️⭐️ |\n|2024.11| 🔥🔥[**SkipCache**] Accelerating Vision Diffusion Transformers with Skip Branches(@SJTU)|[[pdf]](https://arxiv.org/pdf/2411.17616) | [[Skip-DiT]](https://github.com/OpenSparseLLMs/Skip-DiT) ![](https://img.shields.io/github/stars/OpenSparseLLMs/Skip-DiT.svg?style=social)|⭐️⭐️ |\n|2024.12| 🔥🔥[**DuCa**] Accelerating Diffusion Transformers with Dual Feature Caching(@SJTU)|[[pdf]](https://arxiv.org/pdf/2412.18911) | [[DuCa]](https://github.com/Shenyi-Z/DuCa) ![](https://img.shields.io/github/stars/Shenyi-Z/DuCa.svg?style=social)|⭐️⭐️ |\n|2025.01| 🔥🔥[**FBCache**] Fastest HunyuanVideo Inference with Context Parallelism and First Block Cache on NVIDIA L20 GPUs(@chengzeyi)| [[docs]](https://github.com/chengzeyi/ParaAttention/blob/main/doc/fastest_hunyuan_video.md) | [[ParaAttention]](https://github.com/chengzeyi/ParaAttention) ![](https://img.shields.io/github/stars/chengzeyi/ParaAttention.svg?style=social)|⭐️⭐️ |\n|2025.01| 🔥🔥[**FlexCache**] FlexCache: Flexible Approximate Cache System for Video Diffusion(@University of Waterloo)| [[pdf]](https://arxiv.org/pdf/2501.04012) | ⚠️|⭐️⭐️ |  \n|2025.01| 🔥🔥[**Token Pruning**] Token Pruning for Caching Better: 9× Acceleration on Stable Diffusion for Free(@SJTU) | [[pdf]](https://arxiv.org/pdf/2501.00375) | [[DaTo]](https://github.com/EvelynZhang-epiclab/DaTo) ![](https://img.shields.io/github/stars/EvelynZhang-epiclab/DaTo.svg?style=social)|⭐️⭐️ |\n|2025.04| 🔥🔥[**AB-Cache**] AB-Cache: Training-Free Acceleration of Diffusion Models via Adams-Bashforth Cached Feature Reuse(@USTC) |  [[pdf]](https://arxiv.org/pdf/2504.10540) | ⚠️|⭐️⭐️ |  \n|2025.03| 🔥🔥[**DiTFastAttnV2**] DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers(@Infinigence AI)|[[pdf]](https://arxiv.org/pdf/2503.22796) | [[DiTFastAttn]](https://github.com/thu-nics/DiTFastAttn) ![](https://img.shields.io/github/stars/thu-nics/DiTFastAttn.svg?style=social)|⭐️⭐️ |\n|2025.03| 🔥🔥[**TaylorSeers**] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers(@SJTU)|[[pdf]](https://arxiv.org/pdf/2503.06923)| [[TaylorSeer]](https://github.com/Shenyi-Z/TaylorSeer) ![](https://img.shields.io/github/stars/Shenyi-Z/TaylorSeer.svg?style=social)|⭐️⭐️ |\n|2025.04| 🔥🔥[**Increment-Calibrated Cache**] Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition(@PKU)|[[pdf]](https://arxiv.org/pdf/2505.05829) | [[icc]](https://github.com/ccccczzy/icc) ![](https://img.shields.io/github/stars/ccccczzy/icc.svg?style=social)|⭐️⭐️ |\n|2025.05| 🔥🔥[**FastCache**] FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation(@yale)| [[pdf]](https://arxiv.org/pdf/2505.20353) | [[FastCache-xDiT]](https://github.com/NoakLiu/FastCache-xDiT) ![](https://img.shields.io/github/stars/NoakLiu/FastCache-xDiT.svg?style=social)|⭐️⭐️ |\n|2025.06| 🔥🔥[**DBCache**] DBCache: Dual Block Caching for Diffusion Transformers(@DefTruth, @vipshop, etc) | [[docs]](https://github.com/vipshop/cache-dit) | [[cache-dit]](https://github.com/vipshop/cache-dit) ![](https://img.shields.io/github/stars/vipshop/cache-dit.svg?style=social)|⭐️⭐️ |\n|2025.06| 🔥🔥[**DBPrune**] DBPrune: Dynamic Block Prune with Residual Caching(@DefTruth, @vipshop, etc) | [[docs]](https://github.com/vipshop/cache-dit) | [[cache-dit]](https://github.com/vipshop/cache-dit) ![](https://img.shields.io/github/stars/vipshop/cache-dit.svg?style=social)|⭐️⭐️ |\n|2025.06| 🔥🔥[**BACache**] Block-wise Adaptive Caching for Accelerating Diffusion Policy(@THU) | [[pdf]](https://arxiv.org/pdf/2506.13456) | ⚠️|⭐️⭐️ |  \n\n## 📙 Parallelism\n\n\u003cdiv id=\"Parallelism\"\u003e\u003c/div\u003e  \n\n- **UNet Based: Displaced Patch parallelism (DistriFusion)**\n\n\u003cimg width=\"1677\" alt=\"image\" src=\"https://github.com/user-attachments/assets/aefb2ae7-73eb-4e9c-bf1a-ec540f4dfa7d\"\u003e\n\n- **DiT Based: Displaced Patch parallelism (PipeFusion)**\n\n\u003cimg width=\"1346\" alt=\"image\" src=\"https://github.com/user-attachments/assets/692c5d54-19b3-4ce7-9613-9eb8bb035c7d\"\u003e\n\n\n|Date|Title|Paper|Code|Recom|\n|:---:|:---:|:---:|:---:|:---:|   \n|2024.02|🔥🔥[**DistriFusion**] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models(@MIT etc)|[[pdf]](https://arxiv.org/abs/2402.19481) | [[distrifuser]](https://github.com/mit-han-lab/distrifuser) ![](https://img.shields.io/github/stars/mit-han-lab/distrifuser.svg?style=social)| ⭐️⭐️ |   \n|2024.05|🔥🔥[**PipeFusion**] PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models(@Tencent etc)|[[pdf]](https://arxiv.org/pdf/2405.14430) | [[xDiT]](https://github.com/xdit-project/xDiT) ![](https://img.shields.io/github/stars/xdit-project/xDiT.svg?style=social)| ⭐️⭐️ | \n|2024.06| 🔥🔥[**AsyncDiff**] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising(@nus.edu) | [[pdf]](https://arxiv.org/pdf/2406.06911) | [[AsyncDiff]](https://github.com/czg1225/AsyncDiff) ![](https://img.shields.io/github/stars/czg1225/AsyncDiff.svg?style=social)| ⭐️⭐️ |   \n|2024.05 | 🔥🔥[**TensorRT-LLM SDXL**] SDXL Distributed Inference with TensorRT-LLM and synchronous comm(@Zars19) | [[pdf]](https://arxiv.org/abs/2402.19481) | [[SDXL-TensorRT-LLM]](https://github.com/NVIDIA/TensorRT-LLM/pull/1514) ![](https://img.shields.io/github/stars/NVIDIA/TensorRT-LLM.svg?style=social)| ⭐️⭐️ | \n|2024.06| 🔥🔥[**Clip Parallelism**] Video-Infinity: Distributed Long Video Generation(@nus.edu)|[[pdf]](https://arxiv.org/pdf/2406.16260) | [[Video-Infinity]](https://github.com/Yuanshi9815/Video-Infinity) ![](https://img.shields.io/github/stars/Yuanshi9815/Video-Infinity.svg?style=social)|⭐️⭐️ | \n|2024.05| 🔥🔥[**FIFO-Diffusion**] FIFO-Diffusion: Generating Infinite Videos from Text without Training(@Seoul National University)|[[pdf]](https://arxiv.org/pdf/2405.11473) |  [[FIFO-Diffusion]](https://github.com/jjihwan/FIFO-Diffusion_public) ![](https://img.shields.io/github/stars/jjihwan/FIFO-Diffusion_public.svg?style=social) |⭐️⭐️ | \n|2025.01| 🔥🔥[**ParaAttention**] Context parallel attention that accelerates DiT model inference with dynamic caching(@chengzeyi)| [[docs]](https://github.com/chengzeyi/ParaAttention) | [[ParaAttention]](https://github.com/chengzeyi/ParaAttention) ![](https://img.shields.io/github/stars/chengzeyi/ParaAttention.svg?style=social)|⭐️⭐️ |\n|2026.01| 🔥🔥[**Cache-DiT**] 🤗A PyTorch-native Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs.| [[docs]](https://github.com/vipshop/cache-dit) | [[cache-dit]](https://github.com/vipshop/cache-dit) ![](https://img.shields.io/github/stars/vipshop/cache-dit.svg?style=social)|⭐️⭐️ |\n\n## 📙 Quantization\n\u003cdiv id=\"Quantization\"\u003e\u003c/div\u003e  \n\n|Date|Title|Paper|Code|Recom|\n|:---:|:---:|:---:|:---:|:---:|   \n|2024.08| 🔥[**Transfusion**] Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model(@meta)|[[pdf]](https://www.arxiv.org/pdf/2408.11039) | [[transfusion-pytorch]](https://github.com/A-suozhang/MixDQ) ![](https://img.shields.io/github/stars/lucidrains/transfusion-pytorch.svg?style=social)|⭐️⭐️ |\n|2024.08| 🔥[**MixDQ**] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization (@THU\u0026Infinigence AI.)|[[pdf]](https://arxiv.org/abs/2405.17873) | [[mixdq]](https://github.com/thu-nics/MixDQ) ![](https://img.shields.io/github/stars/thu-nics/MixDQ.svg?style=social)|⭐️⭐️|\n|2024.08| 🔥[**ViDiT-Q**] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation (@THU\u0026Infinigence AI.)|[[pdf]](https://arxiv.org/abs/2406.02540) | [[viditq]](https://github.com/thu-nics/ViDiT-Q) ![](https://img.shields.io/github/stars/thu-nics/ViDiT-Q?style=social)|⭐️⭐️|\n|2024.08| 🔥[**VQ4DiT**] VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers(@ZJU)|[[pdf]](https://arxiv.org/pdf/2408.17131)  |⚠️|⭐️⭐️ |\n|2024.08| 🔥[**LBQ**] Low-Bitwidth Floating Point Quantization for Efficient High-Quality Diffusion Models(@toronto.edu)|[[pdf]](https://arxiv.org/pdf/2408.06995)  |⚠️|⭐️⭐️ |\n|2024.08| 🔥[**EE-Diffusion**] A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models(@KAIST AI)|[[pdf]](https://arxiv.org/pdf/2408.05927) | [[ee-diffusion]](https://github.com/taehong-moon/ee-diffusion) ![](https://img.shields.io/github/stars/taehong-moon/ee-diffusion.svg?style=social)|⭐️⭐️ |\n|2024.08| 🔥[**TFM-PTQ**] Temporal Feature Matters: A Framework for Diffusion Model Quantization(@SenseTime)|[[pdf]](https://arxiv.org/pdf/2407.19547)  |⚠️|⭐️⭐️ |\n|2024.08| 🔥[**Diffusion-RWKV**] Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models(@Zhengcong Fei)|[[pdf]](https://arxiv.org/pdf/2404.04478) | [[Diffusion-RWKV]](https://github.com/feizc/Diffusion-RWKV) ![](https://img.shields.io/github/stars/feizc/Diffusion-RWKV.svg?style=social)|⭐️⭐️ |\n|2024.09| 🔥[**LinFusion**] LINFUSION: 1 GPU, 1 MINUTE, 16K IMAGE(@NUS)|[[pdf]](https://arxiv.org/pdf/2409.02097) | [[LinFusion]](https://github.com/Huage001/LinFusion) ![](https://img.shields.io/github/stars/Huage001/LinFusion.svg?style=social)|⭐️⭐️ |\n|2024.11| 🔥🔥[**SVDQuant**] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models|[[pdf]](https://arxiv.org/pdf/2411.05007) | [[nunchaku]](https://github.com/mit-han-lab/nunchaku) ![](https://img.shields.io/github/stars/mit-han-lab/nunchaku.svg?style=social)|⭐️⭐️ |\n\n## 📙 Attention\n\u003cdiv id=\"Attention\"\u003e\u003c/div\u003e  \n\n|Date|Title|Paper|Code|Recom|\n|:---:|:---:|:---:|:---:|:---:| \n|2024.10|🔥🔥[**SageAttention**] SAGEATTENTION: ACCURATE 8-BIT ATTENTION FOR PLUG-AND-PLAY INFERENCE ACCELERATION(@thu-ml)|[[pdf]](https://arxiv.org/pdf/2410.02367)|[[SageAttention]](https://github.com/thu-ml/SageAttention) ![](https://img.shields.io/github/stars/thu-ml/SageAttention) | ⭐️⭐️ |\n|2024.11|🔥🔥[**SageAttention-2**] SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization(@thu-ml)|[[pdf]](https://arxiv.org/pdf/2411.10958)|[[SageAttention]](https://github.com/thu-ml/SageAttention) ![](https://img.shields.io/github/stars/thu-ml/SageAttention) | ⭐️⭐️ |\n|2025.03|🔥🔥[**SpargeAttention**] SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference(@thu-ml)|[[pdf]](https://arxiv.org/pdf/2502.18137)|[[SpargeAttn]](https://github.com/thu-ml/SpargeAttn) ![](https://img.shields.io/github/stars/thu-ml/SpargeAttn) | ⭐️⭐️ |\n|2025.05|🔥🔥[**SageAttention-3**] SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-bit Training(@thu-ml)|[[pdf]](https://arxiv.org/pdf/2505.11594)|[[SageAttention]](https://github.com/thu-ml/SageAttention) ![](https://img.shields.io/github/stars/thu-ml/SageAttention) | ⭐️⭐️ |\n|2025.05| 🔥🔥[**DraftAttention**] DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance(@Northeastern University) | [[pdf]](https://arxiv.org/pdf/2505.14708)|[[draft-attention]](https://github.com/shawnricecake/draft-attention) ![](https://img.shields.io/github/stars/shawnricecake/draft-attention) | ⭐️⭐️ |\n\n\n## ©️License  \n\nGNU General Public License v3.0  \n\n## 🎉Contribute  \n\nWelcome to star \u0026 submit a PR to this repo! \n\n\u003cdiv align='center'\u003e\n\u003ca href=\"https://star-history.com/#xlite-dev/Awesome-DiT-Inference\u0026Date\"\u003e\n  \u003cpicture align='center'\u003e\n    \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://api.star-history.com/svg?repos=xlite-dev/Awesome-DiT-Inference\u0026type=Date\u0026theme=dark\" /\u003e\n    \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"https://api.star-history.com/svg?repos=xlite-dev/Awesome-DiT-Inference\u0026type=Date\" /\u003e\n    \u003cimg width=\"350\" height=\"250\" alt=\"Star History Chart\" src=\"https://api.star-history.com/svg?repos=xlite-dev/Awesome-DiT-Inference\u0026type=Date\" /\u003e\n  \u003c/picture\u003e\n\u003c/a\u003e\n\u003c/div\u003e\n","projects_url":"https://awesome.ecosyste.ms/api/v1/lists/xlite-dev%2Fawesome-dit-inference/projects"}