{"id":19382845,"url":"https://github.com/locuslab/ect","last_synced_at":"2025-10-24T06:38:19.507Z","repository":{"id":230082418,"uuid":"778103634","full_name":"locuslab/ect","owner":"locuslab","description":"Consistency Models Made Easy","archived":false,"fork":false,"pushed_at":"2024-10-13T23:51:22.000Z","size":863,"stargazers_count":277,"open_issues_count":7,"forks_count":12,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-04-04T01:14:00.196Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/locuslab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-27T04:42:04.000Z","updated_at":"2025-04-01T02:34:32.000Z","dependencies_parsed_at":"2024-11-10T09:25:30.501Z","dependency_job_id":null,"html_url":"https://github.com/locuslab/ect","commit_stats":null,"previous_names":["locuslab/ect"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/locuslab%2Fect","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/locuslab%2Fect/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/locuslab%2Fect/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/locuslab%2Fect/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/locuslab","download_url":"https://codeload.github.com/locuslab/ect/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"
github","repositories_count":248633818,"owners_count":21136919,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T09:23:31.160Z","updated_at":"2025-10-24T06:38:14.474Z","avatar_url":"https://github.com/locuslab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ECT: Consistency Models Made Easy\n\nPyTorch implementation of [Easy Consistency Tuning (ECT)](https://www.notion.so/gsunshine/Consistency-Models-Made-Easy-954205c0b4a24c009f78719f43b419cc).\n\nECT unlocks state-of-the-art (SoTA) few-step generative abilities through a simple yet principled approach. \nWith minimal tuning costs, ECT demonstrates promising early results and scales with training FLOPs and model sizes.\n\nTry your own [Consistency Models](https://arxiv.org/abs/2303.01469)! You only need to fine-tune a bit. :D\n\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./assets/learning_scheme.jpg\" width=\"1000\" alt=\"Comparison of Learning Schemes\"\u003e\n\u003c/div\u003e\n\n## Introduction\n\nThis repository is organized in a multi-branch structure, with each branch offering a minimal implementation for a specific purpose. \nThe current branches support the following training protocols:\n\n- `main`: ECT on CIFAR-10. Best for understanding CMs and fast prototyping.\n- `amp`: Mixed Precision training with GradScaler on CIFAR-10.\n- `imgnet`: ECT on ImageNet 64x64.\n\n## ⭐ Update ⭐\n\nBaking more in the oven. 🙃 \n\n- **2024.10.12** - Add ECT code for ImageNet 64x64. 
Switch to the `imgnet` [branch](https://github.com/locuslab/ect/tree/imgnet): `git checkout imgnet`.\n- **2024.09.23** - Add GradScaler for Mixed Precision Training. To use mixed precision with GradScaler, switch to the `amp` [branch](https://github.com/locuslab/ect/tree/amp): `git checkout amp`.\n- **2024.04.27** - Upgrade environment to PyTorch 2.3.0.\n- **2024.04.12** - ECMs can now surpass SoTA GANs using 1 model step and SoTA Diffusion Models using 2 model steps on CIFAR10. Checkpoints available.\n\n## Environment\n\nYou can run the following command to set up the Python environment through `conda`. \nPyTorch 2.3.0 and Python 3.9.18 will be installed.\n\n```bash\nconda env create -f env.yml\n```\n\n## Datasets\n\nPrepare the dataset in EDM's format. See the reference [here](https://github.com/NVlabs/edm?tab=readme-ov-file#preparing-datasets).\n\n## Training\n\nRun the following command to tune your SoTA 2-step ECM and match Consistency Distillation (CD) within 1 A100 GPU hour. \n\n```bash\nbash run_ecm_1hour.sh 1 \u003cPORT\u003e --desc bs128.1hour\n```\n\nRun the following command to run ECT at batch size 128 for 200k iterations. Using 2 or 4 GPUs (NGPUs=2/4) is recommended. \n\n```bash\nbash run_ecm.sh \u003cNGPUs\u003e \u003cPORT\u003e --desc bs128.200k\n```\n\nReplace NGPUs and PORT with the number of GPUs used for training and the port number for DDP sync.\n\n### Half Precision Training\n\nIn the `amp` branch, we enable fp16 training with AMP GradScaler for more stable training dynamics.\nTo enable fp16 and GradScaler, add the following arguments to your script:\n\n```bash\nbash run_ecm_1hour.sh 1 \u003cPORT\u003e --desc bs128.1hour --fp16=True --enable_amp=True\n```\n\nFor more information, please refer to this [PR](https://github.com/locuslab/ect/pull/13). \nFull support for Automatic Mixed Precision (AMP) will be added later.\n\n## Evaluation\n\nRun the following command to calculate the FID of a pretrained checkpoint. 
\n\n```bash\nbash eval_ecm.sh \u003cNGPUs\u003e \u003cPORT\u003e --resume \u003cCKPT_PATH\u003e \n```\n\n## Generative Performance\n\n### FID Evaluation\n\n\nWe refer to models trained with ECT as ECMs and compare their unconditional image generation capabilities with SoTA generative models on the CIFAR10 dataset, including popular diffusion models with advanced samplers, diffusion distillation methods, and consistency models.\n\n| Method | FID | NFE | Model  | Params | Batch Size | Schedule |\n| :----  | :-- | :-- |:----   | :----- | :--------- | :------- |\n| Score SDE | 2.38 | 2000 | NCSN++ | 56.4M | 128 | ~1600k | \n| Score SDE-deep | 2.20 | 2000 | NCSN++ (2 $\\times$ depth) | \u003e 100M | 128 | ~1600k |\n| EDM                | 2.01 | 35 | DDPM++ | 56.4M | 512 | 400k |\n| PD                 | 8.34 | 1  | DDPM++ | 56.4M | 512 | 800k | \n| Diff-Instruct      | 4.53 | 1  | DDPM++ | 56.4M | 512 | 800k | \n| CD (LPIPS)         | 3.55 | 1  | NCSN++ | 56.4M | 512 | 800k | \n| CD (LPIPS)         | 2.93 | 2  | NCSN++ | 56.4M | 512 | 800k | \n| iCT-deep           | 2.51 | 1  | NCSN++ (2 $\\times$ depth) | \u003e 100M | 1024 | 400k | \n| iCT-deep           | 2.24 | 2  | NCSN++ (2 $\\times$ depth) | \u003e 100M | 1024 | 400k | \n| ECM (100k)         | 4.54 | 1  | DDPM++ | 55.7M | 128 | 100k |\n| ECM (200k)         | 3.86 | 1  | DDPM++ | 55.7M | 128 | 200k |\n| ECM (400k)         | 3.60 | 1  | DDPM++ | 55.7M | 128 | 400k |\n| ECM (100k)         | 2.20 | 2  | DDPM++ | 55.7M | 128 | 100k | \n| ECM (200k)         | 2.15 | 2  | DDPM++ | 55.7M | 128 | 200k | \n| ECM (400k)         | 2.11 | 2  | DDPM++ | 55.7M | 128 | 400k | \n\n### $\\mathrm{FD}_{\\text{DINOv2}}$ Evaluation\n\nSince DINOv2 can produce evaluations better aligned with human vision, we evaluate image fidelity using the Fréchet Distance in the latent space of the SoTA open-source representation model DINOv2, denoted as \n$\\mathrm{FD}_{\\text{DINOv2}}$.\n\n---\n\nUsing 
[dgm-eval](https://github.com/layer6ai-labs/dgm-eval/tree/master), we compare $\\mathrm{FD}_{\\text{DINOv2}}$ against SoTA Diffusion Models and GANs.\n\n| Method |  $\\mathrm{FD}_{\\text{DINOv2}}$  | NFE | \n| :----  |  :-- | :-- |\n| [EDM](https://github.com/NVlabs/edm)                                        | 145.20 | 35  |\n| [StyleGAN-XL](https://github.com/autonomousvision/stylegan-xl/tree/main)    | 204.60 | 1   |\n| ECM                                                                         | 198.51 | 1   |\n| ECM                                                                         | 128.63 | 2   |\n\nWithout combining with other generative mechanisms like GANs or diffusion distillation like Score Distillation, ECT is capable of generating high-quality samples much faster than SoTA diffusion models and much better than SoTA Diffusion Models and GANs.\n\n## Checkpoints\n\n- CIFAR10 $\\mathrm{FD}_{\\text{DINOv2}}$ [checkpoint](https://drive.google.com/file/d/1WN_eLTrcl-vB7fMc1HADpacgcO4SNJ_1/view?usp=sharing).\n\n## Contact\n\nFeel free to drop me an email at zhengyanggeng@gmail.com if you have additional questions or are interested in collaboration. You can find me on [Twitter](https://twitter.com/ZhengyangGeng) or [WeChat](https://github.com/Gsunshine/Enjoy-Hamburger/blob/main/assets/WeChat.jpg).\n\n## Citation\n\n```bibtex\n@article{ect,\n  title={Consistency Models Made Easy},\n  author={Geng, Zhengyang and Pokle, Ashwini and Luo, William and Lin, Justin and Kolter, J Zico},\n  journal={arXiv preprint arXiv:2406.14548},\n  year={2024}\n}\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flocuslab%2Fect","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flocuslab%2Fect","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flocuslab%2Fect/lists"}