{"id":13737466,"url":"https://github.com/yandex-research/ddpm-segmentation","last_synced_at":"2026-04-11T03:57:48.100Z","repository":{"id":41166615,"uuid":"434793920","full_name":"yandex-research/ddpm-segmentation","owner":"yandex-research","description":"Label-Efficient Semantic Segmentation with Diffusion Models (ICLR'2022)","archived":false,"fork":false,"pushed_at":"2023-04-08T10:18:07.000Z","size":73,"stargazers_count":648,"open_issues_count":8,"forks_count":58,"subscribers_count":8,"default_branch":"master","last_synced_at":"2024-08-04T03:09:17.649Z","etag":null,"topics":["deep-learning","semantic-segmentation"],"latest_commit_sha":null,"homepage":"https://yandex-research.github.io/ddpm-segmentation/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yandex-research.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2021-12-04T03:08:19.000Z","updated_at":"2024-08-02T04:45:04.000Z","dependencies_parsed_at":"2024-01-12T03:35:29.209Z","dependency_job_id":"f4128d55-bfeb-4f24-bfd2-abada279b99b","html_url":"https://github.com/yandex-research/ddpm-segmentation","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yandex-research%2Fddpm-segmentation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yandex-research%2Fddpm-segmentation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yandex-research%2Fddpm-segmentation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/yandex-research%2Fddpm-segmentation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yandex-research","download_url":"https://codeload.github.com/yandex-research/ddpm-segmentation/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224737407,"owners_count":17361345,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","semantic-segmentation"],"created_at":"2024-08-03T03:01:48.946Z","updated_at":"2026-04-11T03:57:43.052Z","avatar_url":"https://github.com/yandex-research.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Label-Efficient Semantic Segmentation with Diffusion Models\n\n**ICLR'2022** [[Project page]](https://yandex-research.github.io/ddpm-segmentation/)\n\nOfficial implementation of the paper [Label-Efficient Semantic Segmentation with Diffusion Models](https://arxiv.org/pdf/2112.03126.pdf)\n\nThis code is based on [datasetGAN](https://github.com/nv-tlabs/datasetGAN_release) and [guided-diffusion](https://github.com/openai/guided-diffusion). \n\n**Note:** use **--recurse-submodules** when cloning.\n\n\u0026nbsp;\n## Overview\n\nThe paper investigates the representations learned by state-of-the-art DDPMs and shows that they capture high-level semantic information valuable for downstream vision tasks. 
We design a simple semantic segmentation approach that exploits these representations and outperforms the alternatives in the few-shot regime.\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg width=\"100%\" alt=\"DDPM-based Segmentation\" src=\"https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/figs/new_ddpm_seg_scheme.png\"\u003e\n\u003c/div\u003e\n\n\u0026nbsp;\n## Updates\n\n**3/9/2022:** \n\n1) Improved performance of DDPM-based segmentation by changing:\\\n   \u0026nbsp;\u0026nbsp;Diffusion steps: [50,150,250,350] --\u003e [50,150,250];\\\n   \u0026nbsp;\u0026nbsp;UNet blocks: [6,7,8,9] --\u003e [5,6,7,8,12];\n2) Trained a slightly better DDPM on FFHQ-256;\n3) Added [MAE](https://github.com/facebookresearch/mae) for comparison.\n\n\u0026nbsp;\n## Datasets\n\nThe evaluation is performed on 6 collected datasets with a few annotated images in the training set:\nBedroom-28, FFHQ-34, Cat-15, Horse-21, CelebA-19 and ADE-Bedroom-30. The number in each name is the number of semantic classes.\n\n[datasets.tar.gz](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/datasets.tar.gz) (~47Mb)\n\n\n\u0026nbsp;\n## DDPM\n\n### Pretrained DDPMs\n\nThe models trained on LSUN are adopted from [guided-diffusion](https://github.com/openai/guided-diffusion).\nThe FFHQ-256 model was trained by us using the same model parameters as for the LSUN models.\n\n*LSUN-Bedroom:* [lsun_bedroom.pt](https://openaipublic.blob.core.windows.net/diffusion/jul-2021/lsun_bedroom.pt)\\\n*FFHQ-256:* [ffhq.pt](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/ddpm_checkpoints/ffhq.pt) (Updated 3/8/2022)\\\n*LSUN-Cat:* [lsun_cat.pt](https://openaipublic.blob.core.windows.net/diffusion/jul-2021/lsun_cat.pt)\\\n*LSUN-Horse:* [lsun_horse.pt](https://openaipublic.blob.core.windows.net/diffusion/jul-2021/lsun_horse.pt)\n\n### Run \n\n1. Download the datasets:\\\n \u0026nbsp;\u0026nbsp;```bash datasets/download_datasets.sh```\n2. 
Download the DDPM checkpoint:\\\n \u0026nbsp;\u0026nbsp; ```bash checkpoints/ddpm/download_checkpoint.sh \u003ccheckpoint_name\u003e```\n3. Check paths in ```experiments/\u003cdataset_name\u003e/ddpm.json``` \n4. Run: ```bash scripts/ddpm/train_interpreter.sh \u003cdataset_name\u003e```\n   \n**Available checkpoint names:** lsun_bedroom, ffhq, lsun_cat, lsun_horse\\\n**Available dataset names:** bedroom_28, ffhq_34, cat_15, horse_21, celeba_19, ade_bedroom_30\n\n**Note:** ```train_interpreter.sh``` is RAM-intensive because it keeps all training pixel representations in memory. For example, it requires ~210Gb for 50 training images of size 256x256. (See [issue](https://github.com/nv-tlabs/datasetGAN_release/issues/34))\n\n**Pretrained pixel classifiers** and test predictions are [here](https://www.dropbox.com/s/kap229jvmhfwh7i/pixel_classifiers.tar?dl=0).\n\n### How to improve the performance\n\n* Tune which diffusion steps and UNet blocks to use for the particular task.\n\n\n\u0026nbsp;\n## DatasetDDPM\n\n\n### Synthetic datasets\n\nTo download DDPM-produced synthetic datasets (50000 samples, ~7Gb) (updated 3/8/2022):\\\n```bash synthetic-datasets/ddpm/download_synthetic_dataset.sh \u003cdataset_name\u003e```\n\n### Run | Option #1\n\n1. Download the synthetic dataset:\\\n\u0026nbsp;\u0026nbsp; ```bash synthetic-datasets/ddpm/download_synthetic_dataset.sh \u003cdataset_name\u003e```\n2. Check paths in ```experiments/\u003cdataset_name\u003e/datasetDDPM.json``` \n3. Run: ```bash scripts/datasetDDPM/train_deeplab.sh \u003cdataset_name\u003e``` \n\n### Run | Option #2\n\n1. Download the datasets:\\\n \u0026nbsp;\u0026nbsp; ```bash datasets/download_datasets.sh```\n2. Download the DDPM checkpoint:\\\n \u0026nbsp;\u0026nbsp; ```bash checkpoints/ddpm/download_checkpoint.sh \u003ccheckpoint_name\u003e```\n3. Check paths in ```experiments/\u003cdataset_name\u003e/datasetDDPM.json```\n4. 
Train an interpreter on a few DDPM-produced annotated samples:\\\n   \u0026nbsp;\u0026nbsp; ```bash scripts/datasetDDPM/train_interpreter.sh \u003cdataset_name\u003e```\n5. Generate a synthetic dataset:\\\n   \u0026nbsp;\u0026nbsp; ```bash scripts/datasetDDPM/generate_dataset.sh \u003cdataset_name\u003e```\\\n   \u0026nbsp;\u0026nbsp;\u0026nbsp; Please set the hyperparameters in this script to match the available resources.\\\n   \u0026nbsp;\u0026nbsp;\u0026nbsp; On 8xA100 80Gb, it takes about 12 hours to generate 10000 samples.   \n\n6. Run: ```bash scripts/datasetDDPM/train_deeplab.sh \u003cdataset_name\u003e```\\\n   \u0026nbsp;\u0026nbsp; Specify the path to the generated data; see the comments in the script.\n\n**Available checkpoint names:** lsun_bedroom, ffhq, lsun_cat, lsun_horse\\\n**Available dataset names:** bedroom_28, ffhq_34, cat_15, horse_21\n\n\u0026nbsp;\n## MAE\n\n### Pretrained MAEs\n\nWe pretrain MAE models using the [official implementation](https://github.com/facebookresearch/mae) on the LSUN and FFHQ-256 datasets:\n\n*LSUN-Bedroom:* [lsun_bedroom.pth](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/mae_checkpoints/lsun_bedroom.pth)\\\n*FFHQ-256:* [ffhq.pth](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/mae_checkpoints/ffhq.pth)\\\n*LSUN-Cat:* [lsun_cat.pth](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/mae_checkpoints/lsun_cat.pth)\\\n*LSUN-Horse:* [lsun_horse.pth](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/mae_checkpoints/lsun_horse.pth)\n\n**Training setups**: \n\n| Dataset | Backbone | epochs | batch-size | mask-ratio |\n|-------------------|-------------------|---------------------|--------------------|--------------------|\n| LSUN Bedroom | ViT-L-8 | 150 | 1024 | 0.75 |\n| LSUN Cat | ViT-L-8 | 200 | 1024 | 0.75 |\n| LSUN Horse | ViT-L-8 | 200 | 1024 | 0.75 |\n| FFHQ-256 | ViT-L-8 | 400 | 1024 | 0.75 |\n\n### Run \n\n1. 
Download the datasets:\\\n \u0026nbsp;\u0026nbsp; ```bash datasets/download_datasets.sh```\n2. Download the MAE checkpoint:\\\n \u0026nbsp;\u0026nbsp; ```bash checkpoints/mae/download_checkpoint.sh \u003ccheckpoint_name\u003e```\n3. Check paths in ```experiments/\u003cdataset_name\u003e/mae.json``` \n4. Run: ```bash scripts/mae/train_interpreter.sh \u003cdataset_name\u003e```\n   \n**Available checkpoint names:** lsun_bedroom, ffhq, lsun_cat, lsun_horse\\\n**Available dataset names:** bedroom_28, ffhq_34, cat_15, horse_21, celeba_19, ade_bedroom_30\n\n\u0026nbsp;\n## SwAV\n\n### Pretrained SwAVs\n\nWe pretrain SwAV models using the [official implementation](https://github.com/facebookresearch/swav) on the LSUN and FFHQ-256 datasets:\n\n| LSUN-Bedroom | FFHQ-256 | LSUN-Cat | LSUN-Horse |\n|-------------------|-------------------|---------------------|--------------------|\n| [SwAV](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/swav_checkpoints/lsun_bedroom.pth) | [SwAV](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/swav_checkpoints/ffhq.pth) | [SwAV](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/swav_checkpoints/lsun_cat.pth) | [SwAV](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/swav_checkpoints/lsun_horse.pth) | \n| [SwAVw2](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/swav_w2_checkpoints/lsun_bedroom.pth) | [SwAVw2](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/swav_w2_checkpoints/ffhq.pth) | [SwAVw2](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/swav_w2_checkpoints/lsun_cat.pth) | [SwAVw2](https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/models/swav_w2_checkpoints/lsun_horse.pth) | \n\n**Training setups**: \n\n| Dataset | Backbone | epochs | batch-size | multi-crop | num-prototypes 
|\n|-------------------|-------------------|---------------------|--------------------|--------------------|--------------------|\n| LSUN | RN50 | 200 | 1792 | 2x256 + 6x108 | 1000 |\n| FFHQ-256 | RN50 | 400 | 2048 | 2x224 + 6x96 | 200 |\n| LSUN | RN50w2 | 200 | 1920 | 2x256 + 4x108 | 1000 |\n| FFHQ-256 | RN50w2 | 400 | 2048 | 2x224 + 4x96 | 200 |\n\n### Run \n\n1. Download the datasets:\\\n \u0026nbsp;\u0026nbsp; ```bash datasets/download_datasets.sh```\n2. Download the SwAV checkpoint:\\\n \u0026nbsp;\u0026nbsp; ```bash checkpoints/{swav|swav_w2}/download_checkpoint.sh \u003ccheckpoint_name\u003e```\n3. Check paths in ```experiments/\u003cdataset_name\u003e/{swav|swav_w2}.json``` \n4. Run: ```bash scripts/{swav|swav_w2}/train_interpreter.sh \u003cdataset_name\u003e```\n   \n**Available checkpoint names:** lsun_bedroom, ffhq, lsun_cat, lsun_horse\\\n**Available dataset names:** bedroom_28, ffhq_34, cat_15, horse_21, celeba_19, ade_bedroom_30\n\n\n\u0026nbsp;\n## DatasetGAN\n\nIn contrast to the [official implementation](https://github.com/nv-tlabs/datasetGAN_release), more recent StyleGAN2(-ADA) models are used.\n\n### Synthetic datasets \n\nTo download GAN-produced synthetic datasets (50000 samples): \n\n```bash synthetic-datasets/gan/download_synthetic_dataset.sh \u003cdataset_name\u003e```\n\n### Run\n\nSince we almost fully adopt the [official implementation](https://github.com/nv-tlabs/datasetGAN_release), we don't provide our reimplementation here. \nHowever, one can still reproduce our results:\n\n1. Download the synthetic dataset:\\\n \u0026nbsp;\u0026nbsp;```bash synthetic-datasets/gan/download_synthetic_dataset.sh \u003cdataset_name\u003e```\n2. Change paths in ```experiments/\u003cdataset_name\u003e/datasetDDPM.json``` \n3. 
Change paths and run: ```bash scripts/datasetDDPM/train_deeplab.sh \u003cdataset_name\u003e```\n\n**Available dataset names:** bedroom_28, ffhq_34, cat_15, horse_21\n\n\n\u0026nbsp;\n## Results\n\n* Performance in terms of mean IoU:\n\n| Method       | Bedroom-28 | FFHQ-34 \t| Cat-15 | Horse-21  | CelebA-19 | ADE-Bedroom-30 |\n|:------------- |:-------------- |:--------------- |:--------------- |:--------------- |:--------------- |:--------------- |\n| ALAE   \t| 20.0 ± 1.0     |  48.1 ± 1.3  \t| -- \t| --          \t| 49.7 ± 0.7 | 15.0 ± 0.5      |\n| VDVAE  \t| --         \t| 57.3 ± 1.1    | -- | --          \t| 54.1 ± 1.0 | --          \t|\n| GAN Inversion  | 13.9 ± 0.6 \t| 51.7 ± 0.8 \t| 21.4 ± 1.7 \t| 17.7 ± 0.4 | 51.5 ± 2.3 | 11.1 ± 0.2 |\n| GAN Encoder  | 22.4 ± 1.6 \t| 53.9 ± 1.3 \t| 32.0 ± 1.8 \t| 26.7 ± 0.7 | 53.9 ± 0.8 | 15.7 ± 0.3 |\n| SwAV      \t | 41.0 ± 2.3 \t| 54.7 ± 1.4 \t| 44.1 ± 2.1 \t| 51.7 ± 0.5 | 53.2 ± 1.0 | 30.3 ± 1.5 | \n| SwAVw2      \t | 42.4 ± 1.7 \t| 56.9 ± 1.3 \t| 45.1 ± 2.1 \t| 54.0 ± 0.9 | 52.4 ± 1.3 | 30.6 ± 1.0 |\n| MAE           | 45.0 ± 2.0  | **58.8 ± 1.1** | **52.4 ± 2.3** | 63.4 ± 1.4 | 57.8 ± 0.4 | 31.7 ± 1.8 |\n| DatasetGAN\t | 31.3 ± 2.7 \t| 57.0 ± 1.0 | 36.5 ± 2.3 \t| 45.4 ± 1.4 | --\t| --  |\n| DatasetDDPM  | 47.9 ± 2.9 |  56.0 ± 0.9    | 47.6 ± 1.5 \t| 60.8 ± 1.0  | --\t| --              |\n| **DDPM**      \t | **49.4 ± 1.9** | **59.1 ± 1.4** | **53.7 ± 3.3** | **65.0 ± 0.8** | **59.9 ± 1.0** | **34.6 ± 1.7** |\n\n\u0026nbsp;\n* Examples of segmentation masks predicted by the DDPM-based method:\n\n\u003cdiv\u003e\n  \u003cimg width=\"100%\" alt=\"DDPM-based Segmentation\" src=\"https://storage.yandexcloud.net/yandex-research/ddpm-segmentation/figs/examples.png\"\u003e\n\u003c/div\u003e\n\n\n\u0026nbsp;\n## Cite\n\n```\n@misc{baranchuk2021labelefficient,\n      title={Label-Efficient Semantic Segmentation with Diffusion Models}, \n      author={Dmitry Baranchuk and Ivan Rubachev and Andrey Voynov and Valentin Khrulkov 
and Artem Babenko},\n      year={2021},\n      eprint={2112.03126},\n      archivePrefix={arXiv},\n      primaryClass={cs.CV}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyandex-research%2Fddpm-segmentation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyandex-research%2Fddpm-segmentation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyandex-research%2Fddpm-segmentation/lists"}