{"id":13737965,"url":"https://github.com/tfzhou/ContrastiveSeg","last_synced_at":"2025-05-08T15:32:08.472Z","repository":{"id":37815606,"uuid":"333190855","full_name":"tfzhou/ContrastiveSeg","owner":"tfzhou","description":"ICCV2021 (Oral) - Exploring Cross-Image Pixel Contrast for Semantic Segmentation","archived":false,"fork":false,"pushed_at":"2022-10-13T19:22:33.000Z","size":1814,"stargazers_count":667,"open_issues_count":29,"forks_count":88,"subscribers_count":20,"default_branch":"main","last_synced_at":"2024-11-15T06:32:56.993Z","etag":null,"topics":["cityscapes","contrastive-learning","hard-example-mining","pascal-context","pixel-contrast","semantic-segmentation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tfzhou.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-01-26T19:17:37.000Z","updated_at":"2024-11-14T08:31:34.000Z","dependencies_parsed_at":"2022-07-14T21:46:56.945Z","dependency_job_id":null,"html_url":"https://github.com/tfzhou/ContrastiveSeg","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tfzhou%2FContrastiveSeg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tfzhou%2FContrastiveSeg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tfzhou%2FContrastiveSeg/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tfzhou%2FContrastiveSeg/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tfzhou","download_url":"h
ttps://codeload.github.com/tfzhou/ContrastiveSeg/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253096290,"owners_count":21853571,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cityscapes","contrastive-learning","hard-example-mining","pascal-context","pixel-contrast","semantic-segmentation"],"created_at":"2024-08-03T03:02:07.362Z","updated_at":"2025-05-08T15:32:07.376Z","avatar_url":"https://github.com/tfzhou.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Exploring Cross-Image Pixel Contrast for Semantic Segmentation\n\n![](figures/framework.png)\n\n\u003e [**Exploring Cross-Image Pixel Contrast for Semantic Segmentation**](https://arxiv.org/abs/2101.11939),            \n\u003e [Wenguan Wang](https://sites.google.com/view/wenguanwang/), [Tianfei Zhou](https://www.tfzhou.com/), [Fisher Yu](https://www.yf.io/), [Jifeng Dai](https://jifengdai.org/), [Ender Konukoglu](https://scholar.google.com/citations?user=OeEMrhQAAAAJ\u0026hl=en) and [Luc Van Gool](https://scholar.google.com/citations?user=TwMib_QAAAAJ\u0026hl=en) \u003cbr\u003e\n\u003e *ICCV 2021 (Oral) ([arXiv 2101.11939](https://arxiv.org/abs/2101.11939))*\n\n## News\n\n* [2022-10-13] Our work [GMMSeg: Gaussian Mixture based Generative Semantic Segmentation Models](https://github.com/leonnnop/GMMSeg) has been accepted to NeurIPS'22.\n* [2022-03-20] Our work [Rethinking Semantic Segmentation: A Prototype View](https://github.com/tfzhou/ProtoSeg) has been accepted to CVPR'22 as an **Oral paper**. 
\n* [2021-07-28] ContrastiveSeg has been accepted to ICCV'21 as an Oral.\n* [2021-07-28] Updated the memory code.\n* [2021-07-01] The codebase has been migrated from PyTorch 0.4.1 to PyTorch 1.7.1, which makes it easier to use.\n\n## Abstract\n\nCurrent semantic segmentation methods focus only on\nmining “local” context, i.e., dependencies between pixels\nwithin individual images, by context-aggregation modules\n(e.g., dilated convolution, neural attention) or structure-aware optimization criteria (e.g., IoU-like loss). However, they ignore “global” context of the training data, i.e.,\nrich semantic relations between pixels across different images. Inspired by the recent advance in unsupervised contrastive representation learning, we propose a pixel-wise\ncontrastive framework for semantic segmentation in the\nfully supervised setting. The core idea is to enforce pixel\nembeddings belonging to the same semantic class to be more\nsimilar than embeddings from different classes. It raises a\npixel-wise metric learning paradigm for semantic segmentation, by explicitly exploring the structures of labeled pixels, which have long been ignored in the field. Our method can be\neffortlessly incorporated into existing segmentation frameworks without extra overhead during testing.\n\nWe experimentally show that, with famous segmentation models (i.e.,\nDeepLabV3, HRNet, OCR) and backbones (i.e., ResNet, HRNet), our method brings consistent performance improvements across diverse datasets (i.e., Cityscapes, PASCAL-Context, COCO-Stuff).\n\n## Installation\n\nThis implementation is built on [openseg.pytorch](https://github.com/openseg-group/openseg.pytorch). 
Many thanks to the authors for the efforts.\n\nPlease follow the [Getting Started](https://github.com/openseg-group/openseg.pytorch/blob/master/GETTING_STARTED.md) for installation and dataset preparation.\n\n## Performance\n\n### Cityscapes Dataset\n\n| Backbone  | Model      | Train Set | Val Set | Iterations | Batch Size | Contrast Loss | Memory | mIoU  | Log | CKPT |Script |\n| --------- | ---------- | --------- | ------- | ---------- | ---------- | ------------- | ------ | ----- | --- | ----   | ----   |\n| ResNet-101| DeepLab-V3 |train     | val     | 40000      | 8          | N             | N      | 72.75 | [log](https://github.com/tfzhou/pretrained_weights/releases/download/v0.1/deeplab_v3_deepbase_resnet101_dilated8_deeplab_v3.log) | [ckpt](https://github.com/tfzhou/pretrained_weights/releases/download/v0.1/deeplab_v3_deepbase_resnet101_dilated8_deeplab_v3_max_performance.pth) |```scripts/cityscapes/deeplab/run_r_101_d_8_deeplabv3_train.sh```|\n| ResNet-101| DeepLab-V3 |train     | val     | 40000      | 8          | Y             | N      | 77.67 | [log](https://github.com/tfzhou/pretrained_weights/releases/download/v0.1/deeplab_v3_contrast_deepbase_resnet101_dilated8_deeplab_v3_contrast.log) | [ckpt](https://github.com/tfzhou/pretrained_weights/releases/download/v0.1/deeplab_v3_contrast_deepbase_resnet101_dilated8_deeplab_v3_contrast_max_performance.pth) |```scripts/cityscapes/deeplab/run_r_101_d_8_deeplabv3_contrast_train.sh```|\n| HRNet-W48 | HRNet-W48  |train     | val     | 40000      | 8          | N             | N      | 79.27 | [log](https://github.com/tfzhou/pretrained_weights/releases/download/v0.1/hrnet_w48_lr1x_hrnet_ce.log) | [ckpt](https://github.com/tfzhou/pretrained_weights/releases/download/v0.1/hrnet_w48_lr1x_hrnet_ce_max_performance.pth) |```scripts/cityscapes/hrnet/run_h_48_d_4.sh```|\n| HRNet-W48 | HRNet-W48  |train     | val     | 40000      | 8          | Y             | N      | 80.18 | 
[log](https://github.com/tfzhou/pretrained_weights/releases/download/v0.1/hrnet_w48_contrast_lr1x_hrnet_contrast_t0.1.log) | [ckpt](https://github.com/tfzhou/pretrained_weights/releases/download/v0.1/hrnet_w48_contrast_lr1x_hrnet_contrast_t0.1_max_performance.pth) |```scripts/cityscapes/hrnet/run_h_48_d_4_contrast.sh```|\n\n_It seems that the DeepLab-V3 baseline does not produce the expected performance on the new codebase. I will tune this later._\n\n\n### Study of the temperature\n| Backbone  | Train Set | Val Set | Iterations | Batch Size | Temperature   | mIoU  |\n| --------- | --------- | ------- | ---------- | ---------- | ------------- | ----- |\n| HRNet-W48 | train     | val     | 40000      | 8          | 0.05          | 79.80 |\n| HRNet-W48 | train     | val     | 40000      | 8          | 0.07          | 79.59 |\n| HRNet-W48 | train     | val     | 40000      | 8          | 0.10          | **80.18** |\n| HRNet-W48 | train     | val     | 40000      | 8          | 0.20          | 80.01 |\n| HRNet-W48 | train     | val     | 40000      | 8          | 0.30          | 79.27 |\n| HRNet-W48 | train     | val     | 40000      | 8          | 0.40          | 79.40 |\n\n\n## t-SNE Visualization\n\n* Pixel-wise Cross-Entropy Loss\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"figures/tsne1.png\" width=\"400\"\u003e\n\u003c/p\u003e\n\n* Pixel-wise Contrastive Learning Objective \n  \n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"figures/tsne2.png\" width=\"400\"\u003e\n\u003c/p\u003e  \n\n## Citation\n```\n@inproceedings{Wang_2021_ICCV,\n    author    = {Wang, Wenguan and Zhou, Tianfei and Yu, Fisher and Dai, Jifeng and Konukoglu, Ender and Van Gool, Luc},\n    title     = {Exploring Cross-Image Pixel Contrast for Semantic Segmentation},\n    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},\n    year      = {2021},\n    pages     = 
{7303-7313}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftfzhou%2FContrastiveSeg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftfzhou%2FContrastiveSeg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftfzhou%2FContrastiveSeg/lists"}