https://github.com/Valkyrja3607/MaskDiffusion

Code for ''MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation''

Host: GitHub
URL: https://github.com/Valkyrja3607/MaskDiffusion
Owner: Valkyrja3607
Created: 2024-03-15T12:38:20.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-03-23T15:07:12.000Z (over 1 year ago)
Last Synced: 2024-08-01T18:38:33.827Z (about 1 year ago)
Language: Python
Homepage: https://valkyrja3607.github.io/MaskDiffusion/
Size: 4.28 MB
Stars: 15
Watchers: 2
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md

# MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation
This repo is the official implementation of
["MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation"](https://arxiv.org/abs/2403.11194).

MaskDiffusion is an approach to open-vocabulary semantic segmentation that uses pre-trained, frozen
Stable Diffusion models. Unlike traditional semantic segmentation methods, it requires no additional training
or annotation. It handles open vocabularies well, including fine-grained categories and proper nouns, and it
improves substantially, both qualitatively and quantitatively, over comparable unsupervised segmentation
approaches, setting a new benchmark on datasets such as Potsdam and COCO-Stuff. By removing the need for
training and annotation, the approach aims to make semantic segmentation more accessible and versatile.

![Overview of MaskDiffusion](resources/overview.png)

Our results show successful segmentation of challenging concepts such as 'mirror' in (a) and rare concepts
such as 'astronaut' in (b). The model can also identify general classes, as shown in (c), which indicates
that its segmentation performance improves with more general classes. Proper nouns can be segmented as well,
as shown in (d) and (e).

![Open-vocabulary segmentation results](resources/open_vocabulary.png)

## Installation

Install with Docker.

```sh
make build
```

To install Docker in your environment, refer to [this repository](https://github.com/Valkyrja3607/docker-template).

## Dataset

Set the `DATASET_DIR` variable in the Makefile to the directory that contains your datasets.
This implementation uses Cityscapes as the default dataset.
The following directory structure is assumed.

```sh
── datasets
├── cityscapes
├── cocostuff
└── VOCdevkit
```
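
Before building or running, it can help to confirm that the dataset directory matches the layout above. The following is a minimal sketch, not part of this repository: the helper name `missing_dataset_dirs` is hypothetical, and it only checks that the three sub-directories shown above exist. Whether all three are required for a given run is not stated here, so treat a reported gap as informational.

```python
from pathlib import Path

# Sub-directory names taken from the layout shown above.
EXPECTED_SUBDIRS = ("cityscapes", "cocostuff", "VOCdevkit")


def missing_dataset_dirs(dataset_dir):
    """Return the expected sub-directories that are absent from dataset_dir.

    Hypothetical helper for illustration; it is not part of MaskDiffusion.
    """
    root = Path(dataset_dir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    import sys

    # Pass the same path that DATASET_DIR is set to in the Makefile.
    dataset_dir = sys.argv[1] if len(sys.argv) > 1 else "datasets"
    missing = missing_dataset_dirs(dataset_dir)
    if missing:
        print(f"Missing under {dataset_dir}: {', '.join(missing)}")
    else:
        print(f"All expected dataset directories found under {dataset_dir}")
```

Saved under a file name of your choice, the script takes the dataset path as its only argument and prints which of the expected directories are missing.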

## Run

To run MaskDiffusion, use:

```sh
make run
```

To enter the Docker container, use:

```sh
make bash
```

## Result

Results are saved under the `outputs` directory.
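
The format and naming of the result files are not described here, so the sketch below makes no assumption about them. It simply lists every file found under the output directory, newest first, which is a quick way to see what a run produced. The function name `list_outputs` is hypothetical, and the default path `outputs` assumes the script is run from the repository root.

```python
from pathlib import Path


def list_outputs(output_dir="outputs"):
    """Return all files under output_dir, newest first.

    Hypothetical helper for illustration; returns an empty list
    if the directory does not exist.
    """
    root = Path(output_dir)
    if not root.is_dir():
        return []
    files = [p for p in root.rglob("*") if p.is_file()]
    # Sort by modification time so the most recent results come first.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in list_outputs():
        print(path)
```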
