https://github.com/sacmehta/ESPNet

ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
convolutional-neural-networks edge-devices real-time semantic-segmentation

Host: GitHub
URL: https://github.com/sacmehta/ESPNet
Owner: sacmehta
License: mit
Created: 2018-02-18T00:09:29.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2023-06-30T08:57:36.000Z (about 2 years ago)
Last Synced: 2024-05-13T22:15:01.128Z (about 1 year ago)
Topics: convolutional-neural-networks, edge-devices, real-time, semantic-segmentation
Language: Python
Homepage: https://sacmehta.github.io/ESPNet/
Size: 31.2 MB
Stars: 534
Watchers: 14
Forks: 111
Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-AutoML-and-Lightweight-Models - sacmehta/ESPNet

README

# ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation

This repository contains the source code of our paper, [ESPNet](https://arxiv.org/abs/1803.06815) (accepted for publication in [ECCV'18](http://eccv2018.org/)).

## Sample results

Check our [project page](https://sacmehta.github.io/ESPNet/) for more qualitative results (videos).

Click the sample image below to view the segmentation results on YouTube.







## Structure of this repository

This repository is organized as:

* [train](/train/) This directory contains the source code for training the ESPNet-C and ESPNet models.

* [test](/test/) This directory contains the source code for evaluating our model on RGB images.

* [pretrained](/pretrained/) This directory contains the models pre-trained on the Cityscapes dataset.

  * [encoder](/pretrained/encoder/) This directory contains the pretrained **ESPNet-C** models.

  * [decoder](/pretrained/decoder/) This directory contains the pretrained **ESPNet** models.

## Performance on the Cityscapes dataset

Our model ESPNet achieves a class-wise mIOU of **60.336** and a category-wise mIOU of **82.178** on the Cityscapes test set and runs at:

* 112 fps on the NVIDIA TitanX (30 fps faster than [ENet](https://arxiv.org/abs/1606.02147))

* 9 fps on the TX2

With the same number of parameters as [ENet](https://arxiv.org/abs/1606.02147), our model is **2%** more accurate.

## Performance on the CamVid dataset

Our model achieves an mIOU of 55.64 on the CamVid test set. We used the dataset splits (train/val/test) provided [here](https://github.com/alexgkendall/SegNet-Tutorial) and trained the models at a resolution of 480x360. For a comparison with other models, see the [SegNet paper](https://ieeexplore.ieee.org/document/7803544/).

Note: We did not train on the 3.5K dataset that was used in the SegNet paper.

| Model | mIOU | Class avg. |
| --- | --- | --- |
| ENet | 51.3 | 68.3 |
| SegNet | 55.6 | 65.2 |
| ESPNet | 55.64 | 68.30 |
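
For readers unfamiliar with the metric, the following is a minimal sketch of how a mean intersection-over-union (mIOU) score is commonly computed from a confusion matrix. It is a generic illustration, not the official Cityscapes or CamVid evaluation code, and it is not taken from this repository. The helper names `confusion_matrix` and `mean_iou` and the ignore value 255 are assumptions made for the example.

```python
import numpy as np

def confusion_matrix(pred, target, n_classes, ignore_label=255):
    """Count pixels per (ground truth, prediction) pair; rows are ground truth."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    valid = target != ignore_label  # skip pixels marked as undefined (assumed value)
    idx = n_classes * target[valid].astype(np.int64) + pred[valid].astype(np.int64)
    return np.bincount(idx, minlength=n_classes ** 2).reshape(n_classes, n_classes)

def mean_iou(conf):
    """Average the per-class IoU over the classes that occur at least once."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0
    return float((tp[present] / union[present]).mean())

# Toy 2x3 label map with one ignored pixel (255).
target = np.array([[0, 0, 1], [1, 2, 255]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
conf = confusion_matrix(pred, target, n_classes=3)
print(mean_iou(conf))  # per-class IoU is 1/2, 2/3 and 1, so the mean is about 0.722
```

In practice the confusion matrix would be accumulated over all test images before the per-class IoU values are averaged.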

## Prerequisites

To run this code, you need the following libraries:

* [OpenCV](https://opencv.org/) - We tested our code with version > 3.0.

* [PyTorch](http://pytorch.org/) - We tested our code with v0.3.0.

* Python - We tested our code with Python v3. If you are using Python v2, please feel free to make the necessary changes to the code.

We recommend using [Anaconda](https://conda.io/docs/user-guide/install/linux.html). We have tested our code on Ubuntu 16.04.
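
To compare a local environment against the versions listed above, a small check along the following lines may help. This is a sketch that is not part of the repository. The helper `installed_version` is hypothetical, and the distribution names `opencv-python` and `torch` are the usual pip names, assumed here; packages installed through conda may be registered under different names.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

print("Python", sys.version.split()[0])
# Assumed pip distribution names; adjust them for your package manager.
for dist_name in ("opencv-python", "torch"):
    print(dist_name, installed_version(dist_name) or "not found")
```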

## Citation

If ESPNet is useful for your research, then please cite our paper.

```

@inproceedings{mehta2018espnet,
  title={ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation},
  author={Sachin Mehta and Mohammad Rastegari and Anat Caspi and Linda Shapiro and Hannaneh Hajishirzi},
  booktitle={ECCV},
  year={2018}
}

```

## FAQs

### Assertion error with class labels (t >= 0 && t < n_classes).

If you are getting an assertion error with class labels, then please check the number of class labels defined in the label images. You can do this as:

```

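import cv2
import numpy as np

# Placeholder path: replace it with the path to one of your label images.
label_path = "path/to/label.png"
labelImg = cv2.imread(label_path, 0)  # flag 0 loads the image as grayscale

unique_val_arr = np.unique(labelImg)
print(unique_val_arr)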

```

The values inside *unique_val_arr* should lie between 0 and the total number of classes in the dataset minus one, since the assertion requires `t >= 0 && t < n_classes`. If this is not the case, pre-process your label images. For example, if a label image contains the value 255, you can ignore those pixels by mapping them to an undefined or background class as:

```

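# undefined_class_id is a placeholder: set it to the ID of your undefined or background class.
labelImg[labelImg == 255] = undefined_class_id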

```
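
The two steps above can also be combined into a small check-and-remap routine. The sketch below is illustrative only and is not code from this repository: `check_labels` and `remap_labels` are hypothetical helpers, and the values `n_classes=20` and `undefined_class_id=19` are example assumptions that should be replaced with the values for your dataset.

```python
import numpy as np

def check_labels(label_img, n_classes):
    """Return the unique label values that fall outside [0, n_classes - 1]."""
    values = np.unique(label_img)
    return values[(values < 0) | (values >= n_classes)]

def remap_labels(label_img, n_classes, undefined_class_id):
    """Return a copy in which every out-of-range value becomes undefined_class_id."""
    out = label_img.copy()
    out[(out < 0) | (out >= n_classes)] = undefined_class_id
    return out

# Toy label image; 255 marks pixels without a valid class.
label_img = np.array([[0, 1, 255], [18, 19, 2]], dtype=np.uint8)
print(check_labels(label_img, n_classes=20))  # reports the offending value 255

# Example assumption: 20 classes, with ID 19 used as the undefined class.
fixed = remap_labels(label_img, n_classes=20, undefined_class_id=19)
print(check_labels(fixed, n_classes=20))  # empty array: all labels are now valid
```

Running a check like this over the whole label directory before training should surface the assertion problem early rather than in the middle of an epoch.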
