Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
https://github.com/sacmehta/ESPNet
- Host: GitHub
- URL: https://github.com/sacmehta/ESPNet
- Owner: sacmehta
- License: MIT
- Created: 2018-02-18T00:09:29.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2023-06-30T08:57:36.000Z (over 1 year ago)
- Last Synced: 2024-05-13T22:15:01.128Z (7 months ago)
- Topics: convolutional-neural-networks, edge-devices, real-time, semantic-segmentation
- Language: Python
- Homepage: https://sacmehta.github.io/ESPNet/
- Size: 31.2 MB
- Stars: 534
- Watchers: 14
- Forks: 111
- Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-AutoML-and-Lightweight-Models - sacmehta/ESPNet
README
# ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
This repository contains the source code of our paper, [ESPNet](https://arxiv.org/abs/1803.06815) (accepted for publication in [ECCV'18](http://eccv2018.org/)).
## Sample results
Check our [project page](https://sacmehta.github.io/ESPNet/) for more qualitative results (videos).
Click on the below sample image to view the segmentation results on YouTube.
## Structure of this repository
This repository is organized as:
* [train](/train/) This directory contains the source code for training the ESPNet-C and ESPNet models.
* [test](/test/) This directory contains the source code for evaluating our model on RGB Images.
* [pretrained](/pretrained/) This directory contains the models pre-trained on the Cityscapes dataset.
* [encoder](/pretrained/encoder/) This directory contains the pretrained **ESPNet-C** models
* [decoder](/pretrained/decoder/) This directory contains the pretrained **ESPNet** models

## Performance on the Cityscapes dataset
Our ESPNet model achieves a class-wise mIOU of **60.336** and a category-wise mIOU of **82.178** on the Cityscapes test set and runs at:
* 112 fps on the NVIDIA TitanX (30 fps faster than [ENet](https://arxiv.org/abs/1606.02147))
* 9 fps on the NVIDIA Jetson TX2
* With the same number of parameters as [ENet](https://arxiv.org/abs/1606.02147), our model is **2%** more accurate

## Performance on the CamVid dataset
Our model achieves an mIOU of 55.64 on the CamVid test set. We used the dataset splits (train/val/test) provided [here](https://github.com/alexgkendall/SegNet-Tutorial). We trained the models at a resolution of 480x360. For comparison with other models, see [SegNet paper](https://ieeexplore.ieee.org/document/7803544/).
Note: We did not use the 3.5K dataset for training which was used in the SegNet paper.
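For reference, class-wise mIOU is the mean over classes of per-class intersection-over-union, which can be read off a confusion matrix. A minimal sketch of that computation (the `class_iou` helper and the toy confusion matrix below are illustrative, not part of this repository or the reported results):

```python
import numpy as np

def class_iou(conf):
    """Per-class IOU from a confusion matrix, where conf[i, j] counts
    pixels of true class i predicted as class j."""
    tp = np.diag(conf).astype(float)
    # union = TP + FN + FP for each class
    union = conf.sum(axis=1) + conf.sum(axis=0) - tp
    return tp / np.maximum(union, 1)

# Toy 2-class confusion matrix (made-up numbers)
conf = np.array([[50, 2],
                 [3, 45]])
ious = class_iou(conf)
miou = ious.mean()
```

The mIOU numbers in the tables here were produced on the official test servers/splits, so a local computation like this only applies to data for which you have ground-truth labels.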
| Model | mIOU | Class avg. |
| -- | -- | -- |
| ENet | 51.3 | 68.3 |
| SegNet | 55.6 | 65.2 |
| ESPNet | 55.64 | 68.30 |

## Prerequisites
To run this code, you need the following libraries:
* [OpenCV](https://opencv.org/) - We tested our code with version > 3.0.
* [PyTorch](http://pytorch.org/) - We tested with v0.3.0
* Python - We tested our code with Python v3. If you are using Python v2, feel free to make the necessary changes to the code.

We recommend using [Anaconda](https://conda.io/docs/user-guide/install/linux.html). We have tested our code on Ubuntu 16.04.
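Since the code was only tested against specific versions, it can help to check your installed versions before running anything. A small hedged sketch of a dotted-version comparison (pure Python, so it works for both `cv2.__version__` and `torch.__version__` strings; the helper name is our own, not from this repository):

```python
def meets_min_version(version, minimum):
    """Numerically compare dotted version strings, e.g. "3.4.1" >= "3.0"."""
    as_parts = lambda v: [int(p) for p in v.split(".") if p.isdigit()]
    return as_parts(version) >= as_parts(minimum)

# e.g. OpenCV should be > 3.0; PyTorch was tested at v0.3.0
assert meets_min_version("3.4.1", "3.0")
assert not meets_min_version("0.2.0", "0.3.0")
```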
## Citation
If ESPNet is useful for your research, then please cite our paper.
```
@inproceedings{mehta2018espnet,
title={ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation},
author={Sachin Mehta, Mohammad Rastegari, Anat Caspi, Linda Shapiro, and Hannaneh Hajishirzi},
booktitle={ECCV},
year={2018}
}
```

## FAQs
### Assertion error with class labels (t >= 0 && t < n_classes).
If you are getting an assertion error with class labels, then please check the number of class labels defined in the label images. You can do this as:
```
import cv2
import numpy as np

# 'path/to/label_image.png' is a placeholder; point it at one of your label images
labelImg = cv2.imread('path/to/label_image.png', 0)  # 0 -> load as grayscale
unique_val_arr = np.unique(labelImg)
print(unique_val_arr)
```
The values inside *unique_val_arr* should be between 0 and the total number of classes in the dataset. If this is not the case, then pre-process your label images. For example, if the label image contains 255 as a value, you can ignore these pixels by mapping them to an undefined or background class as:
```
labelImg[labelImg == 255] = 0  # 0 here assumes class 0 is your background/undefined class
```
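If *unique_val_arr* also contains non-contiguous class ids, one way to compress them into the expected 0..n_classes-1 range is to build a lookup table from the unique values. A sketch (the array contents and the `lut` name are made up for illustration):

```python
import numpy as np

# Hypothetical label image with the ignore value 255 and
# non-contiguous class ids (5, 7)
labelImg = np.array([[0, 5, 255],
                     [5, 7, 0]], dtype=np.uint8)

labelImg[labelImg == 255] = 0                # fold ignore pixels into background
lut = {v: i for i, v in enumerate(np.unique(labelImg))}
labelImg = np.vectorize(lut.get)(labelImg)   # labels are now 0..n_classes-1
```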