{"id":13788540,"url":"https://github.com/sacmehta/ESPNet","last_synced_at":"2025-05-12T03:30:25.508Z","repository":{"id":43902080,"uuid":"121903519","full_name":"sacmehta/ESPNet","owner":"sacmehta","description":"ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation","archived":false,"fork":false,"pushed_at":"2023-06-30T08:57:36.000Z","size":32664,"stargazers_count":534,"open_issues_count":9,"forks_count":111,"subscribers_count":14,"default_branch":"master","last_synced_at":"2024-05-13T22:15:01.128Z","etag":null,"topics":["convolutional-neural-networks","edge-devices","real-time","semantic-segmentation"],"latest_commit_sha":null,"homepage":"https://sacmehta.github.io/ESPNet/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sacmehta.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-02-18T00:09:29.000Z","updated_at":"2024-04-24T18:10:42.000Z","dependencies_parsed_at":"2022-09-04T05:21:59.613Z","dependency_job_id":"2c00397f-e59f-412c-a813-a73c752962ec","html_url":"https://github.com/sacmehta/ESPNet","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sacmehta%2FESPNet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sacmehta%2FESPNet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sacmehta%2FESPNet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sacmehta%2FESPNet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sacm
ehta","download_url":"https://codeload.github.com/sacmehta/ESPNet/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253667883,"owners_count":21944933,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["convolutional-neural-networks","edge-devices","real-time","semantic-segmentation"],"created_at":"2024-08-03T21:00:49.741Z","updated_at":"2025-05-12T03:30:24.514Z","avatar_url":"https://github.com/sacmehta.png","language":"Python","funding_links":[],"categories":["2.) Lightweight Structures"],"sub_categories":["**[Papers]**"],"readme":"# ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation\n\nThis repository contains the source code of our paper, [ESPNet](https://arxiv.org/abs/1803.06815) (accepted for publication in [ECCV'18](http://eccv2018.org/)).\n\n## Sample results\n\nCheck our [project page](https://sacmehta.github.io/ESPNet/) for more qualitative results (videos).\n\nClick on the sample image below to view the segmentation results on YouTube.\n\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://www.youtube.com/watch?v=bixR11j4WiY\" target=\"_blank\"\u003e\u003cimg src=\"/sample_video/sample.png\"/\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\n## Structure of this repository\nThis repository is organized as:\n* [train](/train/) This directory contains the source code for training the ESPNet-C and ESPNet models.\n* [test](/test/) This directory contains the source code for evaluating our model on RGB images.\n* [pretrained](/pretrained/) This directory contains the pre-trained 
models on the CityScape dataset\n  * [encoder](/pretrained/encoder/) This directory contains the pretrained **ESPNet-C** models\n  * [decoder](/pretrained/decoder/) This directory contains the pretrained **ESPNet** models\n\n\n## Performance on the CityScape dataset\n\nOur model ESPNet achieves a class-wise mIOU of **60.336** and a category-wise mIOU of **82.178** on the CityScapes test dataset and runs at:\n* 112 fps on the NVIDIA TitanX (30 fps faster than [ENet](https://arxiv.org/abs/1606.02147))\n* 9 fps on the TX2\n* With the same number of parameters as [ENet](https://arxiv.org/abs/1606.02147), our model is **2%** more accurate\n\n## Performance on the CamVid dataset\n\nOur model achieves an mIOU of 55.64 on the CamVid test set. We used the dataset splits (train/val/test) provided [here](https://github.com/alexgkendall/SegNet-Tutorial). We trained the models at a resolution of 480x360. For comparison with other models, see the [SegNet paper](https://ieeexplore.ieee.org/document/7803544/).\n\nNote: We did not train on the 3.5K dataset that was used in the SegNet paper.\n\n| Model | mIOU | Class avg. | \n| -- | -- | -- |\n| ENet | 51.3 | 68.3 | \n| SegNet | 55.6 | 65.2 | \n| ESPNet | 55.64 | 68.30 | \n\n## Pre-requisite\n\nTo run this code, you need the following libraries:\n* [OpenCV](https://opencv.org/) - We tested our code with version \u003e 3.0.\n* [PyTorch](http://pytorch.org/) - We tested with v0.3.0\n* Python - We tested our code with Python v3. If you are using Python v2, please feel free to make the necessary changes to the code. \n\nWe recommend using [Anaconda](https://conda.io/docs/user-guide/install/linux.html). 
We have tested our code on Ubuntu 16.04.\n\n## Citation\nIf ESPNet is useful for your research, then please cite our paper.\n```\n@inproceedings{mehta2018espnet,\n  title={ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation},\n  author={Sachin Mehta and Mohammad Rastegari and Anat Caspi and Linda Shapiro and Hannaneh Hajishirzi},\n  booktitle={ECCV},\n  year={2018}\n}\n```\n\n\n## FAQs\n\n### Assertion error with class labels (t \u003e= 0 \u0026\u0026 t \u003c n_classes).\n\nIf you are getting an assertion error with class labels, then please check the class labels defined in the label images. You can do this as:\n\n```\nimport cv2\nimport numpy as np\n# Flag 0 reads the label image as single-channel grayscale\nlabelImg = cv2.imread(\u003clabel_filename.png\u003e, 0)\nunique_val_arr = np.unique(labelImg)\nprint(unique_val_arr)\n```\nThe values inside *unique_val_arr* should range from 0 to one less than the total number of classes in the dataset. If this is not the case, then pre-process your label images. For example, if the label image contains 255 as a value, then you can ignore these values by mapping them to an undefined or background class as:\n\n```\nlabelImg[labelImg == 255] = \u003cundefined class id\u003e\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsacmehta%2FESPNet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsacmehta%2FESPNet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsacmehta%2FESPNet/lists"}