### [Paper](https://arxiv.org/abs/2005.10821) | [YouTube](https://youtu.be/odAGA7pFBGA) | [Cityscapes Score](https://www.cityscapes-dataset.com/method-details/?submissionID=7836) <br>

PyTorch implementation of our paper [Hierarchical Multi-Scale Attention for Semantic Segmentation](https://arxiv.org/abs/2005.10821).<br>

Please refer to the `sdcnet` branch if you are looking for the code corresponding to [Improving Semantic Segmentation via Video Prediction and Label Relaxation](https://nv-adlr.github.io/publication/2018-Segmentation).

## Installation

* The code is tested with PyTorch 1.3 and Python 3.6.
* You can use ./Dockerfile to build an image.

## Download Weights

* Create a directory where you can keep large files.
  Ideally, not in this directory.

  ```bash
  > mkdir <large_asset_dir>
  ```

* Update `__C.ASSETS_PATH` in `config.py` to point at that directory:

  ```python
  __C.ASSETS_PATH = <large_asset_dir>
  ```

* Download pretrained weights from [Google Drive](https://drive.google.com/open?id=1fs-uLzXvmsISbS635eRZCc5uzQdBIZ_U) and put them into `<large_asset_dir>/seg_weights`.

## Download/Prepare Data

If using Cityscapes, download the Cityscapes data, then update `config.py` to set the path:
```python
__C.DATASET.CITYSCAPES_DIR = <path_to_cityscapes>
```

* Download the autolabelled data from [Google Drive](https://drive.google.com/file/d/1DtPo-WP-hjaOwsbj6ZxTtOo_7R_4TKRG/view?usp=sharing).

If using Cityscapes autolabelled images, download the Cityscapes data, then update `config.py` to set the path:
```python
__C.DATASET.CITYSCAPES_CUSTOMCOARSE = <path_to_cityscapes>
```

If using Mapillary, download the Mapillary data, then update `config.py` to set the path:
```python
__C.DATASET.MAPILLARY_DIR = <path_to_mapillary>
```

## Running the code

The instructions below use a tool called `runx`, which we find useful for automating experiment running and summarization. For more information about this tool, please see [runx](https://github.com/NVIDIA/runx). In general, you can either use the runx-style command lines shown below, or call `python train.py <args ...>` directly if you like.

### Run inference on Cityscapes

Dry run:
```bash
> python -m runx.runx scripts/eval_cityscapes.yml -i -n
```
This only prints out the command without running it. It's a good way to inspect the command line.

Real run:
```bash
> python -m runx.runx scripts/eval_cityscapes.yml -i
```

The reported IOU should be 86.92. This evaluates with scales of 0.5, 1.0, and 2.0.
You will find evaluation results in `./logs/eval_cityscapes/...`.

### Run inference on Mapillary

```bash
> python -m runx.runx scripts/eval_mapillary.yml -i
```

The reported IOU should be 61.05. Note that this must be run on a 32GB node, and the use of 'O3' mode for amp is critical to avoid running out of GPU memory. Results are in `logs/eval_mapillary/...`.

### Dump images for Cityscapes

```bash
> python -m runx.runx scripts/dump_cityscapes.yml -i
```

This dumps network output and composited images from running evaluation with the Cityscapes validation set.

### Run inference and dump images on a folder of images

```bash
> python -m runx.runx scripts/dump_folder.yml -i
```

You should end up seeing images that look like the following:

![alt text](imgs/composited_sf.png "example inference, composited")

## Train a model

Train on Cityscapes, using HRNet + OCR + multi-scale attention with fine data and a Mapillary-pretrained model:
```bash
> python -m runx.runx scripts/train_cityscapes.yml -i
```

The first time this command is run, a centroid file has to be built for the dataset, which takes about 10 minutes. The centroid file is used during training to sample from the dataset in a class-uniform way.

This training run should deliver a model that achieves 84.7 IOU.

## Train SOTA default train-val split
```bash
> python -m runx.runx scripts/train_cityscapes_sota.yml -i
```
Again, use `-n` to do a dry run and just print out the command. This should result in a model with 86.8 IOU.
If you run out of memory, try lowering the crop size or turning off `rmi_loss`.
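The evaluation above fuses predictions made at several image scales (0.5, 1.0, and 2.0). As an illustrative sketch only — not this repo's implementation, which uses learned hierarchical attention weights per scale — combining per-scale logit maps that have been resized to a common resolution can be pictured as a weighted average (the uniform weights here are an assumption):

```python
import numpy as np

def multiscale_fuse(logits_per_scale, weights=None):
    """Fuse per-scale logit maps (each already resized to C x H x W)
    into one per-pixel prediction. A uniform average stands in for the
    paper's learned attention weights (an assumption in this sketch)."""
    stack = np.stack(logits_per_scale)                # (num_scales, C, H, W)
    if weights is None:
        weights = np.full(len(logits_per_scale), 1.0 / len(logits_per_scale))
    fused = np.tensordot(weights, stack, axes=1)      # (C, H, W)
    return fused.argmax(axis=0)                       # per-pixel class ids

# Toy example: 2 scales, 3 classes, 2x2 image.
a = np.zeros((3, 2, 2)); a[0] = 1.0   # first scale votes class 0 everywhere
b = np.zeros((3, 2, 2)); b[1] = 3.0   # second scale votes class 1, more strongly
pred = multiscale_fuse([a, b])        # class 1 wins at every pixel
```

In the actual model the weights are predicted per pixel by an attention head, so different scales can dominate in different image regions.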
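The centroid file mentioned in the training section supports class-uniform sampling: the dataset records where each class occurs so that training crops can be centered on under-represented classes rather than drawn purely at random. A minimal sketch of the idea — the data structures and the 50/50 uniform-vs-class split below are assumptions, not this repo's code:

```python
import random
from collections import defaultdict

def build_class_index(samples):
    """Map each class id to the indices of samples containing it.
    `samples` is a list of (image_id, set_of_class_ids) pairs."""
    index = defaultdict(list)
    for i, (_, classes) in enumerate(samples):
        for c in classes:
            index[c].append(i)
    return index

def class_uniform_pick(samples, index, rng=random):
    """With probability 0.5 pick uniformly over images; otherwise pick a
    class uniformly, then an image containing that class, so rare classes
    are seen far more often than their pixel frequency would allow."""
    if rng.random() < 0.5:
        return rng.randrange(len(samples))
    c = rng.choice(sorted(index))
    return rng.choice(index[c])

samples = [("img0", {0}), ("img1", {0}), ("img2", {0, 7})]
index = build_class_index(samples)
pick = class_uniform_pick(samples, index, random.Random(0))
```

The repo's centroid file additionally stores class locations *within* each image, so crops can be centered on the class, but the sampling principle is the same.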
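The IOU figures quoted throughout (86.92, 61.05, 84.7, 86.8) are mean intersection-over-union across classes. For reference, this is the standard metric computed from a confusion matrix — the sketch below is generic, not code from this repo:

```python
import numpy as np

def mean_iou(conf):
    """conf[i, j] = number of pixels of true class i predicted as class j.
    Per-class IoU = TP / (TP + FP + FN); mIoU averages over classes that
    actually appear (denominator > 0)."""
    tp = np.diag(conf).astype(float)
    fp = conf.sum(axis=0) - tp   # predicted as the class, but wrong
    fn = conf.sum(axis=1) - tp   # belong to the class, but missed
    denom = tp + fp + fn
    valid = denom > 0
    return float(np.mean(tp[valid] / denom[valid]))

# Toy 2-class example: perfect on class 0, half of class 1 mislabelled.
conf = np.array([[4, 0],
                 [2, 2]])
miou = mean_iou(conf)   # (4/6 + 2/4) / 2
```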