{"id":13497323,"url":"https://github.com/mapillary/seamseg","last_synced_at":"2025-03-28T21:32:23.833Z","repository":{"id":43456700,"uuid":"190725310","full_name":"mapillary/seamseg","owner":"mapillary","description":"Seamless Scene Segmentation","archived":false,"fork":false,"pushed_at":"2024-08-07T10:26:54.000Z","size":781,"stargazers_count":293,"open_issues_count":19,"forks_count":52,"subscribers_count":18,"default_branch":"main","last_synced_at":"2024-10-31T13:34:21.691Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mapillary.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-06-07T10:22:50.000Z","updated_at":"2024-10-28T18:14:20.000Z","dependencies_parsed_at":"2024-10-31T13:32:07.761Z","dependency_job_id":"51eb038a-4013-4e78-98c8-b9fdea696a75","html_url":"https://github.com/mapillary/seamseg","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mapillary%2Fseamseg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mapillary%2Fseamseg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mapillary%2Fseamseg/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mapillary%2Fseamseg/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mapillary","download_url":"https://codeload.githu
b.com/mapillary/seamseg/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246105567,"owners_count":20724332,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T20:00:28.903Z","updated_at":"2025-03-28T21:32:23.453Z","avatar_url":"https://github.com/mapillary.png","language":"Python","funding_links":[],"categories":["SemanticSeg"],"sub_categories":[],"readme":"# Seamless Scene Segmentation\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"network.png\" width=\"100%\"/\u003e\n\u003cbr\u003e\n\u003ca href=\"http://openaccess.thecvf.com/content_CVPR_2019/html/Porzi_Seamless_Scene_Segmentation_CVPR_2019_paper.html\"\u003eCVPR\u003c/a\u003e\n|\n\u003ca href=\"https://arxiv.org/abs/1905.01220\"\u003earXiv\u003c/a\u003e\n\u003c/p\u003e\n\nSeamless Scene Segmentation is a CNN-based architecture that can be trained end-to-end to predict a complete class- and\ninstance-specific labeling for each pixel in an image. To tackle this task, also known as \"Panoptic Segmentation\", we take\nadvantage of a novel segmentation head that seamlessly integrates multi-scale features generated by a Feature Pyramid\nNetwork with contextual information conveyed by a light-weight DeepLab-like module.\n\nThis repository currently contains training and evaluation code for Seamless Scene Segmentation in PyTorch, based on our re-implementation of Mask R-CNN. 
\n\nIf you use Seamless Scene Segmentation in your research, please cite:\n```bibtex\n@InProceedings{Porzi_2019_CVPR,\n  author = {Porzi, Lorenzo and Rota Bul\\`o, Samuel and Colovic, Aleksander and Kontschieder, Peter},\n  title = {Seamless Scene Segmentation},\n  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},\n  month = {June},\n  year = {2019}\n}\n```\n\n## Requirements and setup\n\nMain system requirements:\n* CUDA 10.1\n* Linux with GCC 7 or 8\n* PyTorch v1.1.0\n\n**IMPORTANT NOTE**: These requirements are not necessarily strict, e.g. it might be possible to compile with older\nversions of CUDA, or under Windows. However, we have only tested the code under the above settings and cannot provide support for other setups.\n\n**IMPORTANT NOTE 2**: Due to some breaking changes in the handling of boolean operations, seamseg is currently not compatible with PyTorch v1.2.0 or newer.\n\nTo install PyTorch, please refer to https://github.com/pytorch/pytorch#installation.\n\nTo install all other dependencies using pip:\n```bash\npip install -r requirements.txt\n```\n\n### Setup\n\nOur code is split into two main components: a library containing implementations for the various network modules,\nalgorithms and utilities, and a set of scripts to train / test the networks.\n\nThe library, called `seamseg`, can be installed with:\n```bash\ngit clone https://github.com/mapillary/seamseg.git\ncd seamseg\npython setup.py install\n```\nor, in a single line:\n```bash\npip install git+https://github.com/mapillary/seamseg.git\n```\n\nThe scripts do not require installation (but they *do* require `seamseg` to be installed), and can be run\nfrom the `scripts/` folder. 
*Note:* Do not run the scripts from the main folder of this repo, otherwise Python might\nload the local copy of the `seamseg` package instead of the one installed above, causing issues.\n\n## Trained models\n\nThe model files provided below are made available under the [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license.\n\n| Model | PQ | Link + md5 |\n|-------|----|------------|\n| SeamSeg ResNet50, Mapillary Vistas | 37.99 | [7046e54e54e9dcc38060b150e97f4a5a][1] |\n\n[1]: https://drive.google.com/file/d/1ULhd_CZ24L8FnI9lZ2H6Xuf03n6NA_-Y/view\n\nThe files linked above are `zip` archives, each containing model weights (`.tar` file), configuration parameters (`config.ini` file) and the metadata file of the dataset the model was trained on (`metadata.bin` file).\nTo use a model, unzip it somewhere and follow the instructions in the [Running inference](#running-inference) section below.\n\n## Using the scripts\n\nOur code uses an intermediate data format to ease training on multiple datasets, described\n[here](https://github.com/mapillary/seamseg/wiki/Seamless-Scene-Segmentation-dataset-format).\nWe provide pre-made scripts to convert from [Cityscapes](scripts/data_preparation/prepare_cityscapes.py) and\n[Mapillary Vistas](scripts/data_preparation/prepare_vistas.py) to our format.\n\nWhen training, unless explicitly training from scratch, it's also necessary to convert the ImageNet pre-trained weights\nprovided by PyTorch to our network format.\nTo do this, simply run:\n```bash\ncd scripts/utility\npython convert_pytorch_resnet.py NET_NAME OUTPUT_FILE\n```\nwhere `NET_NAME` is one of `resnet18`, `resnet34`, `resnet50`, `resnet101` or `resnet152`.\n\n### Training\n\nTraining involves three main steps: preparing the dataset, creating a configuration file and running the training\nscript.\nTo prepare the dataset, refer to the format description 
[here](https://github.com/mapillary/seamseg/wiki/Seamless-Scene-Segmentation-dataset-format), or\nuse one of the scripts in [scripts/data_preparation](scripts/data_preparation).\nThe configuration file is a simple text file in `ini` format.\nThe default value of each configuration parameter, as well as a short description of what it does, is available in\n[seamseg/config/defaults](seamseg/config/defaults).\n**Note** that these are just an indication of what a \"reasonable\" value for each parameter could be, and are not\nmeant as a way to reproduce any of the results from our paper.\n\nTo launch the training:\n```bash\ncd scripts\npython -m torch.distributed.launch --nproc_per_node=N_GPUS train_panoptic.py --log_dir LOG_DIR CONFIG DATA_DIR \n```\nNote that, for now, our code **must** be launched in \"distributed\" mode using PyTorch's `torch.distributed.launch`\nutility.\nIt's also highly recommended to train on multiple GPUs (possibly 4-8) in order to obtain good results.\nTraining logs, both in text and Tensorboard formats, will be written in `LOG_DIR`.\n\nThe validation metrics reported in the logs include mAP, PQ and mIOU, computed as follows:\n* For mAP (both mask and bounding box), we resort to the original implementation from the\n[COCO API](https://github.com/cocodataset/cocoapi). This is the reason why our dataset format also includes COCO-format\nannotations.\n* For PQ (Panoptic Quality) and mIOU we use our own implementations. 
Our PQ metric has been verified to produce\nresults that are equivalent to the [official implementation](https://github.com/cocodataset/panopticapi), up to\nnumerical differences.\n\n#### Training with the Vistas settings from our paper:\n```bash\ncd scripts\npython -m torch.distributed.launch --nproc_per_node=8 \\\n    train_panoptic.py --log_dir LOG_DIR \\\n    configurations/vistas_r50.ini DATA_DIR\n```\n\n#### Training with the Cityscapes settings from our paper:\n```bash\ncd scripts\npython -m torch.distributed.launch --nproc_per_node=8 \\\n    train_panoptic.py --log_dir LOG_DIR \\\n    configurations/cityscapes_r50.ini DATA_DIR\n```\n\n### Running inference\n\nGiven a trained network, inference can be run on any set of images using\n[scripts/test_panoptic.py](scripts/test_panoptic.py):\n```bash\ncd scripts\npython -m torch.distributed.launch --nproc_per_node=N_GPUS test_panoptic.py --meta METADATA --log_dir LOG_DIR CONFIG MODEL INPUT_DIR OUTPUT_DIR\n```\nImages (either `png` or `jpg`) will be read recursively from `INPUT_DIR` and all its subfolders, and predictions will be\nwritten to `OUTPUT_DIR`.\nThe script also requires the `metadata.bin` file of the dataset the network was originally trained on.\nNote that the script will only read from the `\"meta\"` section, meaning that a stripped-down version of `metadata.bin`,\ni.e. without the `\"images\"` section, can also be used.\n\nBy default, the test scripts output \"qualitative\" results, i.e. 
the original images superimposed with their panoptic segmentation.\nThis can be changed by setting the `--raw` flag: in this case, the script will output, for each image, the \"raw\" network\noutput as a PyTorch `.pth.tar` file.\nAn additional script to process these raw outputs into COCO-format panoptic predictions will be released soon.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmapillary%2Fseamseg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmapillary%2Fseamseg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmapillary%2Fseamseg/lists"}