https://github.com/nianticlabs/panoptic-forecasting
[CVPR 2021] Forecasting the panoptic segmentation of future video frames
- Host: GitHub
- URL: https://github.com/nianticlabs/panoptic-forecasting
- Owner: nianticlabs
- License: other
- Created: 2021-03-28T21:24:48.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2021-09-09T13:55:24.000Z (about 4 years ago)
- Last Synced: 2025-03-27T14:51:53.118Z (7 months ago)
- Topics: cityscapes, future-prediction, panoptic-segmentation, pytorch, semantic-segmentation, video-semantic-segmentation
- Language: Python
- Homepage:
- Size: 29.1 MB
- Stars: 47
- Watchers: 4
- Forks: 10
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# [Panoptic Segmentation Forecasting](https://arxiv.org/abs/2104.03962)
**Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander Schwing - CVPR 2021** [[Link to paper](https://arxiv.org/abs/2104.03962)]

We propose to study the novel task of ‘panoptic segmentation forecasting’: given a set of observed frames, the goal is to forecast the panoptic segmentation for a set of unobserved frames. We also propose a first approach to forecasting future panoptic segmentations. In contrast to typical semantic forecasting, we model the motion of individual object instances and the background separately. This makes instance information persistent during forecasting, and allows us to understand the motion of each moving object.

## ⚙️ Setup
### Dependencies
- Python 3.7
- PyTorch 1.5.1
- pyyaml
- pandas
- h5py
- opencv
- tensorboard
- tqdm
- [pytorch_scatter 2.0.5](https://github.com/rusty1s/pytorch_scatter)
- [cityscapesscripts](https://github.com/mcordts/cityscapesScripts) (for evaluation)
- [Google Cloud SDK](https://cloud.google.com/sdk/docs/install) (for downloading data/models)
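The repository does not ship a pinned requirements file, so one way to install the Python dependencies with pip might look like the sketch below; the PyPI package names (e.g. `opencv-python` for OpenCV, `torch-scatter` for pytorch_scatter) are assumptions about the equivalents of the libraries listed above.

```
# Assumed PyPI package names for the dependencies listed above;
# pick the CUDA variant of torch/torch-scatter that matches your system.
pip install torch==1.5.1
pip install pyyaml pandas h5py opencv-python tensorboard tqdm
pip install torch-scatter==2.0.5
pip install cityscapesscripts   # needed for evaluation
```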
Install the code using the following command:

`pip install -e ./`

### Data
- To run this code, the `gtFine_trainvaltest` dataset will need to be downloaded/decompressed from the [Cityscapes website](https://www.cityscapes-dataset.com/) into the `data/cityscapes/` directory. If you would like to visualize predictions, you will also need to download the `leftImg8bit` dataset.
- A few additional Cityscapes ground-truth files also need to be generated. This can be done by running the following commands:
- `python -m cityscapesscripts.preparation.createPanopticImgs --dataset-folder data/cityscapes/gtFine/`
- `CITYSCAPES_DATASET=data/cityscapes/ python -m cityscapesscripts.preparation.createTrainIdLabelImgs`
- The remainder of the required data can be downloaded using the script `download_data.sh`. By default, everything is downloaded into the `data/` directory.
- Training the background model requires generating a version of the semantic segmentation annotations where foreground regions have been removed. This can be done by running the script `scripts/preprocessing/remove_fg_from_gt.sh`.
- Training the foreground model additionally requires downloading a pretrained Mask R-CNN model, available at [this link](https://dl.fbaipublicfiles.com/detectron2/Cityscapes/mask_rcnn_R_50_FPN/142423278/model_final_af9cf5.pkl). It should be saved as `pretrained_models/fg/mask_rcnn_pretrain.pkl`.
- Training the background model additionally requires downloading a pretrained HarDNet model, available at [this link](https://ping-chao.com/hardnet/hardnet70_cityscapes_model.pkl). It should be saved as `pretrained_models/bg/hardnet70_cityscapes_model.pkl`. The full preparation sequence is recapped in the sketch after this list.
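For reference, the data preparation described above amounts to roughly the following sequence. This is a sketch under the assumptions that the helper scripts are run with `bash` from the project root and that `wget` is used to fetch the pretrained weights; the URLs and target paths are the ones given above.

```
# 1. Generate the additional Cityscapes ground-truth formats
python -m cityscapesscripts.preparation.createPanopticImgs --dataset-folder data/cityscapes/gtFine/
CITYSCAPES_DATASET=data/cityscapes/ python -m cityscapesscripts.preparation.createTrainIdLabelImgs

# 2. Download the remaining data and pretrained models into data/ and pretrained_models/
bash download_data.sh

# 3. Background model training only: remove foreground regions from the ground truth
bash scripts/preprocessing/remove_fg_from_gt.sh

# 4. Training from scratch only: fetch the pretrained Mask R-CNN and HarDNet weights
wget -O pretrained_models/fg/mask_rcnn_pretrain.pkl \
  https://dl.fbaipublicfiles.com/detectron2/Cityscapes/mask_rcnn_R_50_FPN/142423278/model_final_af9cf5.pkl
wget -O pretrained_models/bg/hardnet70_cityscapes_model.pkl \
  https://ping-chao.com/hardnet/hardnet70_cityscapes_model.pkl
```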
## Running our code
The `scripts` directory contains scripts which can be used to train and evaluate the foreground, background, and egomotion models. **Note that these scripts should be run from the root project directory as shown below.** Specifically:
- `scripts/odom/run_odom_train.sh` trains the egomotion prediction model.
- `scripts/odom/export_odom.sh` exports the odometry predictions, which can then be used during evaluation by the other models.
- `scripts/bg/run_bg_train.sh` trains the background prediction model.
- `scripts/bg/run_export_bg_val.sh` exports predictions made by the background model, using reprojected input point clouds obtained from the predicted egomotion.
- `scripts/fg/run_fg_train.sh` trains the foreground prediction model.
- `scripts/fg/run_fg_eval_panoptic.sh` produces the final panoptic segmentation predictions based on the trained foreground model and the exported background predictions. This also uses predicted egomotion as input. **Note that the background export script must be run before this one so that the full panoptic segmentation outputs can be generated.** Also, if you re-run this script, first delete the predictions in the folder `experiments/pretrained_fg/exported_panoptics_*_val/`; otherwise the generated JSON file will not contain entries for the sequences where no foreground instances are present.

We provide pretrained foreground, background, and egomotion prediction models. The data download script additionally downloads these models into the directory `pretrained_models/`. A typical end-to-end run is sketched below.
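Putting the scripts together, an end-to-end run follows roughly this order. This is a sketch only: invoking the scripts with `bash` is an assumption, and the training steps can be skipped when using the provided pretrained models.

```
# Run from the project root.

# 1. Egomotion: train, then export predictions for use by the other models
bash scripts/odom/run_odom_train.sh
bash scripts/odom/export_odom.sh

# 2. Background: train, then export predictions (uses the exported egomotion)
bash scripts/bg/run_bg_train.sh
bash scripts/bg/run_export_bg_val.sh

# 3. Foreground: train, then produce the final panoptic segmentation forecasts
bash scripts/fg/run_fg_train.sh
# If re-running this step, clear stale outputs first:
#   rm -r experiments/pretrained_fg/exported_panoptics_*_val/
bash scripts/fg/run_fg_eval_panoptic.sh
```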
## ✏️ 📄 Citation
If you found our work relevant to yours, please consider citing our paper:
```
@inproceedings{graber-2021-panopticforecasting,
  title     = {Panoptic Segmentation Forecasting},
  author    = {Colin Graber and
               Grace Tsai and
               Michael Firman and
               Gabriel Brostow and
               Alexander Schwing},
  booktitle = {Computer Vision and Pattern Recognition ({CVPR})},
  year      = {2021}
}
```

## 👩‍⚖️ License
Copyright © Niantic, Inc. 2021. Patent Pending. All rights reserved. Please see the license file for terms.