Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/hustvl/symphonies
[CVPR 2024] Symphonies (Scene-from-Insts): Symphonize 3D Semantic Scene Completion with Contextual Instance Queries
3d-perception 3d-scene-understanding autonomous-driving cvpr2024 occupancy-prediction semantic-scene-completion
- Host: GitHub
- URL: https://github.com/hustvl/symphonies
- Owner: hustvl
- License: mit
- Created: 2023-05-25T06:46:18.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-06-24T05:45:30.000Z (5 months ago)
- Last Synced: 2024-06-24T06:49:33.391Z (5 months ago)
- Topics: 3d-perception, 3d-scene-understanding, autonomous-driving, cvpr2024, occupancy-prediction, semantic-scene-completion
- Language: Python
- Homepage: https://arxiv.org/abs/2306.15670
- Size: 66.9 MB
- Stars: 138
- Watchers: 9
- Forks: 8
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Symphonies (Scene-from-Insts) 🎻
### [Symphonize 3D Semantic Scene Completion with Contextual Instance Queries](https://arxiv.org/abs/2306.15670)
[Haoyi Jiang](https://github.com/npurson) 1,✢,
[Tianheng Cheng](https://scholar.google.com/citations?user=PH8rJHYAAAAJ) 1,✢,
Naiyu Gao 2,
Haoyang Zhang 2,
[Tianwei Lin](https://wzmsltw.github.io/) 2,
[Wenyu Liu](http://eic.hust.edu.cn/professor/liuwenyu/) 1,
[Xinggang Wang](https://xwcv.github.io/) 1,✉️
1 [School of EIC, HUST](http://english.eic.hust.edu.cn/),
2 [Horizon Robotics](https://en.horizonrobotics.com/)

[**CVPR 2024**](https://openaccess.thecvf.com/content/CVPR2024/papers/Jiang_Symphonize_3D_Semantic_Scene_Completion_with_Contextual_Instance_Queries_CVPR_2024_paper.pdf)
[![arXiv](https://img.shields.io/badge/arXiv-2306.15670-red?logo=arXiv&logoColor=red)](https://arxiv.org/abs/2306.15670)
[![License: MIT](https://img.shields.io/github/license/hustvl/symphonies)](LICENSE)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/symphonize-3d-semantic-scene-completion-with/3d-semantic-scene-completion-from-a-single-1)](https://paperswithcode.com/sota/3d-semantic-scene-completion-from-a-single-1?p=symphonize-3d-semantic-scene-completion-with)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/symphonize-3d-semantic-scene-completion-with/3d-semantic-scene-completion-on-kitti-360)](https://paperswithcode.com/sota/3d-semantic-scene-completion-on-kitti-360?p=symphonize-3d-semantic-scene-completion-with)
**TL;DR:** Our paper delves into enhancing semantic scene completion (SSC) through instance-centric representations. We propose a novel paradigm that integrates ***instance queries*** to facilitate ***instance semantics*** and capture ***global context***. Our approach achieves state-of-the-art results of ***15.04 and 18.58 mIoU*** on the SemanticKITTI and KITTI-360 benchmarks, respectively.
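To build intuition for what an instance query does, here is a minimal, dependency-free Python sketch of a single cross-attention step, in which a few query vectors each aggregate global context from a set of scene features. This is an illustrative simplification only (plain scaled dot-product attention with no learned projections); the actual model uses deformable attention and iterative query refinement as described in the paper.

```python
import math

def matmul(A, B):
    """Naive matrix multiply for small illustrative examples."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, features, dim):
    """Each instance query attends over all scene features and returns
    a convex combination of them (its updated, context-aware state)."""
    feats_t = [list(col) for col in zip(*features)]   # transpose: dim x F
    scores = matmul(queries, feats_t)                 # Q x F similarity
    out = []
    for row in scores:
        weights = softmax([s / math.sqrt(dim) for s in row])
        out.append([sum(w * f[j] for w, f in zip(weights, features))
                    for j in range(dim)])
    return out

# Two toy queries attending over three 2-D scene features
queries = [[1.0, 0.0], [0.0, 1.0]]
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
updated = cross_attention(queries, features, 2)
```

Because the attention weights sum to one, each updated query stays within the range spanned by the feature vectors; in the full model this aggregation is one building block among several, interleaved with interactions among the queries themselves.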
This project is built upon ***[TmPL](https://github.com/npurson/tmpl)***, a template for rapid and flexible deep learning experimentation built upon [Lightning](https://lightning.ai/) and [Hydra](https://hydra.cc/).
![arch](assets/arch.png)
![vis](assets/vis.png)
![vis](assets/demo.gif)
## News
* ***Feb 27 '24***: **Our paper has been accepted at CVPR 2024. 🎉**
* ***Nov 22 '23***: We have updated our paper on [arXiv](https://arxiv.org/abs/2306.15670) with the latest results.
* ***Sep 18 '23***: We have achieved state-of-the-art results on the recently published SSCBench-KITTI-360 benchmark.
* ***Jun 28 '23***: We have released the [arXiv paper](https://arxiv.org/abs/2306.15670) of Symphonies.

## Preliminary
### Installation
1. Install PyTorch and Torchvision referring to https://pytorch.org/get-started/locally/.
2. Install MMDetection referring to https://mmdetection.readthedocs.io/en/latest/get_started.html#installation.
3. Install the rest of the requirements with pip:
```bash
pip install -r requirements.txt
```

### Dataset Preparation
#### 1. Download the Data
**SemanticKITTI:** Download the RGB images, calibration files, and preprocess the labels, referring to the documentation of [VoxFormer](https://github.com/NVlabs/VoxFormer/blob/main/docs/prepare_dataset.md) or [MonoScene](https://github.com/astra-vision/MonoScene#semantickitti).
**SSCBench-KITTI-360:** Refer to https://github.com/ai4ce/SSCBench/tree/main/dataset/KITTI-360.
#### 2. Generate Depth Predictions
**SemanticKITTI:** Generate depth predictions with the pre-trained MobileStereoNet, referring to VoxFormer: https://github.com/NVlabs/VoxFormer/tree/main/preprocess#3-image-to-depth.
**SSCBench-KITTI-360:** Follow the same procedure as SemanticKITTI, but be sure to [adapt the disparity value](https://github.com/ai4ce/SSCBench/issues/8#issuecomment-1674607576).
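The disparity adaptation matters because metric depth is recovered from predicted disparity via the standard stereo relation depth = focal_length × baseline / disparity, so a wrong disparity convention corrupts every depth map. A minimal sketch (the calibration numbers below are illustrative placeholders, not KITTI-360's actual values, which come from the dataset's calibration files):

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Standard stereo relation: depth = focal * baseline / disparity.

    Rejects non-positive disparities, which have no valid depth.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Illustrative only: a 721.5 px focal length and 0.54 m baseline
# at 100 px disparity correspond to roughly 3.9 m of depth.
depth_m = disparity_to_depth(100.0, focal_px=721.5, baseline_m=0.54)
```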
### Pretrained Weights
The pretrained weights of MaskDINO can be downloaded [here](https://github.com/hustvl/Symphonies/releases/download/v1.0/maskdino_r50_50e_300q_panoptic_pq53.0.pth).
## Usage
0. **Setup**
```shell
export PYTHONPATH=`pwd`:$PYTHONPATH
```

1. **Training**
```shell
python tools/train.py [--config-name config[.yaml]] [trainer.devices=4] \
[+data_root=$DATA_ROOT] [+label_root=$LABEL_ROOT] [+depth_root=$DEPTH_ROOT]
```

* Override the default config file with `--config-name`.
* You can also override any value in the loaded config from the command line; refer to the following for more information:
    * https://hydra.cc/docs/tutorials/basic/your_first_app/config_file/
    * https://hydra.cc/docs/advanced/hydra-command-line-flags/
    * https://hydra.cc/docs/advanced/override_grammar/basic/

2. **Testing**
Generate the outputs for submission on the evaluation server:
```shell
python tools/test.py [+ckpt_path=...]
```

3. **Visualization**
1. Generating outputs
```shell
python tools/generate_outputs.py [+ckpt_path=...]
```

2. Visualization
```shell
python tools/visualize.py [+path=...]
```

## Results
1. **SemanticKITTI**
| Method | Split | IoU | mIoU | Download |
| :------------------------------------------: | :---: | :---: | :---: | :----------------------: |
| [Symphonies](symphonies/configs/config.yaml) | val | 41.92 | 14.89 | [log](https://github.com/hustvl/Symphonies/releases/download/v1.0/semantic_kitti.log) / [model](https://github.com/hustvl/Symphonies/releases/download/v1.0/semantic_kitti_e25_miou0.1489.ckpt) |
| [Symphonies](symphonies/configs/config.yaml) | test | 42.19 | 15.04 | [output](https://github.com/hustvl/Symphonies/releases/download/v1.0/scoring_output.txt) |

2. **KITTI-360**
| Method | Split | IoU | mIoU | Download |
| :------------------------------------------: | :---: | :---: | :---: | :----------------------: |
| [Symphonies](symphonies/configs/config.yaml) | test | 44.12 | 18.58 | [log](https://github.com/hustvl/Symphonies/releases/download/v1.0/kitti_360.log) / [model](https://github.com/hustvl/Symphonies/releases/download/v1.0/kitti_360_e26_miou0.1836.ckpt) |

## Citation
If you find our paper and code useful for your research, please consider giving this repo a star :star: or citing :pencil::
```BibTeX
@inproceedings{jiang2023symphonies,
  title={Symphonize 3D Semantic Scene Completion with Contextual Instance Queries},
  author={Haoyi Jiang and Tianheng Cheng and Naiyu Gao and Haoyang Zhang and Tianwei Lin and Wenyu Liu and Xinggang Wang},
  booktitle={CVPR},
  year={2024}
}
```

## Acknowledgements
The development of this project is inspired and informed by [MonoScene](https://github.com/astra-vision/MonoScene), [MaskDINO](https://github.com/IDEA-Research/MaskDINO) and [VoxFormer](https://github.com/NVlabs/VoxFormer). We are grateful to build upon the pioneering work of these projects.
## License
Released under the [MIT](LICENSE) License.