{"id":25432354,"url":"https://github.com/drprojects/superpoint_transformer","last_synced_at":"2025-05-15T05:06:48.534Z","repository":{"id":175812813,"uuid":"654209013","full_name":"drprojects/superpoint_transformer","owner":"drprojects","description":"Official PyTorch implementation of Superpoint Transformer introduced in [ICCV'23] \"Efficient 3D Semantic Segmentation with Superpoint Transformer\" and SuperCluster introduced in [3DV'24 Oral] \"Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering\"","archived":false,"fork":false,"pushed_at":"2025-03-13T10:09:17.000Z","size":27286,"stargazers_count":759,"open_issues_count":1,"forks_count":99,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-04-14T19:55:01.706Z","etag":null,"topics":["3d","3dv2024","deep-learning","efficient","fast","graph-clustering","hierarchical","iccv2023","lightweight","panoptic-segmentation","partition","partitioning","point-cloud","pytorch","semantic-segmentation","superpoint","transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/drprojects.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-06-15T16:02:44.000Z","updated_at":"2025-04-11T09:23:57.000Z","dependencies_parsed_at":"2025-02-24T05:00:41.032Z","dependency_job_id":"27b17e1f-0889-4ff0-948b-05d4734c8eb7","html_url":"https://github.com/drprojects/superpoint_transformer","commit_stats":{"total_commits":39,"total_committers":4,"mean_commits":9.75,"dds":"0.20512820512820518"
,"last_synced_commit":"89dfa851f939cb25b7258fbc7e6b48eb3c7c18a2"},"previous_names":["drprojects/superpoint_transformer"],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drprojects%2Fsuperpoint_transformer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drprojects%2Fsuperpoint_transformer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drprojects%2Fsuperpoint_transformer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drprojects%2Fsuperpoint_transformer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/drprojects","download_url":"https://codeload.github.com/drprojects/superpoint_transformer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254276447,"owners_count":22043867,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d","3dv2024","deep-learning","efficient","fast","graph-clustering","hierarchical","iccv2023","lightweight","panoptic-segmentation","partition","partitioning","point-cloud","pytorch","semantic-segmentation","superpoint","transformer"],"created_at":"2025-02-17T04:50:07.026Z","updated_at":"2025-05-15T05:06:43.512Z","avatar_url":"https://github.com/drprojects.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n\n# Superpoint 
Transformer\n\n[![python](https://img.shields.io/badge/-Python_3.8+-blue?logo=python\u0026logoColor=white)](https://github.com/pre-commit/pre-commit)\n[![pytorch](https://img.shields.io/badge/PyTorch_2.2+-ee4c2c?logo=pytorch\u0026logoColor=white)](https://pytorch.org/get-started/locally/)\n[![lightning](https://img.shields.io/badge/-Lightning_2.2+-792ee5?logo=pytorchlightning\u0026logoColor=white)](https://pytorchlightning.ai/)\n[![hydra](https://img.shields.io/badge/Config-Hydra_1.3-89b8cd)](https://hydra.cc/)\n[![license](https://img.shields.io/badge/License-MIT-green.svg?labelColor=gray)](https://github.com/ashleve/lightning-hydra-template#license)\n\n[//]: # ([![Paper]\u0026#40;https://img.shields.io/badge/paper-arxiv.1001.2234-B31B1B.svg\u0026#41;]\u0026#40;https://www.nature.com/articles/nature14539\u0026#41;)\n[//]: # ([![Conference]\u0026#40;https://img.shields.io/badge/AnyConference-year-4b44ce.svg\u0026#41;]\u0026#40;https://papers.nips.cc/paper/2020\u0026#41;)\n\n\nOfficial implementation for\n\u003cbr\u003e\n\u003cbr\u003e\n[_Efficient 3D Semantic Segmentation with Superpoint Transformer_](https://arxiv.org/abs/2306.08045) (ICCV 2023)\n\u003cbr\u003e\n[![arXiv](https://img.shields.io/badge/arxiv-2306.08045-b31b1b.svg)](https://arxiv.org/abs/2306.08045)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.8042712.svg)](https://doi.org/10.5281/zenodo.8042712)\n[![Project page](https://img.shields.io/badge/Project_page-8A2BE2)](https://drprojects.github.io/superpoint-transformer)\n[![Tutorial](https://img.shields.io/badge/Tutorial-FFC300)](https://www.youtube.com/watch?v=2qKhpQs9gJw)\n\u003cbr\u003e\n\u003cbr\u003e\n[_Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering_](https://arxiv.org/abs/2401.06704) (3DV 2024 
Oral)\n\u003cbr\u003e\n[![arXiv](https://img.shields.io/badge/arxiv-2401.06704-b31b1b.svg)](https://arxiv.org/abs/2401.06704)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10689037.svg)](https://doi.org/10.5281/zenodo.10689037)\n[![Project page](https://img.shields.io/badge/Project_page-8A2BE2)](https://drprojects.github.io/supercluster)\n\u003cbr\u003e\n\u003cbr\u003e\n**If you ❤️ or simply use this project, don't forget to give the repository a ⭐,\nit means a lot to us !**\n\u003cbr\u003e\n\u003c/div\u003e\n\n```\n@article{robert2023spt,\n  title={Efficient 3D Semantic Segmentation with Superpoint Transformer},\n  author={Robert, Damien and Raguet, Hugo and Landrieu, Loic},\n  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision},\n  year={2023}\n}\n```\n```\n@article{robert2024scalable,\n  title={Scalable 3D Panoptic Segmentation as Superpoint Graph Clustering},\n  author={Robert, Damien and Raguet, Hugo and Landrieu, Loic},\n  journal={Proceedings of the IEEE International Conference on 3D Vision},\n  year={2024}\n}\n```\n\n\u003cbr\u003e\n\n## 📌  Description\n\n### Superpoint Transformer\n\n\u003cp align=\"center\"\u003e\n  \u003cimg width=\"80%\" src=\"./media/teaser_spt.png\"\u003e\n\u003c/p\u003e\n\n**Superpoint Transformer (SPT)** is a superpoint-based transformer 🤖 architecture that efficiently ⚡ \nperforms **semantic segmentation** on large-scale 3D scenes. This method includes a \nfast algorithm that partitions 🧩 point clouds into a hierarchical superpoint \nstructure, as well as a self-attention mechanism to exploit the relationships \nbetween superpoints at multiple scales. 
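To build a rough intuition of what a hierarchical superpoint partition looks like, here is a minimal, self-contained Python sketch. It is **not** the partition algorithm used in this project (SPT relies on an energy-based partition that follows the geometry of the cloud); it only mimics the general idea of grouping points into fewer, larger segments at each level, here with regular grids of increasing cell size. All names in the snippet are illustrative.

```python
# Illustrative only: SPT's actual hierarchical partition is NOT a regular
# grid; this sketch just shows points being grouped into coarser and
# coarser "superpoints".
import random

def grid_partition(points, cell):
    """Label each point with the id of the grid cell of size `cell` containing it."""
    cell_ids = {}
    labels = []
    for x, y, z in points:
        key = (int(x // cell), int(y // cell), int(z // cell))
        labels.append(cell_ids.setdefault(key, len(cell_ids)))
    return labels

random.seed(0)
points = [tuple(random.uniform(0, 10) for _ in range(3)) for _ in range(1000)]

p1 = grid_partition(points, cell=1.0)   # fine level: many small superpoints
p2 = grid_partition(points, cell=5.0)   # coarse level: few large superpoints

# Each level of the hierarchy has fewer, larger segments than the previous one
assert len(set(p2)) < len(set(p1)) < len(points)
```

In SPT, attention is then computed between segments at these multiple scales rather than between individual points, which is what keeps the model small and fast.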
\n\n\u003cdiv align=\"center\"\u003e\n\n|                                                                                   ✨ SPT in numbers ✨                                                                                      |\n|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|\n|                                                                          📊 **S3DIS 6-Fold** (76.0 mIoU)                                                                          |\n|                                                                         📊 **KITTI-360 Val** (63.5 mIoU)                                                                          |\n|                                                                           📊 **DALES** (79.6 mIoU)                                                                           | \n|      🦋 **212k parameters** ([PointNeXt](https://github.com/guochengqian/PointNeXt) ÷ 200, [Stratified Transformer](https://github.com/dvlab-research/Stratified-Transformer) ÷ 40)       | \n| ⚡ S3DIS training in **3h on 1 GPU** ([PointNeXt](https://github.com/guochengqian/PointNeXt) ÷ 7, [Stratified Transformer](https://github.com/dvlab-research/Stratified-Transformer) ÷ 70) | \n|                                                  ⚡ **Preprocessing x7 faster than [SPG](https://github.com/loicland/superpoint_graph)**                                                   
|\n\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/efficient-3d-semantic-segmentation-with-1/3d-semantic-segmentation-on-s3dis)](https://paperswithcode.com/sota/3d-semantic-segmentation-on-s3dis?p=efficient-3d-semantic-segmentation-with-1)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/efficient-3d-semantic-segmentation-with-1/3d-semantic-segmentation-on-dales)](https://paperswithcode.com/sota/3d-semantic-segmentation-on-dales?p=efficient-3d-semantic-segmentation-with-1)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/efficient-3d-semantic-segmentation-with-1/semantic-segmentation-on-s3dis)](https://paperswithcode.com/sota/semantic-segmentation-on-s3dis?p=efficient-3d-semantic-segmentation-with-1)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/efficient-3d-semantic-segmentation-with-1/3d-semantic-segmentation-on-kitti-360)](https://paperswithcode.com/sota/3d-semantic-segmentation-on-kitti-360?p=efficient-3d-semantic-segmentation-with-1)\n\u003c/div\u003e\n\n### SuperCluster\n\n\u003cp align=\"center\"\u003e\n  \u003cimg width=\"80%\" src=\"./media/teaser_supercluster.png\"\u003e\n\u003c/p\u003e\n\n**SuperCluster** is a superpoint-based architecture for **panoptic segmentation** of (very) large 3D scenes 🐘 based on SPT. \nWe formulate the panoptic segmentation task as a **scalable superpoint graph clustering** task. 
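As a rough intuition for how clustering a superpoint graph can yield object instances, here is a minimal, self-contained sketch. It is **not** the graph optimization problem actually solved by SuperCluster; it simply merges superpoints connected by high-affinity edges (union-find) and reads instances off the resulting connected components. All names and values are illustrative.

```python
# Illustrative only: SuperCluster solves a proper graph optimization
# problem; this sketch just merges superpoints linked by high-affinity
# edges and treats each connected component as an object instance.

def cluster(num_nodes, edges, affinities, threshold=0.5):
    """Union superpoints linked by high-affinity edges; components = instances."""
    parent = list(range(num_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for (a, b), w in zip(edges, affinities):
        if w >= threshold:                 # high affinity -> same instance
            parent[find(a)] = find(b)

    roots = {find(i) for i in range(num_nodes)}
    label = {r: k for k, r in enumerate(sorted(roots))}
    return [label[find(i)] for i in range(num_nodes)]

# 5 superpoints; edges 0-1 and 1-2 confidently belong to the same object
edges      = [(0, 1), (1, 2), (2, 3), (3, 4)]
affinities = [0.9, 0.8, 0.1, 0.7]
instances = cluster(5, edges, affinities)
assert instances == [0, 0, 0, 1, 1]  # {0,1,2} form one instance, {3,4} another
```
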
\nTo this end, our model is trained to predict the input parameters of a graph optimization problem whose solution is a panoptic segmentation 💡.\nThis formulation allows supervising our model with per-node and per-edge objectives only, circumventing the need for computing an actual panoptic segmentation and associated matching issues at train time.\nAt inference time, our fast parallelized algorithm solves the small graph optimization problem, yielding object instances 👥.\nDue to its lightweight backbone and scalable formulation, SuperCluster can process scenes of unprecedented scale at once, on a single GPU 🚀, with fewer than 1M parameters 🦋.\n\n\u003cdiv align=\"center\"\u003e\n\n|                               ✨ SuperCluster in numbers ✨                                |\n|:----------------------------------------------------------------------------------------:|\n|                              📊 **S3DIS 6-Fold** (55.9 PQ)                               |\n|                              📊 **S3DIS Area 5** (50.1 PQ)                               |\n|                               📊 **ScanNet Val** (58.7 PQ)                               |\n|                              📊 **KITTI-360 Val** (48.3 PQ)                              |\n|                                  📊 **DALES** (61.2 PQ)                                  |\n| 🦋 **212k parameters** ([PointGroup](https://github.com/dvlab-research/PointGroup) ÷ 37) |\n|                           ⚡ S3DIS training in **4h on 1 GPU**                            | \n|              ⚡ **7.8km²** tile of **18M** points in **10.1s** on **1 GPU**               
|\n\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scalable-3d-panoptic-segmentation-with/panoptic-segmentation-on-s3dis)](https://paperswithcode.com/sota/panoptic-segmentation-on-s3dis?p=scalable-3d-panoptic-segmentation-with)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scalable-3d-panoptic-segmentation-with/panoptic-segmentation-on-s3dis-area5)](https://paperswithcode.com/sota/panoptic-segmentation-on-s3dis-area5?p=scalable-3d-panoptic-segmentation-with)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scalable-3d-panoptic-segmentation-with/panoptic-segmentation-on-scannetv2)](https://paperswithcode.com/sota/panoptic-segmentation-on-scannetv2?p=scalable-3d-panoptic-segmentation-with)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scalable-3d-panoptic-segmentation-with/panoptic-segmentation-on-kitti-360)](https://paperswithcode.com/sota/panoptic-segmentation-on-kitti-360?p=scalable-3d-panoptic-segmentation-with)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scalable-3d-panoptic-segmentation-with/panoptic-segmentation-on-dales)](https://paperswithcode.com/sota/panoptic-segmentation-on-dales?p=scalable-3d-panoptic-segmentation-with)\n\n\u003c/div\u003e\n\n\u003cbr\u003e\n\n## 📰  Updates\n- **27.06.2024** Released our Superpoint Transformer 🧑‍🏫 tutorial \n[slides](media/superpoint_transformer_tutorial.pdf), \n[notebook](notebooks/superpoint_transformer_tutorial.ipynb), and [video](https://www.youtube.com/watch?v=2qKhpQs9gJw). \nCheck these out if you are getting started with the project ! \n- **21.06.2024** [Damien](https://github.com/drprojects) will be giving a \n**🧑‍🏫 tutorial on Superpoint Transformer on 📅 27.06.2024 at 1pm CEST**. 
\nMake sure to come if you want to gain some hands-on experience with the project !\n**[Registration here](https://www.linkedin.com/events/superpointtransformersfor3dpoin7209130538110963712)**. \n- **28.02.2024** Major code release for **panoptic segmentation**, implementing \n**[_Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering_](https://arxiv.org/abs/2401.06704)**.\nThis new version also implements long-awaited features such as lightning's\n`predict()` behavior, **voxel-resolution and full-resolution prediction**.\nSome changes in the dependencies and repository structure are **not \nbackward-compatible**. If you were already using earlier code versions, we recommend re-installing your conda environment and re-running the preprocessing of your datasets❗\n- **15.10.2023** Our paper **[_Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering_](https://arxiv.org/abs/2401.06704)** was accepted for an **oral** presentation at **[3DV 2024](https://3dvconf.github.io/2024/)** 🥳\n- **06.10.2023** Come see our poster for **[_Efficient 3D Semantic Segmentation with Superpoint Transformer_](https://arxiv.org/abs/2306.08045)** at **[ICCV 2023](https://iccv2023.thecvf.com/)**\n- **14.07.2023** Our paper **[_Efficient 3D Semantic Segmentation with Superpoint Transformer_](https://arxiv.org/abs/2306.08045)** was accepted at **[ICCV 2023](https://iccv2023.thecvf.com/)** 🥳\n- **15.06.2023** Official release 🌱\n\n\u003cbr\u003e\n\n## 💻  Environment requirements\nThis project was tested with:\n- Linux OS\n- **64G** RAM\n- NVIDIA GTX 1080 Ti **11G**, NVIDIA V100 **32G**, NVIDIA A40 **48G**\n- CUDA 11.8 and 12.1\n- conda 23.3.1\n\n\u003cbr\u003e\n\n## 🏗  Installation\nSimply run [`install.sh`](install.sh) to install all dependencies in a new conda environment \nnamed `spt`. 
\n```bash\n# Creates a conda env named 'spt' env and installs dependencies\n./install.sh\n```\n\n\u003e **Note**: See the [Datasets page](docs/datasets.md) for setting up your dataset\n\u003e path and file structure.\n\n\u003cbr\u003e\n\n### 🔩  Project structure\n```\n└── superpoint_transformer\n    │\n    ├── configs                   # Hydra configs\n    │   ├── callbacks                 # Callbacks configs\n    │   ├── data                      # Data configs\n    │   ├── debug                     # Debugging configs\n    │   ├── experiment                # Experiment configs\n    │   ├── extras                    # Extra utilities configs\n    │   ├── hparams_search            # Hyperparameter search configs\n    │   ├── hydra                     # Hydra configs\n    │   ├── local                     # Local configs\n    │   ├── logger                    # Logger configs\n    │   ├── model                     # Model configs\n    │   ├── paths                     # Project paths configs\n    │   ├── trainer                   # Trainer configs\n    │   │\n    │   ├── eval.yaml                 # Main config for evaluation\n    │   └── train.yaml                # Main config for training\n    │\n    ├── data                      # Project data (see docs/datasets.md)\n    │\n    ├── docs                      # Documentation\n    │\n    ├── logs                      # Logs generated by hydra and lightning loggers\n    │\n    ├── media                     # Media illustrating the project\n    │\n    ├── notebooks                 # Jupyter notebooks\n    │\n    ├── scripts                   # Shell scripts\n    │\n    ├── src                       # Source code\n    │   ├── data                      # Data structure for hierarchical partitions\n    │   ├── datamodules               # Lightning DataModules\n    │   ├── datasets                  # Datasets\n    │   ├── dependencies              # Compiled dependencies\n    │   ├── loader                    # 
DataLoader\n    │   ├── loss                      # Loss\n    │   ├── metrics                   # Metrics\n    │   ├── models                    # Model architecture\n    │   ├── nn                        # Model building blocks\n    │   ├── optim                     # Optimization \n    │   ├── transforms                # Functions for transforms, pre-transforms, etc\n    │   ├── utils                     # Utilities\n    │   ├── visualization             # Interactive visualization tool\n    │   │\n    │   ├── eval.py                   # Run evaluation\n    │   └── train.py                  # Run training\n    │\n    ├── tests                     # Tests of any kind\n    │\n    ├── .env.example              # Example of file for storing private environment variables\n    ├── .gitignore                # List of files ignored by git\n    ├── .pre-commit-config.yaml   # Configuration of pre-commit hooks for code formatting\n    ├── install.sh                # Installation script\n    ├── LICENSE                   # Project license\n    └── README.md\n\n```\n\n\u003e **Note**: See the [Datasets page](docs/datasets.md) for further details on `data/`. \n\n\u003e **Note**: See the [Logs page](docs/logging.md) for further details on `logs/`. \n\n\u003cbr\u003e\n\n## 🚀  Usage\n### Datasets\nSee the [Datasets page](docs/datasets.md) to set up your datasets. 
\n\n### Evaluation\nUse the following command structure for evaluating our models from a checkpoint \nfile `checkpoint.ckpt`, where `\u003ctask\u003e` should be `semantic` for using SPT and `panoptic` for using \nSuperCluster:\n\n```bash\n# Evaluate for \u003ctask\u003e segmentation on \u003cdataset\u003e\npython src/eval.py experiment=\u003ctask\u003e/\u003cdataset\u003e ckpt_path=/path/to/your/checkpoint.ckpt\n```\n\nSome examples:\n\n```bash\n# Evaluate SPT on S3DIS Fold 5\npython src/eval.py experiment=semantic/s3dis datamodule.fold=5 ckpt_path=/path/to/your/checkpoint.ckpt\n\n# Evaluate SPT on KITTI-360 Val\npython src/eval.py experiment=semantic/kitti360  ckpt_path=/path/to/your/checkpoint.ckpt \n\n# Evaluate SPT on DALES\npython src/eval.py experiment=semantic/dales ckpt_path=/path/to/your/checkpoint.ckpt\n\n# Evaluate SuperCluster on S3DIS Fold 5\npython src/eval.py experiment=panoptic/s3dis datamodule.fold=5 ckpt_path=/path/to/your/checkpoint.ckpt\n\n# Evaluate SuperCluster on S3DIS Fold 5 with {wall, floor, ceiling} as 'stuff'\npython src/eval.py experiment=panoptic/s3dis_with_stuff datamodule.fold=5 ckpt_path=/path/to/your/checkpoint.ckpt\n\n# Evaluate SuperCluster on ScanNet Val\npython src/eval.py experiment=panoptic/scannet ckpt_path=/path/to/your/checkpoint.ckpt\n\n# Evaluate SuperCluster on KITTI-360 Val\npython src/eval.py experiment=panoptic/kitti360  ckpt_path=/path/to/your/checkpoint.ckpt \n\n# Evaluate SuperCluster on DALES\npython src/eval.py experiment=panoptic/dales ckpt_path=/path/to/your/checkpoint.ckpt\n```\n\n\u003e **Note**: \n\u003e \n\u003e The pretrained weights of the **SPT** and **SPT-nano** models for \n\u003e**S3DIS 6-Fold**, **KITTI-360 Val**, and **DALES** are available at:\n\u003e\n\u003e [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.8042712.svg)](https://doi.org/10.5281/zenodo.8042712)\n\u003e \n\u003e The pretrained weights of the **SuperCluster** models for \n\u003e**S3DIS 6-Fold**, **S3DIS 6-Fold with stuff**, 
**ScanNet Val**, **KITTI-360 Val**, and **DALES** are available at:\n\u003e\n\u003e [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10689037.svg)](https://doi.org/10.5281/zenodo.10689037)\n\n### Training\nUse the following command structure for **training our models on a 32G-GPU**, \nwhere `\u003ctask\u003e` should be `semantic` for using SPT and `panoptic` for using \nSuperCluster:\n\n```bash\n# Train for \u003ctask\u003e segmentation on \u003cdataset\u003e\npython src/train.py experiment=\u003ctask\u003e/\u003cdataset\u003e\n```\n\nSome examples:\n\n```bash\n# Train SPT on S3DIS Fold 5\npython src/train.py experiment=semantic/s3dis datamodule.fold=5\n\n# Train SPT on KITTI-360 Val\npython src/train.py experiment=semantic/kitti360 \n\n# Train SPT on DALES\npython src/train.py experiment=semantic/dales\n\n# Train SuperCluster on S3DIS Fold 5\npython src/train.py experiment=panoptic/s3dis datamodule.fold=5\n\n# Train SuperCluster on S3DIS Fold 5 with {wall, floor, ceiling} as 'stuff'\npython src/train.py experiment=panoptic/s3dis_with_stuff datamodule.fold=5\n\n# Train SuperCluster on ScanNet Val\npython src/train.py experiment=panoptic/scannet\n\n# Train SuperCluster on KITTI-360 Val\npython src/train.py experiment=panoptic/kitti360 \n\n# Train SuperCluster on DALES\npython src/train.py experiment=panoptic/dales\n```\n\nUse the following to **train on an 11G-GPU 💾** (training time and performance \nmay vary):\n\n```bash\n# Train SPT on S3DIS Fold 5\npython src/train.py experiment=semantic/s3dis_11g datamodule.fold=5\n\n# Train SPT on KITTI-360 Val\npython src/train.py experiment=semantic/kitti360_11g \n\n# Train SPT on DALES\npython src/train.py experiment=semantic/dales_11g\n\n# Train SuperCluster on S3DIS Fold 5\npython src/train.py experiment=panoptic/s3dis_11g datamodule.fold=5\n\n# Train SuperCluster on S3DIS Fold 5 with {wall, floor, ceiling} as 'stuff'\npython src/train.py experiment=panoptic/s3dis_with_stuff_11g datamodule.fold=5\n\n# Train SuperCluster on 
ScanNet Val\npython src/train.py experiment=panoptic/scannet_11g\n\n# Train SuperCluster on KITTI-360 Val\npython src/train.py experiment=panoptic/kitti360_11g \n\n# Train SuperCluster on DALES\npython src/train.py experiment=panoptic/dales_11g\n```\n\n\u003e **Note**: Encountering CUDA Out-Of-Memory errors 💀💾 ? See our dedicated \n\u003e [troubleshooting section](#cuda-out-of-memory-errors).\n\n\u003e **Note**: Other ready-to-use configs are provided in\n\u003e[`configs/experiment/`](configs/experiment). You can easily design your own \n\u003eexperiments by composing [configs](configs):\n\u003e```bash\n\u003e# Train Nano-3 for 50 epochs on DALES\n\u003epython src/train.py datamodule=dales model=nano-3 trainer.max_epochs=50\n\u003e```\n\u003eSee \n\u003e[Lightning-Hydra](https://github.com/ashleve/lightning-hydra-template) for more\n\u003einformation on how the config system works and all the awesome perks of the \n\u003e Lightning+Hydra combo.\n\n\u003e **Note**: By default, your logs will automatically be uploaded to \n\u003e[Weights and Biases](https://wandb.ai), from where you can track and compare \n\u003eyour experiments. Other loggers are available in \n\u003e[`configs/logger/`](configs/logger). 
See \n\u003e[Lightning-Hydra](https://github.com/ashleve/lightning-hydra-template) for more\n\u003einformation on the logging options.\n\n### PyTorch Lightning `predict()`\nBoth SPT and SuperCluster inherit from `LightningModule` and implement `predict_step()`, which permits using \n[PyTorch Lightning's `Trainer.predict()` mechanism](https://lightning.ai/docs/pytorch/stable/deploy/production_basic.html).\n\n```python\nfrom torch.utils.data import DataLoader\nfrom src.models.semantic import SemanticSegmentationModule\nfrom pytorch_lightning import Trainer\n\n# Predict behavior for semantic segmentation from a torch DataLoader\ndataloader = DataLoader(...)\nmodel = SemanticSegmentationModule(...)\ntrainer = Trainer(...)\nbatch, output = trainer.predict(model=model, dataloaders=dataloader)\n```\n\nThis, however, still requires you to instantiate a `Trainer`, a `DataLoader`, \nand a model with relevant parameters.\n\nFor a little more simplicity, all our datamodules inherit from \n`LightningDataModule` and implement `predict_dataloader()` by pointing to their \ncorresponding test set by default. 
This permits directly passing a datamodule to\n[PyTorch Lightning's `Trainer.predict()`](https://lightning.ai/docs/pytorch/stable/common/trainer.html#predict)\nwithout explicitly instantiating a `DataLoader`.\n\n```python\nfrom src.models.semantic import SemanticSegmentationModule\nfrom src.datamodules.s3dis import S3DISDataModule\nfrom pytorch_lightning import Trainer\n\n# Predict behavior for semantic segmentation on S3DIS\ndatamodule = S3DISDataModule(...)\nmodel = SemanticSegmentationModule(...)\ntrainer = Trainer(...)\nbatch, output = trainer.predict(model=model, datamodule=datamodule)\n```\n\nFor more details on how to instantiate these, as well as the output format\nof our model, we strongly encourage you to play with our \n[demo notebook](notebooks/demo.ipynb) and have a look at the [`src/eval.py`](src/eval.py) script.\n\n### Full-resolution predictions\nBy design, our models only need to produce predictions for the superpoints of \nthe $P_1$ partition level during training. \nAll our losses and metrics are formulated as superpoint-wise objectives. \nThis conveniently saves compute and memory at training and evaluation time.\n\nAt inference time, however, we often need the **predictions on the voxels** of the\n$P_0$ partition level or on the **full-resolution input point cloud**.\nTo this end, we provide helper functions to recover voxel-wise and full-resolution\npredictions.\n\nSee our [demo notebook](notebooks/demo.ipynb) for more details on these.\n\n### Using a pretrained model on custom data\nFor running a pretrained model on your own point cloud, please refer to our \ntutorial [slides](media/superpoint_transformer_tutorial.pdf), \n[notebook](notebooks/superpoint_transformer_tutorial.ipynb), \nand [video](https://www.youtube.com/watch?v=2qKhpQs9gJw).\n\n### Parametrizing the superpoint partition on custom data\nOur hierarchical superpoint partition is computed at preprocessing time. 
Its\nconstruction involves several steps whose parametrization must be adapted to\nyour specific dataset and task. Please refer to our \ntutorial [slides](media/superpoint_transformer_tutorial.pdf), \n[notebook](notebooks/superpoint_transformer_tutorial.ipynb), \nand [video](https://www.youtube.com/watch?v=2qKhpQs9gJw) for better \nunderstanding this process and tuning it to your needs.\n\n### Parameterizing SuperCluster graph clustering\nOne specificity of SuperCluster is that the model is not trained to explicitly \ndo panoptic segmentation, but to predict the input parameters of a superpoint \ngraph clustering problem whose solution is a panoptic segmentation.\n\nFor this reason, the hyperparameters for this graph optimization problem are \nselected after training, with a grid search on the training or validation set.\nWe find that fairly similar hyperparameters yield the best performance on all \nour datasets (see our [paper](https://arxiv.org/abs/2401.06704)'s appendix). Yet, you may want to explore \nthese hyperparameters for your own dataset. To this end, see our \n[demo notebook](notebooks/demo_panoptic_parametrization.ipynb) for \nparameterizing the panoptic segmentation.\n\n### Notebooks \u0026 visualization\nWe provide [notebooks](notebooks) to help you get started with manipulating our \ncore data structures, configs loading, dataset and model instantiation, \ninference on each dataset, and visualization.\n\nIn particular, we created an interactive visualization tool ✨ which can be used\nto produce shareable HTMLs. Demos of how to use this tool are provided in \nthe [notebooks](notebooks). 
Additionally, examples of such HTML files are \nprovided in [media/visualizations.7z](media/visualizations.7z)\n\n\u003cbr\u003e\n\n## 📚  Documentation\n\n| Location                                          | Content                                                                                                                     |\n|:--------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------|\n| [README](README.md)                               | General introduction to the project                                                                                         |\n| [`docs/data_structures`](docs/data_structures.md) | Introduction to the core data structures of this project: `Data`, `NAG`, `Cluster`, and `InstanceData`                      |\n| [`docs/datasets`](docs/datasets.md)               | Introduction to our implemented datasets, to our `BaseDataset` class, and how to create your own dataset inheriting from it |\n| [`docs/logging`](docs/logging.md)                 | Introduction to logging and the project's `logs/` structure                                                                 |\n| [`docs/visualization`](docs/visualization.md)     | Introduction to our interactive 3D visualization tool                                                                       |\n\n\u003e **Note**: We endeavoured to **comment our code** as much as possible to make \n\u003e this project usable. If you don't find the answer you are looking for in the \n\u003e `docs/`, make sure to **have a look at the source code and past issues**. \n\u003e Still, if you find some parts are unclear or some more documentation would be \n\u003e needed, feel free to let us know by creating an issue ! 
\n\n\u003cbr\u003e\n\n## 👩‍🔧  Troubleshooting\nHere are some common issues and tips for tackling them.\n\n### SPT or SuperCluster on an 11G-GPU \nOur default configurations are designed for a 32G-GPU. Yet, SPT and SuperCluster can run \non an **11G-GPU 💾**, with minor time and performance variations.\n\nWe provide configs in [`configs/experiment/semantic`](configs/experiment/semantic) for \ntraining SPT on an **11G-GPU 💾**:\n\n```bash\n# Train SPT on S3DIS Fold 5\npython src/train.py experiment=semantic/s3dis_11g datamodule.fold=5\n\n# Train SPT on KITTI-360 Val\npython src/train.py experiment=semantic/kitti360_11g \n\n# Train SPT on DALES\npython src/train.py experiment=semantic/dales_11g\n```\n\nSimilarly, we provide configs in [`configs/experiment/panoptic`](configs/experiment/panoptic) for \ntraining SuperCluster on an **11G-GPU 💾**:\n\n```bash\n# Train SuperCluster on S3DIS Fold 5\npython src/train.py experiment=panoptic/s3dis_11g datamodule.fold=5\n\n# Train SuperCluster on S3DIS Fold 5 with {wall, floor, ceiling} as 'stuff'\npython src/train.py experiment=panoptic/s3dis_with_stuff_11g datamodule.fold=5\n\n# Train SuperCluster on ScanNet Val\npython src/train.py experiment=panoptic/scannet_11g\n\n# Train SuperCluster on KITTI-360 Val\npython src/train.py experiment=panoptic/kitti360_11g \n\n# Train SuperCluster on DALES\npython src/train.py experiment=panoptic/dales_11g\n```\n\n\n### CUDA Out-Of-Memory Errors\nHaving some CUDA OOM errors 💀💾 ? 
Here are some parameters you can play with to mitigate GPU memory use, based 
on when the error occurs.

<details>
<summary><b>Parameters affecting CUDA memory.</b></summary>

**Legend**: 🟡 Preprocessing | 🔴 Training | 🟣 Inference (including validation and testing during training)

| Parameter                                   | Description                                                                                                                                                                                           |  When  |
|:--------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------:|
| `datamodule.xy_tiling`                      | Splits dataset tiles into xy_tiling² smaller tiles, based on a regular XY grid. Ideal for square-shaped tiles à la DALES. Note this will affect the number of training steps.                         |  🟡🟣  |
| `datamodule.pc_tiling`                      | Splits dataset tiles into 2^pc_tiling smaller tiles, based on their principal component. Ideal for tiles of varying shapes à la S3DIS and KITTI-360. Note this will affect the number of training steps. |  🟡🟣  |
| `datamodule.max_num_nodes`                  | Limits the number of $P_1$ partition nodes/superpoints in the **training batches**.                                                                                                                   |   🔴   |
| `datamodule.max_num_edges`                  | Limits the number of $P_1$ partition edges in the **training batches**.
|   🔴   |
| `datamodule.voxel`                          | Increasing the voxel size will reduce preprocessing, training, and inference times, but will also reduce performance.                                                                                 | 🟡🔴🟣 |
| `datamodule.pcp_regularization`             | Regularization for partition levels. The larger, the fewer the superpoints.                                                                                                                           | 🟡🔴🟣 |
| `datamodule.pcp_spatial_weight`             | Importance of the 3D position in the partition. The smaller, the fewer the superpoints.                                                                                                               | 🟡🔴🟣 |
| `datamodule.pcp_cutoff`                     | Minimum superpoint size. The larger, the fewer the superpoints.                                                                                                                                       | 🟡🔴🟣 |
| `datamodule.graph_k_max`                    | Maximum number of adjacent nodes in the superpoint graphs. The smaller, the fewer the superedges.                                                                                                     | 🟡🔴🟣 |
| `datamodule.graph_gap`                      | Maximum distance between adjacent superpoints in the superpoint graphs. The smaller, the fewer the superedges.                                                                                        | 🟡🔴🟣 |
| `datamodule.graph_chunk`                    | Reduce to avoid OOM when `RadiusHorizontalGraph` preprocesses the superpoint graph.
|   🟡   |
| `datamodule.dataloader.batch_size`          | Controls the number of loaded tiles. Each **train batch** is composed of `batch_size * datamodule.sample_graph_k` spherical samplings. Inference is performed on **entire validation and test tiles**, without spherical sampling. |  🔴🟣  |
| `datamodule.sample_segment_ratio`           | Randomly drops a fraction of the superpoints at each partition level.                                                                                                                                 |   🔴   |
| `datamodule.sample_graph_k`                 | Controls the number of spherical samples in the **train batches**.                                                                                                                                    |   🔴   |
| `datamodule.sample_graph_r`                 | Controls the radius of spherical samples in the **train batches**. Set `sample_graph_r<=0` to use the entire tile without spherical sampling.                                                          |   🔴   |
| `datamodule.sample_point_min`               | Controls the minimum number of $P_0$ points sampled per superpoint in the **train batches**.                                                                                                          |   🔴   |
| `datamodule.sample_point_max`               | Controls the maximum number of $P_0$ points sampled per superpoint in the **train batches**.                                                                                                          |   🔴   |
| `callbacks.gradient_accumulator.scheduling` | Gradient accumulation. Can be used to train with smaller batches, with more training steps.
|   🔴   |

<br>
</details>

<br>

## 💳  Credits
- This project was built using the [Lightning-Hydra template](https://github.com/ashleve/lightning-hydra-template).
- The main data structures of this work rely on [PyTorch Geometric](https://github.com/pyg-team/pytorch_geometric).
- Some point cloud operations were inspired by the [Torch-Points3D framework](https://github.com/nicolas-chaulet/torch-points3d), although not merged with the official project at this point.
- For the KITTI-360 dataset, some code from the official [KITTI-360](https://github.com/autonomousvision/kitti360Scripts) repository was used.
- Some superpoint-graph-related operations were inspired by [Superpoint Graph](https://github.com/loicland/superpoint_graph).
- The hierarchical superpoint partition and graph clustering are computed using [Parallel Cut-Pursuit](https://gitlab.com/1a7r0ch3/parallel-cut-pursuit).

<br>

## Citing our work
If your work uses all or part of the present code, please include the following citation:

```
@article{robert2023spt,
  title={Efficient 3D Semantic Segmentation with Superpoint Transformer},
  author={Robert, Damien and Raguet, Hugo and Landrieu, Loic},
  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2023}
}

@article{robert2024scalable,
  title={Scalable 3D Panoptic Segmentation as Superpoint Graph Clustering},
  author={Robert, Damien and Raguet, Hugo and Landrieu, Loic},
  journal={Proceedings of the IEEE International Conference on 3D Vision},
  year={2024}
}
```

You can find our [SPT paper 📄](https://arxiv.org/abs/2306.08045) and [SuperCluster paper 📄](https://arxiv.org/abs/2401.06704) on arXiv.

Also, **if you ❤️ or simply use this project, don't forget to give the 
repository a ⭐, it means a lot to us
!**