{"id":27064859,"url":"https://github.com/3dlg-hcvc/minsu3d","last_synced_at":"2025-10-04T06:44:34.961Z","repository":{"id":60492741,"uuid":"378807241","full_name":"3dlg-hcvc/minsu3d","owner":"3dlg-hcvc","description":"MINSU3D: MinkowskiEngine-powered Scene Understanding in 3D","archived":false,"fork":false,"pushed_at":"2024-06-24T18:28:16.000Z","size":21276,"stargazers_count":32,"open_issues_count":0,"forks_count":4,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-06-24T20:22:42.500Z","etag":null,"topics":["3d","computer-vision","deep-learning","instance-segmentation","minkowski-engine","object-detection","pytorch","pytorch-lightning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/3dlg-hcvc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-06-21T04:35:53.000Z","updated_at":"2024-06-24T18:28:20.000Z","dependencies_parsed_at":"2024-06-24T20:14:56.448Z","dependency_job_id":"e642592c-cd2c-4721-84d7-49c2a8a03c31","html_url":"https://github.com/3dlg-hcvc/minsu3d","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/3dlg-hcvc%2Fminsu3d","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/3dlg-hcvc%2Fminsu3d/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/3dlg-hcvc%2Fminsu3d/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/3dlg-hcvc%2Fminsu3d/manifes
ts","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/3dlg-hcvc","download_url":"https://codeload.github.com/3dlg-hcvc/minsu3d/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247370200,"owners_count":20927970,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d","computer-vision","deep-learning","instance-segmentation","minkowski-engine","object-detection","pytorch","pytorch-lightning"],"created_at":"2025-04-05T17:19:26.870Z","updated_at":"2025-10-04T06:44:29.921Z","avatar_url":"https://github.com/3dlg-hcvc.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MINSU3D\nMINSU3D: **Min**kowskiEngine-powered **S**cene **U**nderstanding in **3D** contains reimplementations of state-of-the-art 3D scene understanding methods on point clouds powered by [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine).  
\n\n\u003ca href=\"https://pytorch.org/\"\u003e\u003cimg alt=\"PyTorch\" src=\"https://img.shields.io/badge/PyTorch-EE4C2C?style=for-the-badge\u0026logo=pytorch\u0026logoColor=white\"\u003e\u003c/a\u003e\n\u003ca href=\"https://pytorchlightning.ai/\"\u003e\u003cimg alt=\"Lightning\" src=\"https://img.shields.io/badge/Lightning-792DE4?style=for-the-badge\u0026logo=pytorch-lightning\u0026logoColor=white\"\u003e\u003c/a\u003e\n\u003ca href=\"https://wandb.ai/site\"\u003e\u003cimg alt=\"WandB\" src=\"https://img.shields.io/badge/Weights_\u0026_Biases-FFBE00?style=for-the-badge\u0026logo=WeightsAndBiases\u0026logoColor=white\"\u003e\u003c/a\u003e\n\nWe support the following instance segmentation methods:\n- [PointGroup](https://github.com/dvlab-research/PointGroup)\n- [HAIS](https://github.com/hustvl/HAIS)\n- [SoftGroup](https://github.com/thangvubk/SoftGroup)\n\nWe also provide bounding box predictions derived from the instance segmentation results for 3D object detection.\n\n## Features\n- Highly modularized design enables researchers to easily add different models and datasets.\n- Multi-GPU and distributed training support through [PyTorch Lightning](https://github.com/Lightning-AI/lightning).\n- Better logging with [W\u0026B](https://github.com/wandb/wandb) and periodic evaluation during training.\n- Easy experiment configuration and management with [Hydra](https://github.com/facebookresearch/hydra).\n- Unified and optimized C++ and CUDA extensions.\n\n## Changelog\n1. 
MINSU3D v2.0 release: ~1.8 times faster, with ~4GB less CPU memory usage and ~400MB less GPU memory usage\n\n## Setup\n\n### Conda (recommended)\nWe recommend the use of [miniconda](https://docs.conda.io/en/latest/miniconda.html) to manage system dependencies.\n\n```shell\n# create and activate the conda environment\nconda create -n minsu3d python=3.10\nconda activate minsu3d\n\n# install PyTorch 2.0\nconda install pytorch pytorch-cuda=11.7 -c pytorch -c nvidia\n\n# install Python libraries\npip install .\n\n# install OpenBLAS\nconda install openblas-devel --no-deps -c anaconda\n\n# install MinkowskiEngine\npip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps \\\n--install-option=\"--blas_include_dirs=${CONDA_PREFIX}/include\" --install-option=\"--blas=openblas\"\n\n# install C++ extensions\nexport CPATH=$CONDA_PREFIX/include:$CPATH\nexport LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH\ncd minsu3d/common_ops\npython setup.py develop\n```\n\n### Pip (without conda)\nNote: Setting up with pip (no conda) requires [OpenBLAS](https://github.com/xianyi/OpenBLAS) to be pre-installed on your system.\n\n```shell\n# create and activate the virtual environment\nvirtualenv --no-download env\nsource env/bin/activate\n\n# install PyTorch 2.0\npip3 install torch\n\n# install Python libraries\npip install .\n\n# install OpenBLAS via APT\nsudo apt install libopenblas-dev\n\n# install MinkowskiEngine\npip install MinkowskiEngine\n\n# install C++ extensions\ncd minsu3d/common_ops\npython setup.py develop\n```\n\n## Data Preparation\n\n### ScanNet v2 dataset\n1. Download the [ScanNet v2](http://www.scan-net.org/) dataset and put it under `minsu3d/data/scannetv2`. To request access to the dataset, please refer to their [instructions](https://github.com/ScanNet/ScanNet#scannet-data). 
You will get a `download-scannet.py` script after your request is approved:\n```shell\n# about 10.7GB in total\npython download-scannet.py -o data/scannet --type _vh_clean_2.ply\npython download-scannet.py -o data/scannet --type _vh_clean.aggregation.json\npython download-scannet.py -o data/scannet --type _vh_clean_2.0.010000.segs.json\n```\n\nThe raw dataset files should be organized as follows:\n\n```shell\nminsu3d\n├── data\n│   ├── scannetv2\n│   │   ├── scans\n│   │   │   ├── [scene_id]\n│   │   │   │   ├── [scene_id]_vh_clean_2.ply\n│   │   │   │   ├── [scene_id]_vh_clean_2.0.010000.segs.json\n│   │   │   │   ├── [scene_id].aggregation.json\n│   │   │   │   ├── [scene_id].txt\n```\n\n2. Preprocess the data. This converts the original meshes and annotations to `.pth` files:\n```shell\npython data/scannetv2/preprocess_all_data.py data=scannetv2\n```\n\n## Training, Inference and Evaluation\nNote: Configuration files are managed by [Hydra](https://hydra.cc/); you can add or override any configuration attribute by passing it as a command-line argument.\n```shell\n# log in to WandB\nwandb login\n\n# train a model from scratch\n# available model_name: pointgroup, hais, softgroup\n# available dataset_name: scannetv2\npython train.py model={model_name} data={dataset_name} experiment_name={experiment_name}\n\n# train a model from scratch with 2 GPUs\npython train.py model={model_name} data={dataset_name} model.trainer.devices=2\n\n# train a model from a checkpoint\npython train.py model={model_name} data={dataset_name} model.ckpt_path={checkpoint_path}\n\n# test a pretrained model\npython test.py model={model_name} data={dataset_name} model.ckpt_path={pretrained_model_path}\n\n# evaluate inference results\npython eval.py model={model_name} data={dataset_name} experiment_name={experiment_name}\n\n# examples:\n# python train.py model=pointgroup data=scannetv2 model.trainer.max_epochs=480\n# python test.py model=pointgroup data=scannetv2 model.ckpt_path=PointGroup_best.ckpt\n# 
python eval.py model=hais data=scannetv2 experiment_name=run_1\n```\n\n## Pretrained Models\n\nWe provide pretrained models for ScanNet v2. The pretrained model, corresponding config file, and performance on ScanNet v2 val set are given below.  Note that all MINSU3D models are trained from scratch. After downloading a pretrained model, run `test.py` to do inference as described in the above section.\n\n### ScanNet v2 val set\n| Model      | Code | mean AP | AP 50% | AP 25% | Bbox AP 50% | Bbox AP 25% | Download |\n|:-----------|:--------|:--------|:-------|:-------|:------------|:------------|:---------|\n| MINSU3D PointGroup | [config](https://github.com/3dlg-hcvc/minsu3d-internal/blob/main/config/model/pointgroup.yaml) \\| [model](https://github.com/3dlg-hcvc/minsu3d-internal/blob/main/minsu3d/model/pointgroup.py) | 36.4 | 57.9 | 71.1 | 49.9 | 60.0 | [link](https://aspis.cmpt.sfu.ca/projects/minsu3d/pretrained_models/PointGroup_best.ckpt)|\n| [Official PointGroup](https://github.com/dvlab-research/PointGroup) | - | 35.2 | 57.1 | 71.4 | - | - | - |\n| MINSU3D HAIS | [config](https://github.com/3dlg-hcvc/minsu3d-internal/blob/main/config/model/hais.yaml) \\| [model](https://github.com/3dlg-hcvc/minsu3d-internal/blob/main/minsu3d/model/hais.py)  | 42.6 | 61.9 | 72.6 | 51.4 | 62.9 | [link](https://aspis.cmpt.sfu.ca/projects/minsu3d/pretrained_models/HAIS_best.ckpt) |\n| [Official HAIS (retrained)](https://github.com/hustvl/HAIS)  | - | 42.2 | 61.0   | 72.9 | - | - | - |\n| [Official HAIS](https://github.com/hustvl/HAIS)  | - | 44.1 | 64.4   | 75.7   | - | - | - |\n| MINSU3D SoftGroup | [config](https://github.com/3dlg-hcvc/minsu3d-internal/blob/main/config/model/softgroup.yaml) \\| [model](https://github.com/3dlg-hcvc/minsu3d-internal/blob/main/minsu3d/model/softgroup.py)  | 42.3   | 65.1 | 77.8 | 55.8 | 69.3 | [link](https://aspis.cmpt.sfu.ca/projects/minsu3d/pretrained_models/SoftGroup_best.ckpt) |\n| [Official 
SoftGroup](https://github.com/thangvubk/SoftGroup)\u003csup\u003e1\u003c/sup\u003e | - | 46.0 | 67.6   | 78.9 | 59.4 | 71.6 | - |\n\n\u003csup\u003e1\u003c/sup\u003e The official pretrained SoftGroup model was trained with a HAIS checkpoint as the pretrained backbone.\n\n\u003csup\u003e2\u003c/sup\u003e The MINSU3D HAIS model's scores are 2-3 points lower than those of the official pretrained HAIS model. To investigate, we retrained the official HAIS model using their code; the best scores we could get are 42.2 / 61.0 / 72.9 for mean AP / AP 50% / AP 25%, which match our MINSU3D HAIS model's scores.\n\n## Visualization\nWe provide scripts to visualize the predicted segmentations and bounding boxes. To use the visualization scripts, place the mesh (ply) files from the ScanNet dataset as follows.\n\n```\nminsu3d\n├── data\n│   ├── scannetv2\n│   │   ├── scans\n│   │   │   ├── [scene_id]\n│   │   │   │   ├── [scene_id]_vh_clean_2.ply\n```\n\nTo visualize the predictions, use `visualize/scannet/generate_prediction_ply.py` to generate ply files with vertices colored according to the semantic or instance predictions.\n```shell\ncd visualize/scannet\npython generate_prediction_ply.py --predict_dir {path to the predictions} --split {test/val/train} --bbox --mode {semantic/instance} --output_dir {output directory of ply files}\n\n# example:\n# python generate_prediction_ply.py --predict_dir ../../output/ScanNet/PointGroup/test/predictions/instance --split val --bbox --mode semantic --output_dir output_ply\n```\n\nThe `--mode` option allows you to specify the color mode.  \nIn the 'semantic' mode, objects with the same semantic prediction will have the same color.  \nIn the 'instance' mode, each independent object instance will have a unique color, allowing the user to check how well the model performs on instance segmentation.  
\n\nThe `--bbox` option generates a ply file that uses bounding boxes to mark the positions of objects.\n\n| Semantic Segmentation (color)              | Instance Segmentation (color)           |\n|:-----------------------------------:|:-------------------------------:|\n| \u003cimg src=\"https://github.com/3dlg-hcvc/minsu3d-internal/blob/main/visualize/example/color_semantic.png\" width=\"400\"/\u003e | \u003cimg src=\"https://github.com/3dlg-hcvc/minsu3d-internal/blob/main/visualize/example/color_instance.png\" width=\"400\"/\u003e |\n\n| Semantic Segmentation (bbox)              | Instance Segmentation (bbox)           |\n|:-----------------------------------:|:-------------------------------:|\n| \u003cimg src=\"https://github.com/3dlg-hcvc/minsu3d-internal/blob/main/visualize/example/bbox_semantic.png\" width=\"400\"/\u003e | \u003cimg src=\"https://github.com/3dlg-hcvc/minsu3d-internal/blob/main/visualize/example/bbox_instance.png\" width=\"400\"/\u003e |\n\nIf you find that many bounding boxes overlap, you can apply non-maximum suppression (NMS) during the inference phase. 
This can be achieved by adjusting `TEST_NMS_THRESH` in the config file.\n\n## Performance\n\n**Test environment**\n- CPU: Intel Core i9-9900K @ 3.60GHz × 16\n- RAM: 64GB\n- GPU: NVIDIA GeForce RTX 2080 Ti 11GB\n- System: Ubuntu 22.04.2 LTS\n\n**Training time in total (train set only, without validation)**\n| Model      | Epochs | Batch Size | MINSU3D | Official Version |\n|:-----------|:--------|:--------|:--------|:-------|\n| [PointGroup](https://github.com/dvlab-research/PointGroup) | 450 | 4 | 28hr | 51hr |\n| [HAIS](https://github.com/hustvl/HAIS)| 450 | 4 | 38hr | 60hr |\n| [SoftGroup](https://github.com/thangvubk/SoftGroup) | 256 | 4 | (to be updated) | 30hr |\n\n\n**Inference time per scene (avg)**\n| Model      | MINSU3D | Official Version |\n|:-----------|:--------|:-------|\n| [PointGroup](https://github.com/dvlab-research/PointGroup) | (to be updated) | 176ms |\n| [HAIS](https://github.com/hustvl/HAIS)| (to be updated) | 165ms |\n| [SoftGroup](https://github.com/thangvubk/SoftGroup) | (to be updated) | 204ms |\n\n## Customization\nMINSU3D allows for easy addition of custom datasets and models. All code under `minsu3d/data/dataset` and `minsu3d/model` is automatically registered and managed by [Hydra](https://github.com/facebookresearch/hydra) using configuration files under `config/data` and `config/model`, respectively. \n\n### Implement your own dataset\n1. Add a new dataset config file (.yaml) at `config/data/{your_dataset}.yaml`.\n2. Add the dataset processing code at `minsu3d/data/dataset/{your_dataset}.py`. It should inherit the `GeneralDataset()` class from `minsu3d/data/dataset/general_dataset.py`.\n\n### Implement your own model\n1. Add a new model config file (.yaml) at `config/model/{your_model}.yaml`.\n2. 
Add the model code at `minsu3d/model/{your_model}.py`. It should inherit the `GeneralModel()` class from `minsu3d/model/general_model.py`.\n\n## Acknowledgement\nThis repo is built upon [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine), [PointGroup](https://github.com/dvlab-research/PointGroup), [HAIS](https://github.com/hustvl/HAIS), and [SoftGroup](https://github.com/thangvubk/SoftGroup).  We train our models on [ScanNet](https://github.com/ScanNet/ScanNet). If you use this repo or the pretrained models, please cite the original papers.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F3dlg-hcvc%2Fminsu3d","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F3dlg-hcvc%2Fminsu3d","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F3dlg-hcvc%2Fminsu3d/lists"}