{"id":13443629,"url":"https://github.com/SamsungLabs/imvoxelnet","last_synced_at":"2025-03-20T17:30:45.985Z","repository":{"id":38338726,"uuid":"322816005","full_name":"SamsungLabs/imvoxelnet","owner":"SamsungLabs","description":"[WACV2022] ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection","archived":false,"fork":false,"pushed_at":"2023-09-25T15:42:00.000Z","size":6016,"stargazers_count":278,"open_issues_count":11,"forks_count":29,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-10-28T07:39:30.396Z","etag":null,"topics":["3d-object-detection","imvoxelnet","kitti","mmdetection","nuscenes","object-detection","pytorch","scannet","sun-rgbd"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SamsungLabs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2020-12-19T09:57:38.000Z","updated_at":"2024-10-23T20:19:14.000Z","dependencies_parsed_at":"2022-08-09T03:01:00.787Z","dependency_job_id":"f6173579-5e8f-40d7-9989-84acbd7de9c4","html_url":"https://github.com/SamsungLabs/imvoxelnet","commit_stats":null,"previous_names":["saic-vul/imvoxelnet"],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamsungLabs%2Fimvoxelnet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamsungLabs%2Fimvoxelnet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamsungLabs%2Fimvoxelnet/
releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamsungLabs%2Fimvoxelnet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SamsungLabs","download_url":"https://codeload.github.com/SamsungLabs/imvoxelnet/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244660153,"owners_count":20489296,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d-object-detection","imvoxelnet","kitti","mmdetection","nuscenes","object-detection","pytorch","scannet","sun-rgbd"],"created_at":"2024-07-31T03:02:05.673Z","updated_at":"2025-03-20T17:30:44.338Z","avatar_url":"https://github.com/SamsungLabs.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/imvoxelnet-image-to-voxels-projection-for/monocular-3d-object-detection-on-sun-rgb-d)](https://paperswithcode.com/sota/monocular-3d-object-detection-on-sun-rgb-d?p=imvoxelnet-image-to-voxels-projection-for)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/imvoxelnet-image-to-voxels-projection-for/room-layout-estimation-on-sun-rgb-d)](https://paperswithcode.com/sota/room-layout-estimation-on-sun-rgb-d?p=imvoxelnet-image-to-voxels-projection-for)\n\n# ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection\n\n**News**:\n * :fire: August, 2022. 
`ImVoxelNet` for `SUN RGB-D` is [now](https://github.com/open-mmlab/mmdetection3d/pull/1738) [supported](https://github.com/open-mmlab/mmdetection3d/tree/master/configs/imvoxelnet) in [mmdetection3d](https://github.com/open-mmlab/mmdetection3d).\n * :fire: October, 2021. Our paper is accepted at [WACV 2022](https://wacv2022.thecvf.com). We simplify the 3D neck to make indoor models much faster and more accurate. For example, this improves `ScanNet` `mAP` by more than 2%. Please find updated configs in [configs/imvoxelnet/*_fast.py](https://github.com/saic-vul/imvoxelnet/tree/master/configs/imvoxelnet) and [models](https://github.com/saic-vul/imvoxelnet/releases/tag/v1.2).\n * :fire: August, 2021. We adapt center sampling for indoor detection. For example, this improves `ScanNet` `mAP` by more than 5%. Please find updated configs in [configs/imvoxelnet/*_top27.py](https://github.com/saic-vul/imvoxelnet/tree/master/configs/imvoxelnet) and [models](https://github.com/saic-vul/imvoxelnet/releases/tag/v1.1).\n * :fire: July, 2021. We update `ScanNet` image preprocessing both [here](https://github.com/saic-vul/imvoxelnet/pull/21) and in [mmdetection3d](https://github.com/open-mmlab/mmdetection3d/pull/696).\n * :fire: June, 2021. 
`ImVoxelNet` for `KITTI` is now [supported](https://github.com/open-mmlab/mmdetection3d/tree/master/configs/imvoxelnet) in [mmdetection3d](https://github.com/open-mmlab/mmdetection3d).\n\nThis repository contains an implementation of the monocular/multi-view 3D object detector ImVoxelNet, introduced in our paper:\n\n\u003e **ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection**\u003cbr\u003e\n\u003e [Danila Rukhovich](https://github.com/filaPro),\n\u003e [Anna Vorontsova](https://github.com/highrut),\n\u003e [Anton Konushin](https://scholar.google.com/citations?user=ZT_k-wMAAAAJ)\n\u003e \u003cbr\u003e\n\u003e Samsung Research\u003cbr\u003e\n\u003e https://arxiv.org/abs/2106.01178\n\n\u003cp align=\"center\"\u003e\u003cimg src=\"./resources/scheme.png\" alt=\"drawing\" width=\"90%\"/\u003e\u003c/p\u003e\n\n### Installation\nFor convenience, we provide a [Dockerfile](docker/Dockerfile). Alternatively, you can install all required packages manually.\n\nThis implementation is based on the [mmdetection3d](https://github.com/open-mmlab/mmdetection3d) framework.\nPlease refer to the original installation guide [install.md](docs/install.md), replacing `open-mmlab/mmdetection3d` with `saic-vul/imvoxelnet`.\nAlso, [rotated_iou](https://github.com/lilanxiao/Rotated_IoU) should be installed with [these](https://github.com/saic-vul/imvoxelnet/blob/master/docker/Dockerfile#L31-L34) 4 commands.\n\nMost of the `ImVoxelNet`-related code is located in the following files: \n[detectors/imvoxelnet.py](mmdet3d/models/detectors/imvoxelnet.py),\n[necks/imvoxelnet.py](mmdet3d/models/necks/imvoxelnet.py),\n[dense_heads/imvoxel_head.py](mmdet3d/models/dense_heads/imvoxel_head.py),\n[pipelines/multi_view.py](mmdet3d/datasets/pipelines/multi_view.py).\n\n### Datasets\n\nWe support three benchmarks based on the **SUN RGB-D** dataset.\n * For the [VoteNet](https://github.com/facebookresearch/votenet) benchmark with 10 object categories, \n   you should 
follow the instructions in [sunrgbd](data/sunrgbd). \n * For the [PerspectiveNet](https://papers.nips.cc/paper/2019/hash/b87517992f7dce71b674976b280257d2-Abstract.html)\n   benchmark with 30 object categories, the same instructions can be applied; \n   you only need to set the `dataset` argument to `sunrgbd_monocular` when running `create_data.py`.\n * The [Total3DUnderstanding](https://github.com/yinyunie/Total3DUnderstanding)\n   benchmark involves detecting objects of 37 categories along with camera pose and room layout estimation.\n   Download the preprocessed data as \n   [train.json](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/sunrgbd_total_infos_train.json) and \n   [val.json](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/sunrgbd_total_infos_val.json) \n   and put them in `./data/sunrgbd`. Then run:\n   ```shell\n   python tools/data_converter/sunrgbd_total.py\n   ```\n\nFor **ScanNet**, please follow the instructions in [scannet](data/scannet).\nFor **KITTI** and **nuScenes**, please follow the instructions in [getting_started.md](docs/getting_started.md).\n\n### Getting Started\n\nPlease see [getting_started.md](docs/getting_started.md) for basic usage examples.\n\n**Training**\n\nTo start training, run [dist_train](tools/dist_train.sh) with `ImVoxelNet` [configs](configs/imvoxelnet):\n```shell\nbash tools/dist_train.sh configs/imvoxelnet/imvoxelnet_kitti.py 8\n```\n\n**Testing**\n\nTest a pre-trained model using [dist_test](tools/dist_test.sh) with `ImVoxelNet` [configs](configs/imvoxelnet):\n```shell\nbash tools/dist_test.sh configs/imvoxelnet/imvoxelnet_kitti.py \\\n    work_dirs/imvoxelnet_kitti/latest.pth 8 --eval mAP\n```\n\n**Visualization**\n\nVisualizations can be created with the [test](tools/test.py) script. 
\nFor better visualizations, you may set `score_thr` in configs to `0.15` or more:\n```shell\npython tools/test.py configs/imvoxelnet/imvoxelnet_kitti.py \\\n    work_dirs/imvoxelnet_kitti/latest.pth --show \\\n    --show-dir work_dirs/imvoxelnet_kitti\n```\n\n### Models\n\n`v2` adds center sampling for indoor scenario. `v3` simplifies 3d neck for indoor scenario. Differences are discussed in [v2](https://arxiv.org/abs/2106.01178v2) and [v3](https://arxiv.org/abs/2106.01178v3) preprints.\n\n| Dataset   | Object Classes | Version | Download |\n|:---------:|:--------------:|:-------:|:--------:|\n| SUN RGB-D | 37 from \u003cbr\u003e Total3dUnderstanding | v1 \u0026#124; mAP@0.15: 41.5 \u003cbr\u003e v2 \u0026#124; mAP@0.15: 42.7 \u003cbr\u003e v3 \u0026#124; mAP@0.15: 43.7 | [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/20210525_091810.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/20210525_091810_atlas_total_sunrgbd.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_total_sunrgbd.py) \u003cbr\u003e [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.1/20210808_005013.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.1/20210808_005013_imvoxelnet_total_sunrgbd_top27.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_total_sunrgbd_top27.py) \u003cbr\u003e [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.2/20211007_105247.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.2/20211007_105247_imvoxelnet_total_sunrgbd_fast.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_total_sunrgbd_fast.py)|\n| SUN RGB-D | 30 from \u003cbr\u003e PerspectiveNet | v1 \u0026#124; mAP@0.15: 44.9 \u003cbr\u003e v2 \u0026#124;  mAP@0.15: 47.2 \u003cbr\u003e v3 \u0026#124; mAP@0.15: 48.7 | [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/20210526_072029.pth) \u0026#124; 
[log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/20210526_072029_atlas_perspective_sunrgbd.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_perspective_sunrgbd.py) \u003cbr\u003e [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.1/20210809_114832.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.1/20210809_114832_imvoxelnet_perspective_sunrgbd_top27.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_perspective_sunrgbd_top27.py) \u003cbr\u003e [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.2/20211007_105254.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.2/20211007_105254_imvoxelnet_perspective_sunrgbd_fast.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_perspective_sunrgbd_fast.py)|\n| SUN RGB-D | 10 from VoteNet | v1 \u0026#124; mAP@0.25: 38.8 \u003cbr\u003e v2 \u0026#124;  mAP@0.25: 39.4 \u003cbr\u003e v3 \u0026#124; mAP@0.25: 40.7 | [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/20210428_124351.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/20210428_124351_atlas_sunrgbd.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_sunrgbd.py) \u003cbr\u003e [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.1/20210809_112435.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.1/20210809_112435_imvoxelnet_sunrgbd_top27.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_sunrgbd_top27.py) \u003cbr\u003e [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.2/20211007_105255.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.2/20211007_105255_imvoxelnet_sunrgbd_fast.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_sunrgbd_fast.py)|\n| ScanNet   | 18 from VoteNet | v1 \u0026#124; mAP@0.25: 40.6 \u003cbr\u003e v2 \u0026#124;  mAP@0.25: 45.7 \u003cbr\u003e v3 
\u0026#124; mAP@0.25: 48.1 | [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/20210520_223109.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/20210520_223109_atlas_scannet.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_scannet.py) \u003cbr\u003e [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.1/20210808_070616.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.1/20210808_070616_imvoxelnet_scannet_top27.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_scannet_top27.py) \u003cbr\u003e [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.2/20211007_113826.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.2/20211007_113826_imvoxelnet_scannet_fast.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_scannet_fast.py)|\n| KITTI     | Car | v1 \u0026#124; AP@0.7: 17.8 | [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/20210503_214214.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/20210503_214214_atlas_kitti.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_kitti.py) |\n| nuScenes  | Car | v1 \u0026#124; AP: 51.8 | [model](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/20210505_131108.pth) \u0026#124; [log](https://github.com/saic-vul/imvoxelnet/releases/download/v1.0/20210505_131108_atlas_nuscenes.log) \u0026#124; [config](configs/imvoxelnet/imvoxelnet_nuscenes.py) |\n\n### Example Detections\n\n\u003cp align=\"center\"\u003e\u003cimg src=\"./resources/github.png\" alt=\"drawing\" width=\"90%\"/\u003e\u003c/p\u003e\n\n### Citation\n\nIf you find this work useful for your research, please cite our paper:\n```\n@inproceedings{rukhovich2022imvoxelnet,\n  title={Imvoxelnet: Image to voxels projection for monocular and multi-view general-purpose 3d object detection},\n  author={Rukhovich, Danila and Vorontsova, Anna and 
Konushin, Anton},\n  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},\n  pages={2397--2406},\n  year={2022}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSamsungLabs%2Fimvoxelnet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FSamsungLabs%2Fimvoxelnet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSamsungLabs%2Fimvoxelnet/lists"}