{"id":29219708,"url":"https://github.com/dvlab-research/voxelnext","last_synced_at":"2025-07-03T02:06:44.989Z","repository":{"id":143760373,"uuid":"614755832","full_name":"dvlab-research/VoxelNeXt","owner":"dvlab-research","description":"VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking (CVPR 2023)","archived":false,"fork":false,"pushed_at":"2023-06-03T07:08:08.000Z","size":8003,"stargazers_count":766,"open_issues_count":39,"forks_count":67,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-03-20T13:33:24.772Z","etag":null,"topics":["3d-multi-object-tracking","3d-object-detection","argoverse","autonomous-driving","lidar","nuscenes","waymo-open-dataset"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2303.11301","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dvlab-research.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-03-16T08:55:59.000Z","updated_at":"2025-03-19T11:08:58.000Z","dependencies_parsed_at":"2024-01-16T02:56:31.471Z","dependency_job_id":null,"html_url":"https://github.com/dvlab-research/VoxelNeXt","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dvlab-research/VoxelNeXt","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvlab-research%2FVoxelNeXt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvlab-research%2FVoxelNeXt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvlab-research%2FVoxelNeXt/releases","manifests_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvlab-research%2FVoxelNeXt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dvlab-research","download_url":"https://codeload.github.com/dvlab-research/VoxelNeXt/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvlab-research%2FVoxelNeXt/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263245317,"owners_count":23436514,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d-multi-object-tracking","3d-object-detection","argoverse","autonomous-driving","lidar","nuscenes","waymo-open-dataset"],"created_at":"2025-07-03T02:06:42.756Z","updated_at":"2025-07-03T02:06:44.944Z","avatar_url":"https://github.com/dvlab-research.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/voxelnext-fully-sparse-voxelnet-for-3d-object-1/3d-object-detection-on-argoverse2)](https://paperswithcode.com/sota/3d-object-detection-on-argoverse2?p=voxelnext-fully-sparse-voxelnet-for-3d-object-1)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/voxelnext-fully-sparse-voxelnet-for-3d-object-1/3d-multi-object-tracking-on-nuscenes-lidar)](https://paperswithcode.com/sota/3d-multi-object-tracking-on-nuscenes-lidar?p=voxelnext-fully-sparse-voxelnet-for-3d-object-1)\n\n# VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking (CVPR 2023)\n\nThis is the official implementation of 
***VoxelNeXt*** (CVPR 2023). VoxelNeXt is a clean, simple, and fully-sparse 3D object detector. The core idea is to predict objects directly upon sparse voxel features. No sparse-to-dense conversion, anchors, or center proxies are needed anymore.\nFor more details, please refer to:\n\n**VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking [[Paper](https://arxiv.org/abs/2303.11301)]** \u003cbr /\u003e\n[Yukang Chen](https://scholar.google.com/citations?user=6p0ygKUAAAAJ\u0026hl=en), [Jianhui Liu](https://scholar.google.com/citations?user=n1JW-jYAAAAJ\u0026hl=en), [Xiangyu Zhang](https://scholar.google.com/citations?user=yuB-cfoAAAAJ\u0026hl=zh-CN), [Xiaojuan Qi](https://scholar.google.com/citations?user=bGn0uacAAAAJ\u0026hl=en), [Jiaya Jia](https://scholar.google.com/citations?user=XPAkzTEAAAAJ\u0026hl=en)\u003cbr /\u003e\n\n\u003cp align=\"center\"\u003e \u003cimg src=\"docs/VoxelNeXt-Pipeline.png\" width=\"100%\"\u003e \u003c/p\u003e\n\n## News\n- [2023-04-23] We update the [Argoverse2 dataset code](https://github.com/dvlab-research/VoxelNeXt/blob/master/pcdet/datasets/argo2/argo2_dataset.py). For Argoverse2, [document](https://github.com/dvlab-research/VoxelNeXt/blob/master/docs/GETTING_STARTED.md#argoverse2-dataset) and the [pre-train weight](https://drive.google.com/file/d/1YP2UOz-yO-cWfYQkIqILEu6bodvCBVrR/view?usp=share_link) are updated.\n- [2023-04-19] We merged VoxelNeXt into [Grounded-Segment-Anything](https://github.com/IDEA-Research/Grounded-Segment-Anything).\n- [2023-04-16] We released an [example config file](https://github.com/dvlab-research/VoxelNeXt/blob/master/tools/cfgs/kitti_models/voxelnext.yaml) to train VoxelNeXt on KITTI.\n- [2023-04-14] We combine VoxelNeXt and [Segment Anything](https://github.com/facebookresearch/segment-anything) in [3D-Box-Segment-Anything](https://github.com/dvlab-research/3D-Box-Segment-Anything). 
It extends [Segment Anything](https://github.com/facebookresearch/segment-anything) to 3D perception and enables promptable 3D object detection.\n- [2023-04-03] VoxelNeXt is merged into the official [OpenPCDet](https://github.com/open-mmlab/OpenPCDet) codebase.\n- [2023-01-28] VoxelNeXt achieved the SOTA performance on the Argoverse2 3D object detection.\n- [2022-11-11] VoxelNeXt achieved 1st on the nuScenes LiDAR tracking leaderboard. \n\n### Experimental results\n\n| nuScenes Detection      |  Set |  mAP |  NDS |   Download  |\n|---------------|:----:|:----:|:----:|:-----------:|\n| [VoxelNeXt](tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml)     |  val | 60.5 | 66.6 | [Pre-trained](https://drive.google.com/file/d/1IV7e7G9X-61KXSjMGtQo579pzDNbhwvf/view?usp=share_link) |\n| [VoxelNeXt](tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml)     | test | 64.5 | 70.0 |  [Submission](https://drive.google.com/file/d/1wNVjxyTuCE3F88GT_TZSgBgdmkA61Fsi/view?usp=share_link) |\n| [+double-flip](tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext_doubleflip.yaml) | test | 66.2 | 71.4 |  [Submission](https://drive.google.com/file/d/1sSkLBrWGm_rMB73cNHojGyQtz-hLBBTH/view?usp=share_link) |\n\n| nuScenes Tracking |  Set |  AMOTA |  AMOTP |   Download  |\n|---------------|:----:|:----:|:----:|:-----------:|\n| [VoxelNeXt](tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml)     | val | 70.2 | 64.0 |  [Results](https://drive.google.com/file/d/1_9maBWKJ3oDdUMBB_ee76Cq34GJoGyBx/view?usp=share_link) |\n| [VoxelNeXt](tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml)     | test | 69.5 | 56.8 |  [Submission](https://drive.google.com/file/d/1gq-vz5ix_aw4IPc0N3To15IS-bLa1b50/view?usp=share_link) |\n| [+double-flip](tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext_doubleflip.yaml) | test | 71.0 | 51.1 |  [Submission](https://drive.google.com/file/d/1wg4Iica6WbPp_NrNoXI3-J1-ggQ2_cW3/view?usp=share_link) |\n\n|  Argoverse2  |  mAP | Download | 
\n|---------------------------------------------|:----------:|:-------:|\n| [VoxelNeXt](tools/cfgs/argo2_models/cbgs_voxel01_voxelnext.yaml) | 30.5 | [Pre-trained](https://drive.google.com/file/d/1YP2UOz-yO-cWfYQkIqILEu6bodvCBVrR/view?usp=share_link) | \n\n|    Waymo  | Vec_L1 | Vec_L2 | Ped_L1 | Ped_L2 | Cyc_L1 | Cyc_L2 |  \n|---------------------------------------------|:----------:|:-------:|:-------:|:-------:|:-------:|:-------:|\n| [VoxelNeXt-2D](tools/cfgs/waymo_models/voxelnext2d_ioubranch.yaml) | 77.94/77.47\t|69.68/69.25\t|80.24/73.47\t|72.23/65.88\t|73.33/72.20\t|70.66/69.56 | \n| [VoxelNeXt-K3](tools/cfgs/waymo_models/voxelnext_ioubranch_large.yaml) | 78.16/77.70\t|69.86/69.42\t|81.47/76.30\t|73.48/68.63\t|76.06/74.90\t|73.29/72.18 |\n\n- We cannot release the pre-trained models of VoxelNeXt on the Waymo dataset due to the [license of WOD](https://waymo.com/open/terms).\n- For the Waymo dataset, VoxelNeXt-K3 is an enhanced version of VoxelNeXt with a larger model size.\n- During inference, VoxelNeXt can work with either [sparse-max-pooling](tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext_maxpool.yaml) or NMS post-processing. To use sparse-max-pooling inference, please install our [spconv-plus](https://github.com/dvlab-research/spconv-plus) library. Otherwise, use NMS post-processing, which is the default.\n\n\u003cp align=\"center\"\u003e \u003cimg src=\"docs/sequence-v2.gif\" width=\"100%\"\u003e \u003c/p\u003e\n\n## Getting Started\n### Installation\n\n#### a. Clone this repository\n```shell\ngit clone https://github.com/dvlab-research/VoxelNeXt \u0026\u0026 cd VoxelNeXt\n```\n#### b. Install the environment\n\nFollow the installation instructions for [OpenPCDet](docs/INSTALL.md).\n\n#### c. Prepare the datasets. 
\n\nFor nuScenes, Waymo, and Argoverse2 datasets, please follow the [document](https://github.com/open-mmlab/OpenPCDet/blob/master/docs/GETTING_STARTED.md) in OpenPCDet.\n\n### Evaluation\nWe provide pre-trained weight files, so you can run evaluation directly. You can also evaluate a model you trained yourself.\n\n```shell\ncd tools\nbash scripts/dist_test.sh NUM_GPUS --cfg_file PATH_TO_CONFIG_FILE --ckpt PATH_TO_MODEL\n# For example,\nbash scripts/dist_test.sh 8 --cfg_file PATH_TO_CONFIG_FILE --ckpt PATH_TO_MODEL\n```\n\n\n### Training\n\n```shell\nbash scripts/dist_train.sh NUM_GPUS --cfg_file PATH_TO_CONFIG_FILE\n# For example,\nbash scripts/dist_train.sh 8 --cfg_file PATH_TO_CONFIG_FILE\n```\n\n## Citation \nIf you find this project useful in your research, please consider citing:\n\n```\n@inproceedings{chen2023voxenext,\n  title={VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking},\n  author={Yukang Chen and Jianhui Liu and Xiangyu Zhang and Xiaojuan Qi and Jiaya Jia},\n  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},\n  year={2023}\n}\n\n```\n\nAn introduction video is available on YouTube:\n[![VoxelNeXt introduction video](https://img.youtube.com/vi/sXw71BCGWEo/0.jpg)](https://www.youtube.com/watch?v=sXw71BCGWEo \"VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking (CVPR 2023)\")\n\n## Acknowledgement\n-  This work is built upon [OpenPCDet](https://github.com/open-mmlab/OpenPCDet) and [spconv](https://github.com/traveller59/spconv). \n-  This work is motivated by [FSD](https://arxiv.org/abs/2207.10035). 
We also follow FSD for the Argoverse2 data processing.\n\n## Our Works in LiDAR-based Autonomous Driving\n- **VoxelNeXt (CVPR 2023)** [[Paper]](https://arxiv.org/abs/2303.11301) [[Code]](https://github.com/dvlab-research/VoxelNeXt) Fully Sparse VoxelNet for 3D Object Detection and Tracking.\n- **Focal Sparse Conv (CVPR 2022 Oral)** [[Paper]](https://arxiv.org/abs/2204.12463) [[Code]](https://github.com/dvlab-research/FocalsConv) Dynamic sparse convolution for high performance.\n- **Spatial Pruned Conv (NeurIPS 2022)** [[Paper]](https://arxiv.org/abs/2209.14201) [[Code]](https://github.com/CVMI-Lab/SPS-Conv) 50% FLOPs saving for efficient 3D object detection.\n- **LargeKernel3D (CVPR 2023)** [[Paper]](https://arxiv.org/abs/2206.10555) [[Code]](https://github.com/dvlab-research/LargeKernel3D) Large-kernel 3D sparse CNN backbone.\n- **SphereFormer (CVPR 2023)** [[Paper]](https://arxiv.org/abs/2303.12766) [[Code]](https://github.com/dvlab-research/SphereFormer) Spherical window 3D transformer backbone.\n- [spconv-plus](https://github.com/dvlab-research/spconv-plus) A library that integrates our works into [spconv](https://github.com/traveller59/spconv).\n- [SparseTransformer](https://github.com/dvlab-research/SparseTransformer) A library that includes high-efficiency transformer implementations for sparse point cloud or voxel data.\n\n\n## License\n\nThis project is released under the [Apache 2.0 license](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdvlab-research%2Fvoxelnext","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdvlab-research%2Fvoxelnext","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdvlab-research%2Fvoxelnext/lists"}