{"id":13443183,"url":"https://github.com/hailanyi/CasA","last_synced_at":"2025-03-20T16:30:36.981Z","repository":{"id":60364288,"uuid":"465680175","full_name":"hailanyi/CasA","owner":"hailanyi","description":"A Cascade Attention Network for 3D Object Detection from LiDAR point clouds","archived":false,"fork":false,"pushed_at":"2024-07-23T11:37:02.000Z","size":4388,"stargazers_count":130,"open_issues_count":14,"forks_count":26,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-10-28T06:57:40.442Z","etag":null,"topics":["3d-object-detection","casa","cascade-rcnn","kitti"],"latest_commit_sha":null,"homepage":"https://ieeexplore.ieee.org/abstract/document/9870747","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hailanyi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-03-03T10:53:17.000Z","updated_at":"2024-10-22T13:56:12.000Z","dependencies_parsed_at":"2023-12-27T03:23:39.565Z","dependency_job_id":"9efc8438-4992-4c2d-a1f3-68fa4d4400a8","html_url":"https://github.com/hailanyi/CasA","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hailanyi%2FCasA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hailanyi%2FCasA/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hailanyi%2FCasA/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hailanyi%2FCasA/manifests","owner_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/owners/hailanyi","download_url":"https://codeload.github.com/hailanyi/CasA/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244649689,"owners_count":20487469,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d-object-detection","casa","cascade-rcnn","kitti"],"created_at":"2024-07-31T03:01:57.200Z","updated_at":"2025-03-20T16:30:36.364Z","avatar_url":"https://github.com/hailanyi.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"\n# CasA: A Cascade Attention Network for 3D Object Detection from LiDAR point clouds \n\n`CasA` is a simple multi-stage 3D object detection framework based on a Cascade Attention design.\n`CasA` can be integrated into many SoTA 3D detectors and greatly improve their detection performance. \nThe paper of \"CasA: A Cascade Attention Network for 3D Object Detection from LiDAR point clouds\" can be found [here](https://ieeexplore.ieee.org/abstract/document/9870747).  \nThis code is mostly built upon [OpenPCDet](https://github.com/open-mmlab/OpenPCDet). Note that, the CasA++ is based on a transfer learning framework: pre-training on Waymo and fine-tuning on KITTI. Since additional data has been included, we did not release the CasA++ codes. 
\n\n## Overview\n- [Cascade Attention Design](#cascade-attention-design)\n- [Model Zoo](#model-zoo)\n- [Getting Started](#getting-started)\n- [Citation](#citation)\n\n\n## Cascade Attention Design\nCascade frameworks have been widely studied in\n2D object detection but less investigated in 3D\nspace. Conventional cascade structures use multiple separate sub-networks to sequentially refine\nregion proposals. Such methods, however, have\nlimited ability to measure proposal quality in all\nstages, and struggle to achieve a desirable detection\nperformance improvement in 3D space. We\npropose a new cascade framework, termed CasA,\nfor 3D object detection from point clouds. CasA\nconsists of a Region Proposal Network (RPN) and\na Cascade Refinement Network (CRN). In this\nCRN, we designed a new Cascade Attention Module that uses multiple sub-networks and attention\nmodules to aggregate the object features from different stages and progressively refine region proposals.\nCasA can be integrated into various two-stage 3D detectors and greatly improve their detection performance. \nExtensive experimental results\non the KITTI and Waymo datasets with various baseline detectors demonstrate the universality and superiority \nof CasA. In particular, based on one\nvariant of Voxel-RCNN, we achieve state-of-the-art\nresults on the KITTI 3D object detection benchmark.\n\n![framework](./docs/framework.png)\n\n## Update Log\n\n* 2022/10/15 Update a 3D multi-object tracker [CasTrack](https://github.com/hailanyi/3D-Multi-Object-Tracker) based on the CasA detections, which currently **ranks first** on the KITTI tracking leader-board :fire:!\n\n* 2022/9/30 Update details of [installation](#installation). Update [environment](#environment-we-tested) we tested. 
Update [Spconv2.X](https://github.com/traveller59/spconv) support :rocket:!\n\n* 2022/3/3 Initial update, achieving SOTA performance on the KITTI 3D detection leader-board\n\n## Model Zoo\n\n### KITTI 3D Object Detection Results\nThe tables below report 3D detection performance at moderate difficulty on the *val* set of the KITTI dataset.\nCurrently, this repo supports CasA-PV, CasA-V, CasA-T and CasA-PV2. The base detectors are \nPV-RCNN, Voxel-RCNN, CT3D and PV-RCNN++, respectively.\n* All released models are trained on two 3090 GPUs and are available for download. \n* These models are not suitable for directly reporting results on the KITTI *test* set; please use a slightly lower score threshold and \ntrain the models on all or 80% of the training data to achieve desirable performance on the KITTI *test* set.\n\n#### PV-RCNN VS. CasA-PV\n|               Detectors               | Car(R11/R40) | Pedestrian(R11/R40) | Cyclist(R11/R40)  | download |\n|:---------------------------------------------:|:-------:|:-------:|:-------:|:---------:|\n| [PV-RCNN baseline](https://github.com/open-mmlab/OpenPCDet) | 83.90/84.83 | 57.90/56.67 | 70.47/71.95 |   | \n| [CasA-PV](tools/cfgs/kitti_models/CasA-PV.yaml) | **86.18/85.86** | **58.90/59.17** | 66.01/69.09 | [model-44M](https://drive.google.com/file/d/1QolF8lkGwlJDpN3MV7-Y5MdhBCROJnfC/view?usp=sharing) | \n\n#### Voxel-RCNN VS. CasA-V\n|               Detectors               | Car(R11/R40) | Pedestrian(R11/R40) | Cyclist(R11/R40)  | download |\n|:---------------------------------------------:|:-------:|:-------:|:-------:|:---------:|\n| [Voxel-RCNN baseline](https://github.com/open-mmlab/OpenPCDet) | 84.52/85.29 | 61.72/60.97 | 71.48/72.54 |   | \n| [CasA-V](tools/cfgs/kitti_models/CasA-V.yaml)   | **86.54/86.30** | **67.93/66.54** | **74.27/73.08** | [model-44M](https://drive.google.com/file/d/13LO8BAz0h1MbXg97i8k18pHfWGxXEjFP/view?usp=sharing) |\n\n#### CT3D VS. 
CasA-T\n|               Detectors               | Car(R11/R40) | Pedestrian(R11/R40) | Cyclist(R11/R40)  | download |\n|:---------------------------------------------:|:-------:|:-------:|:-------:|:---------:|\n| [CT3D3cat baseline](https://github.com/hlsheng1/CT3D) | 84.97/85.04 | 56.28/55.58 | 71.71/71.88 |   | \n| [CasA-T](tools/cfgs/kitti_models/CasA-T.yaml)   | **86.76/86.44** | **60.91/62.53** | **73.36**/71.83 | [model-22M](https://drive.google.com/file/d/1pZ4xIa7aTPwAgxUDcbE7b_edctLVXQbb/view?usp=sharing)| \n\n#### PV-RCNN++ VS. CasA-PV2\n|               Detectors               | Car(R11/R40) | Pedestrian(R11/R40) | Cyclist(R11/R40)  | download |\n|:---------------------------------------------:|:-------:|:-------:|:-------:|:---------:|\n| *[PV-RCNN++ baseline](https://github.com/open-mmlab/OpenPCDet) | 85.36/85.50 | 57.43/57.15 | 71.30/71.85 |   | \n| [CasA-PV2](tools/cfgs/kitti_models/CasA-PV2.yaml)   | **86.32/86.10** | **59.50/60.54** | **72.74/73.16** | [model-47M](https://drive.google.com/file/d/1POWX2ruds3t0XOSvBz5-VmG67c4F9mfE/view?usp=sharing) | \n\nWhere * denotes results reproduced with a simplified version of their open-source code. \n\n### Waymo Open Dataset Results\nHere we provide two models on the Waymo Open Dataset (WOD), where CasA-V-Center denotes that a center-based RPN is used.\nAll models are trained with **a single frame**  on 8 V100 GPUs, and each cell reports mAP/mAPH calculated by the official Waymo evaluation metrics on the **whole** validation set (version 1.2).    
\n\n|    100\\% Data, 2 returns        | Vec_L1 | Vec_L2 | Ped_L1 | Ped_L2 | Cyc_L1 | Cyc_L2 |  \n|:---------------------------------------------:|----------:|:-------:|:-------:|:-------:|:-------:|:-------:|\n| *[Voxel-RCNN baseline](https://github.com/open-mmlab/OpenPCDet)|77.43/76.71| 68.73/68.24 | 76.37/68.21 | 67.92/60.40 | 68.74/67.56 | 66.46/65.35 |\n| [CasA-V](tools/cfgs/waymo_models/CasA-V.yaml)|78.54/78.00| 69.91/69.42 | 80.88/73.10 | 71.87/64.78 | 69.66/68.38 | 67.07/66.83 |\n| [CasA-V-Center](tools/cfgs/waymo_models/CasA-V-Center.yaml) |**78.62/78.04** | **69.94/69.47** | **81.76/75.69** | **72.75/67.21** | **72.47/71.18** | **70.20/68.94**|\n\nWhere * denotes results reproduced with their open-source code.\n\nWe cannot provide the above pretrained models due to the [Waymo Dataset License Agreement](https://waymo.com/open/terms/), \nbut you can achieve similar performance by training with the default configs.\n\n## Getting Started\n```\nconda create -n spconv2 python=3.9\nconda activate spconv2\npip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html\npip install numpy==1.19.5 protobuf==3.19.4 scikit-image==0.19.2 waymo-open-dataset-tf-2-5-0 nuscenes-devkit==1.0.5 spconv-cu111 numba scipy pyyaml easydict fire tqdm shapely matplotlib opencv-python addict pyquaternion awscli open3d pandas future pybind11 tensorboardX tensorboard Cython prefetch-generator\n```\n### Environment we tested\n\nOur released implementation is tested on:\n+ Ubuntu 18.04\n+ Python 3.6.9 \n+ PyTorch 1.8.1\n+ Numba 0.53.1\n+ [Spconv 1.2.1](https://github.com/traveller59/spconv/tree/8da6f967fb9a054d8870c3515b1b44eca2103634)\n+ NVIDIA CUDA 11.1\n+ 8x Tesla V100 GPUs\n\nWe also tested on:\n+ Ubuntu 18.04\n+ Python 3.9.13 \n+ PyTorch 1.8.1\n+ Numba 0.53.1\n+ [Spconv 2.1.22](https://github.com/traveller59/spconv) (installed with `pip install spconv-cu111`)\n+ NVIDIA CUDA 11.1 \n+ 2x 3090 GPUs\n\n### Prepare Dataset 
\n\n#### KITTI Dataset\n\n* Please download the official [KITTI 3D object detection](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) dataset and organize the downloaded files as follows (the road planes, which are optional and used for data augmentation during training, can be downloaded from [[road plane]](https://drive.google.com/file/d/1d5mq0RXRnvHPVeKx6Q612z0YRO1t2wAp/view?usp=sharing)):\n\n```\nCasA\n├── data\n│   ├── kitti\n│   │   │── ImageSets\n│   │   │── training\n│   │   │   ├──calib \u0026 velodyne \u0026 label_2 \u0026 image_2 \u0026 (optional: planes)\n│   │   │── testing\n│   │   │   ├──calib \u0026 velodyne \u0026 image_2\n├── pcdet\n├── tools\n```\n\nRun the following command to create the dataset infos:\n```\npython3 -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml\n```\n\n\n\n#### Waymo Dataset\n\n```\nCasA\n├── data\n│   ├── waymo\n│   │   │── ImageSets\n│   │   │── raw_data\n│   │   │   │── segment-xxxxxxxx.tfrecord\n│   │   │   │── ...\n│   │   │── waymo_processed_data_train_val_test\n│   │   │   │── segment-xxxxxxxx/\n│   │   │   │── ...\n│   │   │── pcdet_waymo_track_dbinfos_train_cp.pkl\n│   │   │── waymo_infos_test.pkl\n│   │   │── waymo_infos_train.pkl\n│   │   │── waymo_infos_val.pkl\n├── pcdet\n├── tools\n```\n\nRun the following command to create the dataset infos:\n```\npython3 -m pcdet.datasets.waymo.waymo_tracking_dataset --cfg_file tools/cfgs/dataset_configs/waymo_tracking_dataset.yaml \n```\n\n#### Installation\n\n```\ngit clone https://github.com/hailanyi/CasA.git\ncd CasA\npython3 setup.py develop\n```\n\n### Training and Evaluation\n\n#### Evaluation\n\n```\ncd tools\npython3 test.py --cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --ckpt ${CKPT}\n```\n\nFor example, to test the CasA-V model:\n\n```\ncd tools\npython3 test.py --cfg_file cfgs/kitti_models/CasA-V.yaml --ckpt CasA-V.pth\n```\n\nMulti-GPU testing: modify the GPU number in dist_test.sh and 
run\n```\nsh dist_test.sh \n```\nThe log is saved to log-test.txt.\nYou can run ```cat log-test.txt``` to view the test results.\n\n#### Training\n\n```\ncd tools\npython3 train.py --cfg_file ${CONFIG_FILE}\n```\n\nFor example, to train the CasA-V model:\n\n```\ncd tools\npython3 train.py --cfg_file cfgs/kitti_models/CasA-V.yaml\n```\n\nMulti-GPU training: modify the GPU number in dist_train.sh and run\n```\nsh dist_train.sh\n```\nThe log is saved to log.txt.\nYou can run ```cat log.txt``` to view the training process.\n\n## Acknowledgement\nThis repo is developed from `OpenPCDet 0.3`. We thank Shaoshuai Shi for his implementation of [OpenPCDet](https://github.com/open-mmlab/OpenPCDet).   \n\n## Citation \nIf you find this project useful in your research, please consider citing:\n\n```\n@article{casa2022,\n    title={CasA: A Cascade Attention Network for 3D Object Detection from LiDAR point clouds},\n    author={Wu, Hai and Deng, Jinhao and Wen, Chenglu and Li, Xin and Wang, Cheng},\n    journal={IEEE Transactions on Geoscience and Remote Sensing},\n    year={2022}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhailanyi%2FCasA","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhailanyi%2FCasA","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhailanyi%2FCasA/lists"}