{"id":13442183,"url":"https://github.com/fcjian/AeDet","last_synced_at":"2025-03-20T13:32:52.865Z","repository":{"id":147969839,"uuid":"567271475","full_name":"fcjian/AeDet","owner":"fcjian","description":"AeDet: Azimuth-invariant Multi-view 3D Object Detection, CVPR2023","archived":false,"fork":false,"pushed_at":"2023-06-17T15:22:10.000Z","size":5714,"stargazers_count":72,"open_issues_count":6,"forks_count":5,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-08-01T03:39:10.725Z","etag":null,"topics":["3d-object-detection","multi-view","vision-based-perception"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fcjian.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2022-11-17T12:48:51.000Z","updated_at":"2024-07-05T13:02:11.000Z","dependencies_parsed_at":"2024-01-16T02:46:35.099Z","dependency_job_id":"e0a1e2d4-5267-446d-8b1c-f67e3b87ef88","html_url":"https://github.com/fcjian/AeDet","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fcjian%2FAeDet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fcjian%2FAeDet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fcjian%2FAeDet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fcjian%2FAeDet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fcjian","download_url":"https://codeload.github.com/fcjian/AeDet/tar.gz/refs
/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":221768488,"owners_count":16877647,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d-object-detection","multi-view","vision-based-perception"],"created_at":"2024-07-31T03:01:42.650Z","updated_at":"2025-03-20T13:32:52.857Z","avatar_url":"https://github.com/fcjian.png","language":"Python","funding_links":[],"categories":["Python","3. Perception"],"sub_categories":["3.1.2 Multi Sensor Fusion"],"readme":"\n# AeDet: Azimuth-invariant Multi-view 3D Object Detection \n[Paper](https://arxiv.org/abs/2211.12501) \u0026nbsp; \u0026nbsp; [Website](https://fcjian.github.io/aedet)\n\n## News\nAeDet achieves SOTA on Camera-Only nuScenes Detection Task with 53.1% mAP and 62.0% NDS!\n\n## Introduction\nRecent LSS-based multi-view 3D object detection has made tremendous progress by processing features in Bird's-Eye-View (BEV) with a convolutional detector. However, typical convolution ignores the radial symmetry of the BEV features and makes the detector harder to optimize. To preserve the inherent property of the BEV features and ease the optimization, we propose an azimuth-equivariant convolution (AeConv) and an azimuth-equivariant anchor. The sampling grid of AeConv is always in the radial direction, so it can learn azimuth-invariant BEV features. The proposed anchor enables the detection head to learn to predict azimuth-irrelevant targets. 
In addition, we introduce a camera-decoupled virtual depth to unify the depth prediction for images with different camera intrinsic parameters. The resultant detector is dubbed Azimuth-equivariant Detector (AeDet). Extensive experiments are conducted on nuScenes, and AeDet achieves 62.0% NDS, surpassing recent multi-view 3D object detectors such as PETRv2 and BEVDepth by a large margin.\n\n### Method overview\n\n![method overview](assets/overview.png)\n\n## Quick Start\n### Installation\n**Step 0.** Install [PyTorch](https://pytorch.org/) (v1.9.0).\n\n**Step 1.** Install [MMDetection3D](https://github.com/open-mmlab/mmdetection3d) (v1.0.0rc4).\n```shell\n# install mmcv\npip install mmcv-full\n\n# install mmdetection\npip install git+https://github.com/open-mmlab/mmdetection.git\n\n# install mmsegmentation\npip install git+https://github.com/open-mmlab/mmsegmentation.git\n\n# install mmdetection3d\ncd AeDet/mmdetection3d\npip install -v -e .\n```\n**Step 2.** Install requirements.\n```shell\npip install -r requirements.txt\n```\n**Step 3.** Install AeDet (GPU required).\n```shell\npython setup.py develop\n```\n\n### Data preparation\n**Step 0.** Download the official nuScenes dataset.\n\n**Step 1.** Symlink the dataset root to `./data/`.\n```\nln -s ${nuscenes root} ./data/\n```\nThe directory structure should look as follows.\n```\nAeDet\n├── data\n│   ├── nuScenes\n│   │   ├── maps\n│   │   ├── samples\n│   │   ├── sweeps\n│   │   ├── v1.0-test\n│   │   ├── v1.0-trainval\n```\n**Step 2.** Prepare infos.\n```\npython scripts/gen_info.py\n```\n**Step 3.** Prepare depth ground truth (GT).\n```\npython scripts/gen_depth_gt.py\n```\n\n### Tutorials\n**Train.**\n```\npython ${EXP_PATH} --amp_backend native -b 8 --gpus 8 [--ckpt_path ${ORIGIN_CKPT_PATH}]\n```\n**Eval.**\n```\npython ${EXP_PATH} --ckpt_path ${EMA_CKPT_PATH} -e -b 8 --gpus 8\n```\n\n## Results\nModel | Image size | #Key frames | CBGS |  mAP  |  NDS  |  Download\n--- 
|:----------:|:-----------:|:---:|:-----:|:-----:|:---:\n[BEVDepth_R50 (Baseline)](https://github.com/Megvii-BaseDetection/BEVDepth) |  256x704   |      1      |          | 0.315 | 0.367 |  --\n[BEVDepth_R50_2KEY (Baseline)](https://github.com/Megvii-BaseDetection/BEVDepth) |  256x704   |      2      |          | 0.330 | 0.442 |  --\n[AeDet_R50](exps/aedet/aedet_lss_r50_256x704_128x128_24e.py)   |  256x704   |      1      |          | 0.334 | 0.401 |  [google](https://drive.google.com/file/d/1S-NcWXs-7kTsw1qIZooGLMFw-LSBj93i/view?usp=sharing)\n[AeDet_R50_2KEY](exps/aedet/aedet_lss_r50_256x704_128x128_24e_2key.py)   |  256x704   |      2      |          | 0.359 | 0.473 |  [google](https://drive.google.com/file/d/1mExxghQJLCDiuZmYhmpozW7iv0ixmrzj/view?usp=sharing)\n[AeDet_R50_2KEY_CBGS](exps/aedet/aedet_lss_r50_256x704_128x128_20e_cbgs_2key.py)        |  256x704   |      2      |    \u0026check;     | 0.381 | 0.502 |  [google](https://drive.google.com/file/d/19r3kCHGng3rBEHsgAskCXduKnB4Yu3Fq/view?usp=sharing)\n[AeDet_R101_2KEY_CBGS](exps/aedet/aedet_lss_r101_512x1408_256x256_20e_cbgs_2key.py)        |  512x1408  |      2      |    \u0026check;     | 0.449 | 0.562 |  [google](https://drive.google.com/file/d/16L-cH7YSTyDiaGV41JM3zY9QxbxW1tTR/view?usp=sharing)\n\n## Acknowledgement\n\nThanks [BEVDepth](https://github.com/Megvii-BaseDetection/BEVDepth) team and [MMDetection3D](https://github.com/open-mmlab/mmdetection3d) team for the wonderful open source projects!\n\n## Citation\n\nIf you find AeDet useful in your research, please consider citing:\n\n```\n@inproceedings{feng2023aedet,\n    title={AeDet: Azimuth-invariant Multi-view 3D Object Detection},\n    author={Feng, Chengjian and Jie, Zequn and Zhong, Yujie and Chu, Xiangxiang and Ma, Lin},\n    booktitle={Proceedings of the IEEE/CVF Computer Vision and Pattern Recognition},\n    
year={2023}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffcjian%2FAeDet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffcjian%2FAeDet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffcjian%2FAeDet/lists"}