{"id":13443976,"url":"https://github.com/Banconxuan/RTM3D","last_synced_at":"2025-03-20T17:32:37.690Z","repository":{"id":41283110,"uuid":"232989387","full_name":"Banconxuan/RTM3D","owner":"Banconxuan","description":"The official PyTorch Implementation of RTM3D and KM3D for Monocular 3D Object Detection","archived":false,"fork":false,"pushed_at":"2020-12-30T07:28:35.000Z","size":5747,"stargazers_count":453,"open_issues_count":51,"forks_count":85,"subscribers_count":46,"default_branch":"master","last_synced_at":"2024-10-28T07:42:13.839Z","etag":null,"topics":["3d-object-detection","anchor-free","centernet","geometric-constraints","keypoint-detection","kitti-detection","real-time"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Banconxuan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-01-10T07:20:11.000Z","updated_at":"2024-10-04T11:39:49.000Z","dependencies_parsed_at":"2022-07-06T12:05:09.268Z","dependency_job_id":null,"html_url":"https://github.com/Banconxuan/RTM3D","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Banconxuan%2FRTM3D","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Banconxuan%2FRTM3D/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Banconxuan%2FRTM3D/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Banconxuan%2FRTM3D/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Banconxuan","download_url":"https://
codeload.github.com/Banconxuan/RTM3D/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244660753,"owners_count":20489389,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d-object-detection","anchor-free","centernet","geometric-constraints","keypoint-detection","kitti-detection","real-time"],"created_at":"2024-07-31T03:02:15.556Z","updated_at":"2025-03-20T17:32:32.681Z","avatar_url":"https://github.com/Banconxuan.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"## RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving\n## Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training (KM3D)\n\nRTM3D (ECCV 2020) and KM3D (namely RTM3D++) are efficient and accurate monocular 3D object detection methods for autonomous driving.\n\nWe replaced the post-processing of RTM3D with KM3D's Geometric Reasoning Module (GRM) to increase inference speed. \n[**KM3D**](https://arxiv.org/abs/2009.00764), [**RTM3D**](https://arxiv.org/abs/2001.03343)\n\n## Introduction\nRTM3D is a novel one-stage, keypoint-based framework for monocular 3D object detection. RTM3D is the first real-time system (FPS\u003e24) for monocular image 3D detection while\nachieving state-of-the-art performance on the KITTI benchmark.\nKM3D reformulates the geometric constraints in a differentiable form and embeds them into the network to reduce running time while maintaining the consistency\nof model outputs in an end-to-end fashion. 
KM3D achieves 46 FPS and SOTA performance on the KITTI benchmark.\nRTM3D and KM3D require only RGB images, without synthetic data, instance segmentation, CAD models, or a depth generator.\n\n## Highlights\n- **Fast:** 47 FPS single-image test speed on the KITTI benchmark at 384*1280 resolution\n- **Accurate:** SOTA on the KITTI benchmark.\n- **Anchor Free:** No 2D or 3D anchors are required\n- **Differentiable geometric reasoning module:** Improves running efficiency and jointly optimizes the outputs of\nthe network, combining the strengths of both CNNs and\ngeometric constraints.\n- **Easy to deploy:** RTM3D and KM3D use only conventional convolution and upsampling operations, and the geometry module only needs to solve an SVD, so they are very easy to deploy and accelerate.\n## KM3D Baseline and Model Zoo\nAll experiments were run on Ubuntu 16.04, PyTorch 1.0.0, CUDA 9.0, Python 3.6, and a single NVIDIA 1080Ti\n\nIoU Setting 1: Car IoU \u003e 0.5, Pedestrian IoU \u003e 0.25, Cyclist IoU \u003e 0.25\n\nIoU Setting 2: Car IoU \u003e 0.7, Pedestrian IoU \u003e 0.5, Cyclist IoU \u003e 0.5\n\n- Training on KITTI train split and evaluation on val split.\n    - Backbone: ResNet-18\n    - FPS: 46.7 \n    - Model: ([Google Drive](https://drive.google.com/file/d/14ww6mxtitO9aDszZN3ai8N7U1doehvi8/view?usp=sharing)), ([Baidu Cloud](https://pan.baidu.com/s/1zt-O6UzcBVGF-6vg5LzGpA) extraction code: 60ks) \n    \n| Class      |AP BEV IoU Setting1      | AP 3D IoU Setting1     |AP BEV IoU Setting2      | AP 3D IoU Setting2     |\n| :----:     | :----:                  | :----:                 |:----:                   | :----:                 |\n| -          | Easy / Moderate / Hard  | Easy / Moderate / Hard | Easy / Moderate / Hard  | Easy / Moderate / Hard |\n| Car        | 55.65, 40.95, 35.61     | 49.10, 35.75, 32.27    | 23.83, 17.94, 16.98     | 17.51, 13.99, 12.73    |\n| Pedestrian | 22.35, 18.50, 17.64     | 21.68, 18.13, 16.95    | 4.50, 3.87, 3.92        | 3.62, 3.75, 3.03       | \n| Cyclist    | 
21.25, 15.12, 14.80     | 21.04, 14.77, 14.65    | 10.70, 9.09, 9.09       | 10.01, 9.09, 9.09      | \n\n- Training on KITTI train split and evaluation on val split.\n    - Backbone: DLA-34\n    - FPS: 28.6\n    - Model: ([Google Drive](https://drive.google.com/file/d/16IjRxXtGfS1eDv9IeDZkJUUjx4olEYnK/view?usp=sharing)), ([Baidu Cloud](https://pan.baidu.com/s/1pjr-WDY256xBBusULjqL8A) extraction code: 1h6s) \n    \n| Class      |AP BEV IoU Setting1      | AP 3D IoU Setting1     |AP BEV IoU Setting2      | AP 3D IoU Setting2     |\n| :----:     | :----:                  | :----:                 |:----:                   | :----:                 |\n| -          | Easy / Moderate / Hard  | Easy / Moderate / Hard | Easy / Moderate / Hard  | Easy / Moderate / Hard |\n| Car        | 60.98, 45.74, 42.93     | 54.97, 42.68, 36.95    | 25.96, 21.88, 18.88     | 19.19, 16.70, 16.14    |\n| Pedestrian | 30.38, 26.09, 23.80     | 28.63, 25.09, 20.14    | 11.55, 11.23, 10.76     | 11.37, 10.85, 10.11    | \n| Cyclist    | 28.69, 18.77, 18.03     | 27.68, 18.30, 17.74    | 9.67, 6.12, 6.21        | 9.14, 5.97, 5.86       | \n\n- Training on KITTI train split with right-image augmentation and evaluation on val split.\n    - Backbone: ResNet-18\n    - FPS: 46.7\n    - Model: ([Google Drive](https://drive.google.com/file/d/1svqj6ef79bzkiwuNIzpiLw_inDjJnSUZ/view?usp=sharing)), ([Baidu Cloud](https://pan.baidu.com/s/1gcAe2t3vmtWaST3tZPHUrg) extraction code: sr23)\n    \n| Class      |AP BEV IoU Setting1      | AP 3D IoU Setting1     |AP BEV IoU Setting2      | AP 3D IoU Setting2     |\n| :----:     | :----:                  | :----:                 |:----:                   | :----:                 |\n| -          | Easy / Moderate / Hard  | Easy / Moderate / Hard | Easy / Moderate / Hard  | Easy / Moderate / Hard |\n| Car        | 53.79, 39.83, 34.86     | 47.54, 34.97, 31.77    | 25.03, 18.53, 17.45     | 17.50, 14.06, 12.62    |\n| Pedestrian | 23.15, 19.29, 18.25     | 22.33, 18.84, 17.63    | 6.21, 
6.13, 5.53        | 5.19, 5.32, 4.55       | \n| Cyclist    | 19.49, 12.43, 12.28     | 19.53, 12.43, 12.28    | 10.77, 9.58, 9.59       | 10.33, 9.09, 9.09      | \n\n- Training on KITTI train split with right-image augmentation and evaluation on val split.\n    - Backbone: DLA-34\n    - FPS: 28.6\n    - Model: ([Google Drive](https://drive.google.com/file/d/1oVroM_VOdxvR4qkWe40T2rtahhA795h0/view?usp=sharing)), ([Baidu Cloud](https://pan.baidu.com/s/1rT46n6fajVQ_19gtkaXU4w) extraction code: qqk6) \n    \n| Class      |AP BEV IoU Setting1      | AP 3D IoU Setting1     |AP BEV IoU Setting2      | AP 3D IoU Setting2     |\n| :----:     | :----:                  | :----:                 |:----:                   | :----:                 |\n| -          | Easy / Moderate / Hard  | Easy / Moderate / Hard | Easy / Moderate / Hard  | Easy / Moderate / Hard |\n| Car        | 63.23, 50.35, 44.56     | 59.10, 44.23, 38.04    | 30.05, 23.07, 21.86     | 22.29, 17.45, 16.86    |\n| Pedestrian | 32.42, 27.20, 21.51     | 31.86, 26.75, 21.33    | 14.73, 12.54, 11.74     | 12.92, 11.62, 11.06    | \n| Cyclist    | 34.64, 21.98, 22.07     | 34.01, 21.73, 19.68    | 16.89, 11.18, 10.24     | 14.35, 9.42, 9.25      | \n\n\n## Installation\nPlease refer to [INSTALL.md](readme/INSTALL.md)\n## Dataset preparation\nPlease download the official [KITTI 3D object detection](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) dataset and organize the downloaded files as follows: \n```\nKM3DNet\n├── kitti_format\n│   ├── data\n│   │   ├── kitti\n│   │   │   ├── annotations \n│   │   │   ├── calib /000000.txt .....\n│   │   │   ├── image(left[0-7480] right[7481-14961] input augmentation)\n│   │   │   ├── label /000000.txt .....\n│   │   │   ├── train.txt val.txt trainval.txt\n├── src\n├── demo_kitti_format\n├── readme\n├── requirements.txt\n``` \n## Quick Demo\nPlease refer to [DEMO.md](readme/DEMO.md) for a quick demo to test with a pretrained model and visualize the predicted results 
on your custom data or the original KITTI data.\n\n## Getting Started\nPlease refer to [GETTING_STARTED.md](readme/GETTING_STARTED.md) to learn more about using this project.\n\n## Acknowledgement\n- [**CenterNet**](https://github.com/xingyizhou/CenterNet)\n## License\n\nRTM3D and KM3D are released under the MIT License (refer to the LICENSE file for details).\nPortions of the code are borrowed from [CenterNet](https://github.com/xingyizhou/CenterNet), [dla](https://github.com/ucbdrive/dla) (DLA network), [DCNv2](https://github.com/CharlesShang/DCNv2) (deformable convolutions), [iou3d](https://github.com/sshaoshuai/PointRCNN) and [kitti_eval](https://github.com/prclibo/kitti_eval) (KITTI dataset evaluation). Please refer to the original licenses of these projects (see [NOTICE](NOTICE)).\n## Citation\n\nIf you find this project useful for your research, please use the following BibTeX entry.\n\n    @misc{2009.00764,\n    Author = {Peixuan Li},\n    Title = {Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training},\n    Year = {2020},\n    Eprint = {arXiv:2009.00764},\n    }\n    @misc{2001.03343,\n    Author = {Peixuan Li and Huaici Zhao and Pengfei Liu and Feidao Cao},\n    Title = {RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving},\n    Year = {2020},\n    Eprint = {arXiv:2001.03343},\n    }\n    ","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBanconxuan%2FRTM3D","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FBanconxuan%2FRTM3D","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBanconxuan%2FRTM3D/lists"}