# SimIPU

> **SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations**
>
> Zhenyu Li, Zehui Chen, Ang Li, Liangji Fang, Qinhong Jiang, Xianming Liu, Junjun Jiang, Bolei Zhou, Hang Zhao
>
> [AAAI 2022 (arXiv pdf)](https://arxiv.org/abs/2112.04680)

## Notice
- This is a redundant-code version of SimIPU; the main code lives in `SimIPU/project_cl`.
- You can find the monocular depth estimation code [here](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/tree/main/configs/simipu). We provide detailed configs and results, including on an indoor depth dataset, which demonstrates the generalization of SimIPU.
  Since we enhanced the depth framework, model performance is stronger than reported in our paper.

## Usage

### Installation

This repo is tested with python=3.7, cuda=10.1, pytorch=1.6.0, [mmcv-full=1.3.4](https://github.com/open-mmlab/mmcv), [mmdetection=2.11.0](https://github.com/open-mmlab/mmdetection), [mmsegmentation=0.13.0](https://github.com/open-mmlab/mmsegmentation), and [mmdetection3d=0.13.0](https://github.com/open-mmlab/mmdetection3d).

Note: the latest versions of mmdetection and mmdetection3d introduced major compatibility changes and are **not** compatible with this repo. Make sure you install the versions listed above.

Follow the instructions below to install:

- **Create a conda environment**

```
conda create -n simipu python=3.7
conda activate simipu
git clone https://github.com/zhyever/SimIPU.git
cd SimIPU
```

- **Install PyTorch 1.6.0**

```
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
```

- **Install mmcv-full=1.3.4**

```
pip install mmcv-full==1.3.4 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html
```

- **Install mmdetection=2.11.0**

```
git clone https://github.com/open-mmlab/mmdetection.git
cd ./mmdetection
git checkout v2.11.0
pip install -r requirements/build.txt
pip install -v -e .
cd ..
```

- **Install mmsegmentation=0.13.0**

```
pip install mmsegmentation==0.13.0
```

- **Build SimIPU**

```
# make sure you are back in the SimIPU directory
pip install -v -e .
```

- **Others**

After building SimIPU, you may see a message that the required `future` package is missing. Install it via conda:

```
conda install future
```

### Data Preparation

Download the [KITTI](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) dataset and organize the data following the [official instructions](https://mmdetection3d.readthedocs.io/en/latest/) in mmdetection3d.
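Before generating the info files, it can help to sanity-check the directory layout. The sketch below is not part of the repo; the paths are assumptions based on the standard mmdetection3d KITTI convention:

```python
import os

# Expected KITTI layout under the repo root, per the usual mmdetection3d
# convention (these paths are an assumption for illustration).
REQUIRED_DIRS = [
    "data/kitti/ImageSets",
    "data/kitti/training/calib",
    "data/kitti/training/image_2",
    "data/kitti/training/label_2",
    "data/kitti/training/velodyne",
    "data/kitti/testing",
]

def check_kitti_layout(root="."):
    """Return the list of expected KITTI directories that are missing."""
    return [d for d in REQUIRED_DIRS if not os.path.isdir(os.path.join(root, d))]

missing = check_kitti_layout()
if missing:
    print("Missing directories:", missing)
else:
    print("KITTI layout looks complete.")
```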
Then generate the info files by running:

```
python tools/create_data.py kitti --root-path ./data/kitti --out-dir ./data/kitti --extra-tag kitti
```

If you would like to run monocular 3D detection experiments on nuScenes, follow the [official instructions](https://mmdetection3d.readthedocs.io/en/latest/) to prepare the NuScenes dataset.

For Waymo pre-training, we have no plan to release the corresponding data-preparation scripts in the short term; some of them are in `project_cl/tools/`. We currently lack the resources to reproduce the Waymo pre-training process, but since our paper describes how to prepare the Waymo dataset, feel free to contact me if you run into problems and I would be happy to help.

### Pre-training on KITTI

```
bash tools/dist_train.sh project_cl/configs/simipu/simipu_kitti.py 8 --work-dir work_dir/your/work/dir
```

### Downstream Evaluation

#### 1. Camera-LiDAR fusion based 3D object detection on the KITTI dataset
Remember to point to your pre-trained model by changing the value of the `load_from` key in the config.
```
bash tools/dist_train.sh project_cl/configs/kitti_det3d/moca_r50_kitti.py 8 --work-dir work_dir/your/work/dir
```
#### 2. Monocular 3D object detection on the nuScenes dataset
Remember to point to your pre-trained model by changing the value of the `load_from` key in the config. Before training, you also need to align the key names in `checkpoint['state_dict']`. See `project_cl/tools/convert_pretrain_imgbackbone.py` for details.
```
bash tools/dist_train.sh project_cl/configs/fcos3d_mono3d/fcos3d_r50_nus.py 8 --work-dir work_dir/your/work/dir
```
#### 3. Monocular depth estimation on the KITTI/NYU datasets
See [Depth-Estimation-Toolbox](https://github.com/zhyever/Monocular-Depth-Estimation-Toolbox/tree/main/configs/simipu).

## Pre-trained Model and Results

We provide pre-trained models. By default, "Waymo" (or "Full Waymo") denotes the Waymo dataset loaded with `load_interval=5`; we use discrete frames to ensure training variety, and previous experiments indicate that the additional improvement from `load_interval=1` is slight. Consequently, "1/10 Waymo" means 1/5 (`load_interval=5`) times 1/10 (the first 1/10 of the scenes) = 1/50 of the Waymo data.

|      | Dataset | Model |
|:----:|:-------:|:-----:|
|SimIPU|KITTI|[link](https://github.com/zhyever/SimIPU/releases/download/initial-release/SimIPU_kitti_50e.pth)|
|SimIPU|Waymo|[link](https://github.com/zhyever/SimIPU/releases/download/initial-release/SimIPU_waymo.pth)|
|SimIPU|ImageNet Sup + Waymo SimIPU|[link](https://github.com/zhyever/SimIPU/releases/download/double-finetune/SimIPU_imagesup_waymo_double_finetune.pth)|

Fusion-based 3D object detection results:

|       | AP40@Easy | AP40@Mod. | AP40@Hard | Link |
|:-----:|:---------:|:---------:|:---------:|:----:|
| MoCa  | 81.32 | 70.88 | 66.19 | [Log](https://github.com/zhyever/SimIPU/blob/main/resources/logs/moca_simipu_kitti.txt) |

Monocular 3D object detection results:

|        | Pre-train | mAP | Link |
|:------:|:---------:|:---:|:----:|
| FCOS3D | Scratch | 17.9 | [Log](https://github.com/zhyever/SimIPU/blob/main/resources/logs/fcos3d_scratch_nus.txt) |
| FCOS3D | 1/10 Waymo SimIPU | 20.3 | [Log](https://github.com/zhyever/SimIPU/blob/main/resources/logs/fcos3d_simipu_nus_abl_oneten.txt) |
| FCOS3D | 1/5 Waymo SimIPU | 22.5 | [Log](https://github.com/zhyever/SimIPU/blob/main/resources/logs/fcos3d_simipu_nus_abl_onefive.txt) |
| FCOS3D | 1/2 Waymo SimIPU | 24.7 | [Log](https://github.com/zhyever/SimIPU/blob/main/resources/logs/fcos3d_scratch_nus.txt) |
| FCOS3D | Full Waymo SimIPU | 26.2 | Log |
| FCOS3D | ImageNet Sup | 27.7 | [Log](https://github.com/zhyever/SimIPU/blob/main/resources/logs/fcos3d_imgnet_nus.txt) |
| FCOS3D | ImageNet Sup + Full Waymo SimIPU | 28.4 | Log |
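The key alignment required for the FCOS3D experiments is handled by `project_cl/tools/convert_pretrain_imgbackbone.py`. A minimal sketch of the idea follows; the `img_backbone.` prefix and the file names are assumptions for illustration, not the script's actual contents:

```python
# Hypothetical sketch of aligning pre-trained checkpoint keys so a
# downstream image backbone can load them. The real logic lives in
# project_cl/tools/convert_pretrain_imgbackbone.py.
def convert_state_dict(state_dict, src_prefix="img_backbone.", dst_prefix="backbone."):
    """Keep only keys under src_prefix, renamed to start with dst_prefix."""
    out = {}
    for key, value in state_dict.items():
        if key.startswith(src_prefix):
            out[dst_prefix + key[len(src_prefix):]] = value
    return out

if __name__ == "__main__":
    import torch
    # File names below are placeholders.
    ckpt = torch.load("SimIPU_waymo.pth", map_location="cpu")
    ckpt["state_dict"] = convert_state_dict(ckpt["state_dict"])
    torch.save(ckpt, "SimIPU_waymo_imgbackbone.pth")
```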

## Citation
If you find our work useful for your research, please consider citing the paper:
```
@article{li2021simipu,
  title={SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations},
  author={Li, Zhenyu and Chen, Zehui and Li, Ang and Fang, Liangji and Jiang, Qinhong and Liu, Xianming and Jiang, Junjun and Zhou, Bolei and Zhao, Hang},
  journal={arXiv preprint arXiv:2112.04680},
  year={2021}
}
```