{"id":13440013,"url":"https://github.com/qq456cvb/CPPF","last_synced_at":"2025-03-20T09:31:35.544Z","repository":{"id":70755666,"uuid":"466698812","full_name":"qq456cvb/CPPF","owner":"qq456cvb","description":"CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild (CVPR2022)","archived":false,"fork":false,"pushed_at":"2024-07-26T20:29:39.000Z","size":1113,"stargazers_count":48,"open_issues_count":1,"forks_count":9,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-08-01T03:30:22.740Z","etag":null,"topics":["cvpr2022","deep-learning","pose-estimation","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/qq456cvb.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-03-06T10:04:11.000Z","updated_at":"2024-07-26T20:29:42.000Z","dependencies_parsed_at":null,"dependency_job_id":"999b85a8-98b4-48d1-8fa7-737bcd572c29","html_url":"https://github.com/qq456cvb/CPPF","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qq456cvb%2FCPPF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qq456cvb%2FCPPF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qq456cvb%2FCPPF/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qq456cvb%2FCPPF/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/qq456cvb","download_url":"http
s://codeload.github.com/qq456cvb/CPPF/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":221745211,"owners_count":16873738,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cvpr2022","deep-learning","pose-estimation","pytorch"],"created_at":"2024-07-31T03:01:18.993Z","updated_at":"2025-03-20T09:31:35.538Z","avatar_url":"https://github.com/qq456cvb.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook"],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003e\nCPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild\n\u003c/h1\u003e\n\n\u003cp align='center'\u003e\n\u003cimg align=\"center\" src='images/intro.jpg' width='70%'\u003e \u003c/img\u003e\n\u003c/p\u003e\n\n\u003cdiv align=\"center\"\u003e\n\u003ch3\u003e\n\u003ca href=\"https://qq456cvb.github.io\"\u003eYang You\u003c/a\u003e, \u003ca href=\"https://rshi.top/\"\u003eRuoxi Shi\u003c/a\u003e, Weiming Wang, \u003ca href=\"https://www.mvig.org/\"\u003eCewu Lu\u003c/a\u003e\n\u003cbr\u003e\n\u003cbr\u003e\nCVPR 2022\n\u003cbr\u003e\n\u003cbr\u003e\n\u003ca href='https://arxiv.org/pdf/2203.03089.pdf'\u003e\n  \u003cimg src='https://img.shields.io/badge/Paper-PDF-orange?style=flat\u0026logo=arxiv\u0026logoColor=orange' alt='Paper PDF'\u003e\n\u003c/a\u003e\n\u003ca href='https://qq456cvb.github.io/projects/cppf'\u003e\n  \u003cimg src='https://img.shields.io/badge/Project-Page-green?style=flat\u0026logo=googlechrome\u0026logoColor=green' alt='Project Page'\u003e\n\u003c/a\u003e\n\u003ca 
href='https://youtu.be/MbR3Lq1kJaM'\u003e\n\u003cimg src='https://img.shields.io/badge/Youtube-Video-red?style=flat\u0026logo=youtube\u0026logoColor=red' alt='Video'/\u003e\n\u003c/a\u003e\n\u003cbr\u003e\n\u003c/h3\u003e\n\u003c/div\u003e\n \n  CPPF is a pure sim-to-real method that achieves 9D pose estimation in the wild. Our model is trained solely on ShapeNet synthetic models (without any real-world background pasting), and can be directly applied to real-world scenarios (e.g., NOCS REAL275 and SUN RGB-D). CPPF achieves this by using only local $SE(3)$-invariant geometric features and a bottom-up voting scheme, which is quite different from previous end-to-end learning methods. Our model is robust to noise, and can obtain decent predictions even if only bounding box masks are provided.\n\n# News\n- **[2024.07]** Check out our new object pose estimation benchmark **[PACE](https://github.com/qq456cvb/PACE)** at *ECCV 2024*.\n- **[2024.04]** Check out our [CPPF++](https://github.com/qq456cvb/CPPF2) (TPAMI) for even **better results in the wild**!\n- ![cppf++](https://github.com/qq456cvb/CPPF2/blob/main/teaser.gif)\n- **[2022.03]** Our other detection-by-voting method, [Canonical Voting](https://github.com/qq456cvb/CanonicalVoting), which achieves SoTA on ScanNet, SceneNN, and SUN RGB-D, has been accepted to CVPR 2022.\n\n# Change Logs\n- [2022.05.05] Fixed a problem in the scale target computation.\n\n# Contents\n- [Overview](#overview)\n- [Installation](#installation)\n- [Train on ShapeNet Objects](#train-on-shapenet-objects)\n- [Pretrained Models](#pretrained-models)\n- [Test on NOCS REAL275](#test-on-nocs-real275)\n- [Test on SUN RGB-D](#test-on-sun-rgb-d)\n- [Train on Your Own Object Collections](#train-on-your-own-object-collections)\n- [Citation](#citation)\n# Overview\n\nThis is the official code implementation of CPPF, including both training and testing. 
Inference on custom datasets is also supported.\n  \n# Installation\nYou can run the following commands to set up an environment, tested on Ubuntu 18.04:\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eCreate Conda Env\u003c/b\u003e\u003c/summary\u003e\n\n```\nconda create -n cppf python=3.8\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eInstall PyTorch\u003c/b\u003e\u003c/summary\u003e\n\n```\nconda install pytorch torchvision cudatoolkit=10.2 -c pytorch-lts\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eInstall Other Dependencies\u003c/b\u003e\u003c/summary\u003e\n\n```\npip install tqdm opencv-python scipy matplotlib open3d==0.12.0 hydra-core pyrender cupy-cuda102 PyOpenGL-accelerate OpenEXR\nCXX=g++-7 CC=gcc-7 pip install MinkowskiEngine==0.5.4 -v\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eMiscellaneous\u003c/b\u003e\u003c/summary\u003e\n\nNote that we use pyrender with OSMesa support. You may need to install OSMesa after running ```pip install pyrender```; more details can be found [here](https://pyrender.readthedocs.io/en/latest/install/index.html).\n\n``MinkowskiEngine`` appends its package path to ``sys.path`` (a.k.a. PYTHONPATH), which includes a module named ``utils``. 
To avoid a conflict with our own ``utils`` package, you should import ``MinkowskiEngine`` after importing ``utils``.\n\u003c/details\u003e\n\n# Train on ShapeNet Objects\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eData Preparation\u003c/b\u003e\u003c/summary\u003e\n\nDownload the [ShapeNet v2](https://shapenet.org/) dataset and modify the ``shapenet_root`` key in ``config/config.yaml`` to point to the location of the dataset.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eTrain on NOCS REAL275 objects\u003c/b\u003e\u003c/summary\u003e\n\nTo train on synthetic ShapeNet objects that appear in NOCS REAL275, run:\n```\npython train.py category=bottle,bowl,camera,can,laptop,mug -m\n```\n\nFor laptops, an auxiliary segmentation is needed to ensure a unique pose. Please refer to \u003ca href='#laptop-aux'\u003eAuxiliary Segmentation for Laptops\u003c/a\u003e.\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eTrain on SUN RGB-D objects\u003c/b\u003e\u003c/summary\u003e\n\nTo train on synthetic ShapeNet objects that appear in SUN RGB-D, run:\n```\npython train.py category=bathtub,bed,bookshelf,chair,sofa,table -m\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary id='laptop-aux'\u003e\u003cb\u003eAuxiliary Segmentation for Laptops\u003c/b\u003e\u003c/summary\u003e\n\nFor laptops, geometry alone cannot determine the pose unambiguously, so we rely on an auxiliary segmentation network that segments out the lid and the keyboard base.\n\nTo train the segmentation network, first download our physically rendered Blender laptop images from [Google Drive](https://drive.google.com/file/d/1gRHGt47nP9arDAu3hwnDNgfwJMxJYtCa/view?usp=sharing) and place them under ``data/laptop``. 
Then run the following command:\n```\npython train_laptop_aux.py\n```\n\u003c/details\u003e\n\n\n# Pretrained Models\nPretrained models for various ShapeNet categories can be downloaded from [Google Drive](https://drive.google.com/drive/folders/11wm5WHDjmSBfhng6emxCBBYZexmLoxLk?usp=sharing).\n# Test on NOCS REAL275\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eData Preparation\u003c/b\u003e\u003c/summary\u003e\n\nFirst download the detection priors from [Google Drive](https://drive.google.com/file/d/1cvGiXG_2ya8CMHss1IDobdL81qeODOrE/view?usp=sharing), which are used for evaluation with instance segmentation or bounding box masks. Put the directory under ``data/nocs_seg``.\n\nThen download the RGB-D images from the [NOCS REAL275](http://download.cs.stanford.edu/orion/nocs/real_test.zip) dataset and put them under ``data/nocs``.\n\nPlace (pre-)trained models under ``checkpoints``.\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eEvaluate with Instance Segmentation Mask\u003c/b\u003e\u003c/summary\u003e\n\nFirst save inference outputs:\n```\npython nocs/inference.py --adaptive_voting\n``` \n\nThen evaluate mAP: \n```\npython nocs/eval.py | tee nocs/map.txt\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eEvaluate with Bounding Box Mask\u003c/b\u003e\u003c/summary\u003e\n\nFirst save inference outputs with bounding box mask enabled:\n```\npython nocs/inference.py --bbox_mask --adaptive_voting\n``` \n\nThen evaluate mAP: \n```\npython nocs/eval.py | tee nocs/map_bbox.txt\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eZero-Shot Instance Segmentation and Pose Estimation\u003c/b\u003e\u003c/summary\u003e\n\nFor this task, due to memory limitations, we use the regression-based network. 
You can go through the process by running the Jupyter notebook ``nocs/zero_shot.ipynb``.\n\n\u003c/details\u003e\n\n# Test on SUN RGB-D\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eData Preparation\u003c/b\u003e\u003c/summary\u003e\n\nWe follow the same data preparation process as in [VoteNet](https://github.com/facebookresearch/votenet/blob/main/sunrgbd/README.md). You need to first download [SUNRGBD v2 data](http://rgbd.cs.princeton.edu/data/) (``SUNRGBD.zip``, ``SUNRGBDMeta2DBB_v2.mat``, ``SUNRGBDMeta3DBB_v2.mat``) and the toolkit (``SUNRGBDtoolbox.zip``). Move all the downloaded files under ``data/OFFICIAL_SUNRGBD``. Unzip the zip files.\n\nDownload the prepared extra data for SUN RGB-D from [Google Drive](https://drive.google.com/drive/folders/1FSn8j2wIq1VDm5FQNBKuKZ5Wx2G0Ox0S?usp=sharing), and move it under ``data/sunrgbd_extra``. Unzip the zip files.\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eEvaluate with Instance Segmentation Mask\u003c/b\u003e\u003c/summary\u003e\n\nFirst save inference outputs:\n```\npython sunrgbd/inference.py\n``` \n\nThen evaluate mAP: \n```\npython sunrgbd/eval.py | tee sunrgbd/map.txt\n```\n\u003c/details\u003e\n\n# Train on Your Own Object Collections\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eConfiguration Explained\u003c/b\u003e\u003c/summary\u003e\n\nTo train on custom objects, it is necessary to understand some parameters in the configuration files.\n- **up_sym**: Whether the objects look like a cylinder from top to bottom (e.g., bottles). This is to ensure the voting target is unambiguous.\n- **right_sym**: Whether the objects look like a cylinder from left to right (e.g., rolls). This is to ensure the voting target is unambiguous.\n- **regress_right**: Whether to predict the right axis. 
Some symmetric objects have only a well-defined up axis (e.g., bowls, bottles), while others also have a well-defined right axis (e.g., laptops, mugs).\n- **z_right**: Whether the objects are placed such that the right axis is [0, 0, 1] (default: [1, 0, 0]).\n\u003c/details\u003e\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eVoting Statistics Generation\u003c/b\u003e\u003c/summary\u003e\n\nNext, we need to know the ``scale_range`` (used for data augmentation; it controls the possible object scales along the diagonal), ``vote_range`` (the range for center voting targets $\\mu,\\nu$), and ``scale_mean`` (the average 3D scale, used for scale voting). To generate them, you may refer to ``gen_stats.py``.\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eWrite Configuration Files and Train\u003c/b\u003e\u003c/summary\u003e\n\nAfter you prepare the necessary configurations and voting statistics, you can write your own configuration file similar to those in ``config/category``, and then run ``train.py``.\n\u003c/details\u003e\n\n# Citation\n```\n@inproceedings{you2022cppf,\n  title={CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild},\n  author={You, Yang and Shi, Ruoxi and Wang, Weiming and Lu, Cewu},\n  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},\n  year={2022}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqq456cvb%2FCPPF","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqq456cvb%2FCPPF","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqq456cvb%2FCPPF/lists"}