{"id":18936209,"url":"https://github.com/aim-uofa/poseur","last_synced_at":"2025-06-22T06:36:07.601Z","repository":{"id":47369373,"uuid":"515588371","full_name":"aim-uofa/Poseur","owner":"aim-uofa","description":"[ECCV 2022] The official repo for the paper \"Poseur: Direct Human Pose Regression with Transformers\".","archived":false,"fork":false,"pushed_at":"2023-11-10T07:16:37.000Z","size":12245,"stargazers_count":181,"open_issues_count":8,"forks_count":14,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-04-11T20:32:45.899Z","etag":null,"topics":["coco-wholebody","human-pose-estimation","human36m","vision-transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aim-uofa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-07-19T13:05:22.000Z","updated_at":"2025-02-14T05:04:29.000Z","dependencies_parsed_at":"2025-04-11T20:42:54.058Z","dependency_job_id":null,"html_url":"https://github.com/aim-uofa/Poseur","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/aim-uofa/Poseur","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aim-uofa%2FPoseur","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aim-uofa%2FPoseur/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aim-uofa%2FPoseur/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aim-uofa%2FPoseur/manif
ests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aim-uofa","download_url":"https://codeload.github.com/aim-uofa/Poseur/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aim-uofa%2FPoseur/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261249129,"owners_count":23130492,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["coco-wholebody","human-pose-estimation","human36m","vision-transformers"],"created_at":"2024-11-08T12:06:23.060Z","updated_at":"2025-06-22T06:36:02.579Z","avatar_url":"https://github.com/aim-uofa.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Poseur: Direct Human Pose Regression with Transformers\n\n\n\u003e [**Poseur: Direct Human Pose Regression with Transformers**](https://arxiv.org/pdf/2201.07412.pdf),            \n\u003e Weian Mao\\*, Yongtao Ge\\*, Chunhua Shen, Zhi Tian, Xinlong Wang, Zhibin Wang, Anton van den Hengel  \n\u003e In: European Conference on Computer Vision (ECCV), 2022   \n\u003e *arXiv preprint ([arXiv 2201.07412](https://arxiv.org/pdf/2201.07412))*  \n\u003e (\\* equal contribution)\n\n## News :triangular_flag_on_post:\n[2023/04/17] Release a naive version of Poseur based on ViT backbone. Please see [poseur_vit_base_coco_256x192](configs/poseur/coco/poseur_vit_base_coco_256x192.py).\n\n[2023/04/17] Release a naive version of Poseur trained on COCO-Wholebody dataset. 
Please see [poseur_coco_wholebody](configs/poseur/coco_wholebody/).\n\n# Introduction\nThis project is built upon [MMPose](https://github.com/open-mmlab/mmpose) with commit ID [eeebc652842a9724259ed345c00112641d8ee06d](https://github.com/open-mmlab/mmpose/commit/eeebc652842a9724259ed345c00112641d8ee06d).\n\n# Installation \u0026 Quick Start\n1. Install the following packages:\n```\npip install easydict einops\n```\n2. Follow the [MMPose instruction](mmpose_README.md) to install the project and set up the datasets (MS-COCO).\n\nFor training on COCO, run:\n```\n./tools/dist_train.sh \\\nconfigs/poseur/coco/poseur_res50_coco_256x192.py 8 \\\n--work-dir work_dirs/poseur_r50_coco_256x192\n```\n\nFor evaluating on COCO, run the following commands:\n```\nwget https://cloudstor.aarnet.edu.au/plus/s/UXr1Dn9w6ja4fM9/download -O poseur_256x192_res50_6dec_coco.pth\n./tools/dist_test.sh configs/poseur/coco/poseur_res50_coco_256x192.py \\\n    poseur_256x192_res50_6dec_coco.pth 4 \\\n    --eval mAP \\\n    --cfg-options model.filp_fuse_type=\\'type2\\'\n```\n\nFor visualizing on COCO, run the following command:\n```\npython demo/top_down_img_demo.py \\\n    configs/poseur/coco/poseur_res50_coco_256x192.py \\\n    poseur_256x192_res50_6dec_coco.pth \\\n    --img-root tests/data/coco/ --json-file tests/data/coco/test_coco.json \\\n    --out-img-root vis_results_poseur\n```\n\n## COCO Keypoint Detection\n\nName | AP | AP.5| AP.75 |download link\n--- |:---:|:---:|:---:|:---:\n[poseur_mobilenetv2_coco_256x192](configs/poseur/coco/poseur_mobilenetv2_coco_256x192.py)| 71.9  | 88.9 |78.6 | [model](https://pan.baidu.com/s/1FZMjT3tN9tV0jYcLfkTlhQ?pwd=x3pu)\n[poseur_mobilenetv2_coco_256x192_12dec](configs/poseur/coco/poseur_mobilenetv2_coco_256x192_12dec.py)| 72.3  | 88.9 |78.9 | [model](https://pan.baidu.com/s/1UiXzMCOMHWXahi54-gM-hw?pwd=6asw)\n[poseur_res50_coco_256x192](configs/poseur/coco/poseur_res50_coco_256x192.py)| 75.5  | 90.7 |82.6 | 
[model](https://pan.baidu.com/s/1Cd4gaIHuZJSpkG5PNaBVoQ?pwd=ir6u)\n[poseur_hrnet_w32_coco_256x192](configs/poseur/coco/poseur_hrnet_w32_coco_256x192.py)| 76.8  | 91.0 |83.5 | [model](https://pan.baidu.com/s/1c8UBO-Qu1qomJpCae1_hsQ?pwd=tszp)\n[poseur_hrnet_w48_coco_384x288](configs/poseur/coco/poseur_hrnet_w48_coco_384x288.py)| 78.7  | 91.6 |85.1 | [model](https://pan.baidu.com/s/1lcqkpp4QBezfOlpObj8XWA?pwd=ep8r)\n[poseur_hrformer_tiny_coco_256x192_3dec](configs/poseur/coco/poseur_hrformer_tiny_coco_256x192_3dec.py)| 74.2  | 90.1 |81.4 | [model](https://pan.baidu.com/s/1dwyBXnB3vMnjv1puMQzKWg?pwd=zmei)\n[poseur_hrformer_small_coco_256x192_3dec](configs/poseur/coco/poseur_hrformer_small_coco_256x192_3dec.py)| 76.6  | 91.0 |83.4 | [model](https://pan.baidu.com/s/1ELLvGxzHzmSguOoY5jZI1Q?pwd=3tk8)\n[poseur_hrformer_big_coco_256x192](configs/poseur/coco/poseur_hrformer_big_coco_256x192.py)| 78.9  | 91.9 |85.6 | [model](https://pan.baidu.com/s/1gah8xxIJI4P4MJcpTgLBBA?pwd=yqhb)\n[poseur_hrformer_big_coco_384x288](configs/poseur/coco/poseur_hrformer_big_coco_384x288.py)| 79.6  | 92.1 |85.9 | [model](https://pan.baidu.com/s/1NxH4umpyP8M8CneDEizvrQ?pwd=msh8)\n[poseur_vit_base_coco_256x192](configs/poseur/coco/poseur_vit_base_coco_256x192.py)| 76.7  | 90.6 |83.5 | [model](https://pan.baidu.com/s/184gXXjv-pVYak605-qIs2A?pwd=ytj8)\n\n\n## COCO-WholeBody Benchmark (V0.5)\n\nCompare Whole-body pose estimation results with other methods.\n\n|Method           |  body |       | foot  |       | face  |       |  hand |       | whole |       |\n|-----------------| ------| ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | ----- | \n|                 |  AP   | AR    | AP    | AR    |  AP   | AR    | AP    | AR    | AP    | AR    |\n|OpenPose [1]     | 0.563 | 0.612 | 0.532 | 0.645 | 0.482 | 0.626 | 0.198 | 0.342 | 0.338 | 0.449 |\n|HRNet [2]        | 0.659 | 0.709 | 0.314 | 0.424 | 0.523 | 0.582 | 0.300 | 0.363 | 0.432 | 0.520 |\n|HRNet-body [2]   | 0.758 | 0.809 |   -   |   
-   |   -   |   -   |   -   |   -   |   -   |   -   |\n|ZoomNet [3]      | 0.743 | 0.802 | 0.798 | 0.869 | 0.623 | 0.701 | 0.401 | 0.498 | 0.541 | 0.658 |\n|ZoomNas [4]      | 0.740 |  -     | 0.617 |   -    | 0.889 |    -   | 0.625 |   -    | 0.654 |  -   |\n|RTMPose [5]      | 0.730 |   -    | 0.734 |   -    | 0.898 |    -   | 0.587 |   -    | 0.669 |  -   |\n|Poseur_ResNet50  | 0.655 | 0.732 | 0.615 | 0.742 | 0.844 | 0.900 | 0.560 | 0.673 | 0.587 | 0.681 |\n|Poseur_HRNet_W32 | 0.680 | 0.753 | 0.668 | 0.780 | 0.863 | 0.912 | 0.604 | 0.706 | 0.620 | 0.707 |\n|Poseur_HRNet_W48 | 0.692 | 0.766 | 0.689 | 0.799 | 0.861 | 0.911 | 0.621 | 0.721 | 0.633 | 0.721 |\n\n### COCO-WholeBody Pretrained Models\n\nName | AP | AP.5| AP.75 |download link\n--- |:---:|:---:|:---:|:---:\n[poseur_res50_coco_wholebody_256x192](configs/poseur/coco_wholebody/res50_coco_wholebody_256x192_poseur.py)| 65.5 | 85.0 | 71.8 | [model](https://pan.baidu.com/s/1p8M4EW3WkMOhX3Yjxf7l_w?pwd=m3qx)\n[poseur_hrnet_w32_coco_wholebody_256x192](configs/poseur/coco_wholebody/hrnet_w32_coco_wholebody_256x192_poseur.py)| 68.0  | 85.8 | 74.4 | [model](https://pan.baidu.com/s/1XslfU6iXqnu7W19u_o3R2Q?pwd=dgsh)\n[poseur_hrnet_w48_coco_wholebody_256x192](configs/poseur/coco_wholebody/hrnet_w48_coco_wholebody_256x192_poseur.py)| 69.2  | 86.0 | 75.7 | [model](https://pan.baidu.com/s/1ru4t45OD6v_F1qBLtL22FA?pwd=hgr4)\n\n\n*Disclaimer:*\n\n- Because MMPose has been updated since publication, the results differ slightly from those in our original paper.\n- We use the official HRFormer implementation from [here](https://github.com/HRNet/HRFormer/tree/main/pose); the implementation in MMPose has not been verified by us.\n\n# Citations\nPlease consider citing our paper in your publications if the project helps your research. 
The BibTeX reference is as follows.\n```BibTeX\n@inproceedings{mao2022poseur,\n  title={Poseur: Direct human pose regression with transformers},\n  author={Mao, Weian and Ge, Yongtao and Shen, Chunhua and Tian, Zhi and Wang, Xinlong and Wang, Zhibin and Hengel, Anton van den},\n  booktitle = {Proceedings of the European Conference on Computer Vision {(ECCV)}},\n  month = {October},\n  year={2022}\n}\n```\n\n## References\n```\n[1] Andriluka, M., Pishchulin, L., Gehler, P., Schiele, B.: 2D human pose estimation: New benchmark and state of the art analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014)\n[2] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. arXiv preprint arXiv:1902.09212 (2019)\n[3] Sheng Jin, Lumin Xu, Jin Xu, Can Wang, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo: Whole-Body Human Pose Estimation in the Wild. In: ECCV (2020)\n[4] Lumin Xu, Sheng Jin, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang: ZoomNAS: Searching for Whole-body Human Pose Estimation in the Wild. In: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) (2022)\n[5] Tao Jiang, Peng Lu, Li Zhang, Ningsheng Ma, Rui Han, Chengqi Lyu, Yining Li, Kai Chen: RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose. arXiv preprint arXiv:2303.07399 (2023)\n```\n\n## License\n\nFor commercial use, please contact [Chunhua Shen](mailto:chhshen@gmail.com).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faim-uofa%2Fposeur","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faim-uofa%2Fposeur","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faim-uofa%2Fposeur/lists"}