{"id":21711940,"url":"https://github.com/hrnet/hrnet-human-pose-estimation","last_synced_at":"2025-04-07T14:15:45.587Z","repository":{"id":43082188,"uuid":"203081101","full_name":"HRNet/HRNet-Human-Pose-Estimation","owner":"HRNet","description":"This repo is copied from https://github.com/leoxiaobin/deep-high-resolution-net.pytorch","archived":false,"fork":false,"pushed_at":"2021-10-12T22:58:49.000Z","size":1730,"stargazers_count":280,"open_issues_count":18,"forks_count":74,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-03-31T12:08:15.301Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://github.com/leoxiaobin/deep-high-resolution-net.pytorch","language":"Cuda","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HRNet.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-08-19T02:05:36.000Z","updated_at":"2025-03-30T00:45:58.000Z","dependencies_parsed_at":"2022-09-24T05:40:28.735Z","dependency_job_id":null,"html_url":"https://github.com/HRNet/HRNet-Human-Pose-Estimation","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HRNet%2FHRNet-Human-Pose-Estimation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HRNet%2FHRNet-Human-Pose-Estimation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HRNet%2FHRNet-Human-Pose-Estimation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HRNet%2FHRNet-Human-Pose-Estimation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/owners/HRNet","download_url":"https://codeload.github.com/HRNet/HRNet-Human-Pose-Estimation/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247666015,"owners_count":20975788,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-25T23:31:58.155Z","updated_at":"2025-04-07T14:15:45.542Z","avatar_url":"https://github.com/HRNet.png","language":"Cuda","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Deep High-Resolution Representation Learning for Human Pose Estimation (accepted to CVPR 2019)\n## News\n- If you are interested in an internship or research position related to computer vision at ByteDance AI Lab, feel free to contact me (leoxiaobin-at-gmail.com).\n- Our new work [High-Resolution Representations for Labeling Pixels and Regions](https://arxiv.org/abs/1904.04514) is available at [HRNet](https://github.com/HRNet). HRNet has been applied to a wide range of vision tasks, such as [image classification](https://github.com/HRNet/HRNet-Image-Classification), [object detection](https://github.com/HRNet/HRNet-Object-Detection), [semantic segmentation](https://github.com/HRNet/HRNet-Semantic-Segmentation) and [facial landmark detection](https://github.com/HRNet/HRNet-Facial-Landmark-Detection).\n\n## Introduction\nThis is an official PyTorch implementation of [*Deep High-Resolution Representation Learning for Human Pose Estimation*](https://arxiv.org/abs/1902.09212). 
\nIn this work, we are interested in the human pose estimation problem with a focus on learning reliable high-resolution representations. Most existing methods **recover high-resolution representations from low-resolution representations** produced by a high-to-low resolution network. Instead, our proposed network **maintains high-resolution representations** through the whole process.\nWe start from a high-resolution subnetwork as the first stage, gradually add high-to-low resolution subnetworks one by one to form more stages, and connect the multi-resolution subnetworks **in parallel**. We conduct **repeated multi-scale fusions** such that each of the high-to-low resolution representations receives information from other parallel representations over and over, leading to rich high-resolution representations. As a result, the predicted keypoint heatmap is potentially more accurate and spatially more precise. We empirically demonstrate the effectiveness of our network through superior pose estimation results on two benchmark datasets: the COCO keypoint detection dataset and the MPII Human Pose dataset. 
\u003c/br\u003e\n\n![Illustrating the architecture of the proposed HRNet](/figures/hrnet.png)\n## Main Results\n### Results on MPII val\n| Arch               | Head | Shoulder | Elbow | Wrist |  Hip | Knee | Ankle | Mean | Mean@0.1 |\n|--------------------|------|----------|-------|-------|------|------|-------|------|----------|\n| pose_resnet_50     | 96.4 |     95.3 |  89.0 |  83.2 | 88.4 | 84.0 |  79.6 | 88.5 |     34.0 |\n| pose_resnet_101    | 96.9 |     95.9 |  89.5 |  84.4 | 88.4 | 84.5 |  80.7 | 89.1 |     34.0 |\n| pose_resnet_152    | 97.0 |     95.9 |  90.0 |  85.0 | 89.2 | 85.3 |  81.3 | 89.6 |     35.0 |\n| **pose_hrnet_w32** | 97.1 |     95.9 |  90.3 |  86.4 | 89.1 | 87.1 |  83.3 | 90.3 |     37.7 |\n\n### Note:\n- Flip test is used.\n- Input size is 256x256.\n- pose_resnet_[50,101,152] is from our previous work, [*Simple Baselines for Human Pose Estimation and Tracking*](http://openaccess.thecvf.com/content_ECCV_2018/html/Bin_Xiao_Simple_Baselines_for_ECCV_2018_paper.html).\n\n### Results on COCO val2017 with detector having human AP of 56.4 on COCO val2017 dataset\n| Arch               | Input size | #Params | GFLOPs |    AP | AP .5 | AP .75 | AP (M) | AP (L) |    AR | AR .5 | AR .75 | AR (M) | AR (L) |\n|--------------------|------------|---------|--------|-------|-------|--------|--------|--------|-------|-------|--------|--------|--------|\n| pose_resnet_50     |    256x192 | 34.0M   |    8.9 | 0.704 | 0.886 |  0.783 |  0.671 |  0.772 | 0.763 | 0.929 |  0.834 |  0.721 |  0.824 |\n| pose_resnet_50     |    384x288 | 34.0M   |   20.0 | 0.722 | 0.893 |  0.789 |  0.681 |  0.797 | 0.776 | 0.932 |  0.838 |  0.728 |  0.846 |\n| pose_resnet_101    |    256x192 | 53.0M   |   12.4 | 0.714 | 0.893 |  0.793 |  0.681 |  0.781 | 0.771 | 0.934 |  0.840 |  0.730 |  0.832 |\n| pose_resnet_101    |    384x288 | 53.0M   |   27.9 | 0.736 | 0.896 |  0.803 |  0.699 |  0.811 | 0.791 | 0.936 |  0.851 |  0.745 |  0.858 |\n| pose_resnet_152    |    256x192 | 68.6M   |   15.7 
| 0.720 | 0.893 |  0.798 |  0.687 |  0.789 | 0.778 | 0.934 |  0.846 |  0.736 |  0.839 |\n| pose_resnet_152    |    384x288 | 68.6M   |   35.3 | 0.743 | 0.896 |  0.811 |  0.705 |  0.816 | 0.797 | 0.937 |  0.858 |  0.751 |  0.863 |\n| **pose_hrnet_w32** |    256x192 | 28.5M   |    7.1 | 0.744 | 0.905 |  0.819 |  0.708 |  0.810 | 0.798 | 0.942 |  0.865 |  0.757 |  0.858 |\n| **pose_hrnet_w32** |    384x288 | 28.5M   |   16.0 | 0.758 | 0.906 |  0.825 |  0.720 |  0.827 | 0.809 | 0.943 |  0.869 |  0.767 |  0.871 |\n| **pose_hrnet_w48** |    256x192 | 63.6M   |   14.6 | 0.751 | 0.906 |  0.822 |  0.715 |  0.818 | 0.804 | 0.943 |  0.867 |  0.762 |  0.864 |\n| **pose_hrnet_w48** |    384x288 | 63.6M   |   32.9 | 0.763 | 0.908 |  0.829 |  0.723 |  0.834 | 0.812 | 0.942 |  0.871 |  0.767 |  0.876 |\n\n### Note:\n- Flip test is used.\n- Person detector has person AP of 56.4 on COCO val2017 dataset.\n- pose_resnet_[50,101,152] is from our previous work, [*Simple Baselines for Human Pose Estimation and Tracking*](http://openaccess.thecvf.com/content_ECCV_2018/html/Bin_Xiao_Simple_Baselines_for_ECCV_2018_paper.html).\n- GFLOPs is for convolution and linear layers only.\n\n\n### Results on COCO test-dev2017 with detector having human AP of 60.9 on COCO test-dev2017 dataset\n| Arch               | Input size | #Params | GFLOPs |    AP | AP .5 | AP .75 | AP (M) | AP (L) |    AR | AR .5 | AR .75 | AR (M) | AR (L) |\n|--------------------|------------|---------|--------|-------|-------|--------|--------|--------|-------|-------|--------|--------|--------|\n| pose_resnet_152    |    384x288 | 68.6M   |   35.3 | 0.737 | 0.919 |  0.828 |  0.713 |  0.800 | 0.790 | 0.952 |  0.856 |  0.748 |  0.849 |\n| **pose_hrnet_w48** |    384x288 | 63.6M   |   32.9 | 0.755 | 0.925 |  0.833 |  0.719 |  0.815 | 0.805 | 0.957 |  0.874 |  0.763 |  0.863 |\n| **pose_hrnet_w48\\*** |    384x288 | 63.6M   |   32.9 | 0.770 | 0.927 |  0.845 |  0.734 |  0.831 | 0.820 | 0.960 |  0.886 |  0.778 |  0.877 |\n\n### 
Note:\n- Flip test is used.\n- Person detector has person AP of 60.9 on COCO test-dev2017 dataset.\n- pose_resnet_152 is from our previous work, [*Simple Baselines for Human Pose Estimation and Tracking*](http://openaccess.thecvf.com/content_ECCV_2018/html/Bin_Xiao_Simple_Baselines_for_ECCV_2018_paper.html).\n- GFLOPs is for convolution and linear layers only.\n- pose_hrnet_w48\\* denotes a model trained with additional data from [AI challenger](https://challenger.ai/dataset/keypoint).\n\n## Environment\nThe code is developed using Python 3.6 on Ubuntu 16.04. NVIDIA GPUs are needed. The code is developed and tested using 4 NVIDIA P100 GPU cards. Other platforms or GPU cards are not fully tested.\n\n## Quick start\n### Installation\n1. Install PyTorch \u003e= v1.0.0 following the [official instructions](https://pytorch.org/).\n   **Note that if you use a PyTorch version \u003c v1.0.0, you should follow the instructions at \u003chttps://github.com/Microsoft/human-pose-estimation.pytorch\u003e to disable cudnn's implementation of the BatchNorm layer. We encourage you to use a newer PyTorch version (\u003e= v1.0.0).**\n2. Clone this repo; we'll refer to the cloned directory as ${POSE_ROOT}.\n3. Install dependencies:\n   ```\n   pip install -r requirements.txt\n   ```\n4. Make libs:\n   ```\n   cd ${POSE_ROOT}/lib\n   make\n   ```\n5. Install [COCOAPI](https://github.com/cocodataset/cocoapi):\n   ```\n   # COCOAPI=/path/to/clone/cocoapi\n   git clone https://github.com/cocodataset/cocoapi.git $COCOAPI\n   cd $COCOAPI/PythonAPI\n   # Install into global site-packages\n   make install\n   # Alternatively, if you do not have permissions or prefer\n   # not to install the COCO API into global site-packages\n   python3 setup.py install --user\n   ```\n   Note that instructions like # COCOAPI=/path/to/clone/cocoapi indicate that you should pick a path where you'd like to have the software cloned and then set an environment variable (COCOAPI in this case) accordingly.\n6. 
Create the output (training model output) and log (TensorBoard log) directories:\n\n   ```\n   mkdir output\n   mkdir log\n   ```\n\n   Your directory tree should look like this:\n\n   ```\n   ${POSE_ROOT}\n   ├── data\n   ├── experiments\n   ├── lib\n   ├── log\n   ├── models\n   ├── output\n   ├── tools\n   ├── README.md\n   └── requirements.txt\n   ```\n\n7. Download pretrained models from our model zoo ([GoogleDrive](https://drive.google.com/drive/folders/1hOTihvbyIxsm5ygDpbUuJ7O_tzv4oXjC?usp=sharing) or [OneDrive](https://1drv.ms/f/s!AhIXJn_J-blW231MH2krnmLq5kkQ)):\n   ```\n   ${POSE_ROOT}\n    `-- models\n        `-- pytorch\n            |-- imagenet\n            |   |-- hrnet_w32-36af842e.pth\n            |   |-- hrnet_w48-8ef0771d.pth\n            |   |-- resnet50-19c8e357.pth\n            |   |-- resnet101-5d3b4d8f.pth\n            |   `-- resnet152-b121ed2d.pth\n            |-- pose_coco\n            |   |-- pose_hrnet_w32_256x192.pth\n            |   |-- pose_hrnet_w32_384x288.pth\n            |   |-- pose_hrnet_w48_256x192.pth\n            |   |-- pose_hrnet_w48_384x288.pth\n            |   |-- pose_resnet_101_256x192.pth\n            |   |-- pose_resnet_101_384x288.pth\n            |   |-- pose_resnet_152_256x192.pth\n            |   |-- pose_resnet_152_384x288.pth\n            |   |-- pose_resnet_50_256x192.pth\n            |   `-- pose_resnet_50_384x288.pth\n            `-- pose_mpii\n                |-- pose_hrnet_w32_256x256.pth\n                |-- pose_hrnet_w48_256x256.pth\n                |-- pose_resnet_101_256x256.pth\n                |-- pose_resnet_152_256x256.pth\n                `-- pose_resnet_50_256x256.pth\n\n   ```\n   \n### Data preparation\n**For MPII data**, please download from [MPII Human Pose Dataset](http://human-pose.mpi-inf.mpg.de/). The original annotation files are in MATLAB format. 
We have converted them into JSON format; you also need to download the converted files from [OneDrive](https://1drv.ms/f/s!AhIXJn_J-blW00SqrairNetmeVu4) or [GoogleDrive](https://drive.google.com/drive/folders/1En_VqmStnsXMdldXA6qpqEyDQulnmS3a?usp=sharing).\nExtract them under ${POSE_ROOT}/data, and make them look like this:\n```\n${POSE_ROOT}\n|-- data\n`-- |-- mpii\n    `-- |-- annot\n        |   |-- gt_valid.mat\n        |   |-- test.json\n        |   |-- train.json\n        |   |-- trainval.json\n        |   `-- valid.json\n        `-- images\n            |-- 000001163.jpg\n            |-- 000003072.jpg\n```\n\n**For COCO data**, please download from [COCO download](http://cocodataset.org/#download); the 2017 Train/Val sets are needed for COCO keypoints training and validation. We also provide person detection results for COCO val2017 and test-dev2017 to reproduce our multi-person pose estimation results. Please download them from [OneDrive](https://1drv.ms/f/s!AhIXJn_J-blWzzDXoz5BeFl8sWM-) or [GoogleDrive](https://drive.google.com/drive/folders/1fRUDNUDxe9fjqcRZ2bnF_TKMlO0nB_dk?usp=sharing).\nDownload and extract them under ${POSE_ROOT}/data, and make them look like this:\n```\n${POSE_ROOT}\n|-- data\n`-- |-- coco\n    `-- |-- annotations\n        |   |-- person_keypoints_train2017.json\n        |   `-- person_keypoints_val2017.json\n        |-- person_detection_results\n        |   |-- COCO_val2017_detections_AP_H_56_person.json\n        |   |-- COCO_test-dev2017_detections_AP_H_609_person.json\n        `-- images\n            |-- train2017\n            |   |-- 000000000009.jpg\n            |   |-- 000000000025.jpg\n            |   |-- 000000000030.jpg\n            |   |-- ... \n            `-- val2017\n                |-- 000000000139.jpg\n                |-- 000000000285.jpg\n                |-- 000000000632.jpg\n                |-- ... 
\n```\n\n### Training and Testing\n\n#### Testing on MPII dataset using model zoo models ([GoogleDrive](https://drive.google.com/drive/folders/1hOTihvbyIxsm5ygDpbUuJ7O_tzv4oXjC?usp=sharing) or [OneDrive](https://1drv.ms/f/s!AhIXJn_J-blW231MH2krnmLq5kkQ))\n \n\n```\npython tools/test.py \\\n    --cfg experiments/mpii/hrnet/w32_256x256_adam_lr1e-3.yaml \\\n    TEST.MODEL_FILE models/pytorch/pose_mpii/pose_hrnet_w32_256x256.pth\n```\n\n#### Training on MPII dataset\n\n```\npython tools/train.py \\\n    --cfg experiments/mpii/hrnet/w32_256x256_adam_lr1e-3.yaml\n```\n\n#### Testing on COCO val2017 dataset using model zoo models ([GoogleDrive](https://drive.google.com/drive/folders/1hOTihvbyIxsm5ygDpbUuJ7O_tzv4oXjC?usp=sharing) or [OneDrive](https://1drv.ms/f/s!AhIXJn_J-blW231MH2krnmLq5kkQ))\n \n\n```\npython tools/test.py \\\n    --cfg experiments/coco/hrnet/w32_256x192_adam_lr1e-3.yaml \\\n    TEST.MODEL_FILE models/pytorch/pose_coco/pose_hrnet_w32_256x192.pth \\\n    TEST.USE_GT_BBOX False\n```\n\n#### Training on COCO train2017 dataset\n\n```\npython tools/train.py \\\n    --cfg experiments/coco/hrnet/w32_256x192_adam_lr1e-3.yaml\n```\n\n\n### Other applications\nMany other dense prediction tasks, such as segmentation, face alignment, and object detection, have benefited from HRNet. 
More information can be found at [Deep High-Resolution Representation Learning](https://jingdongwang2017.github.io/Projects/HRNet/).\n\n### Citation\nIf you use our code or models in your research, please cite with:\n```\n@inproceedings{sun2019deep,\n  title={Deep High-Resolution Representation Learning for Human Pose Estimation},\n  author={Sun, Ke and Xiao, Bin and Liu, Dong and Wang, Jingdong},\n  booktitle={CVPR},\n  year={2019}\n}\n\n@inproceedings{xiao2018simple,\n    author={Xiao, Bin and Wu, Haiping and Wei, Yichen},\n    title={Simple Baselines for Human Pose Estimation and Tracking},\n    booktitle = {European Conference on Computer Vision (ECCV)},\n    year = {2018}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhrnet%2Fhrnet-human-pose-estimation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhrnet%2Fhrnet-human-pose-estimation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhrnet%2Fhrnet-human-pose-estimation/lists"}