{"id":13644660,"url":"https://github.com/wmcnally/kapao","last_synced_at":"2025-04-21T10:33:57.845Z","repository":{"id":38277624,"uuid":"403722591","full_name":"wmcnally/kapao","owner":"wmcnally","description":"KAPAO is an efficient single-stage human pose estimation model that detects keypoints and poses as objects and fuses the detections to predict human poses.","archived":false,"fork":false,"pushed_at":"2022-11-02T10:33:26.000Z","size":153098,"stargazers_count":751,"open_issues_count":30,"forks_count":103,"subscribers_count":27,"default_branch":"master","last_synced_at":"2024-11-09T17:42:03.871Z","etag":null,"topics":["deep-learning","human-pose-estimation","pose-estimation","pytorch","yolo"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wmcnally.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-09-06T18:27:34.000Z","updated_at":"2024-11-05T07:59:22.000Z","dependencies_parsed_at":"2023-01-21T04:19:32.307Z","dependency_job_id":null,"html_url":"https://github.com/wmcnally/kapao","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wmcnally%2Fkapao","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wmcnally%2Fkapao/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wmcnally%2Fkapao/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wmcnally%2Fkapao/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wmcnally","download_url":"h
ttps://codeload.github.com/wmcnally/kapao/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250040581,"owners_count":21365134,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","human-pose-estimation","pose-estimation","pytorch","yolo"],"created_at":"2024-08-02T01:02:10.386Z","updated_at":"2025-04-21T10:33:52.825Z","avatar_url":"https://github.com/wmcnally.png","language":"Python","funding_links":[],"categories":["Multi-Person 2D Pose Estimation","Object Detection Applications","Python"],"sub_categories":["2021"],"readme":"# KAPAO (Keypoints and Poses as Objects)\n\n[Accepted to ECCV 2022](https://arxiv.org/abs/2111.08557)\n\nKAPAO is an efficient single-stage multi-person human pose estimation method that models \n**k**eypoints **a**nd **p**oses **a**s **o**bjects within a dense anchor-based detection framework.\nKAPAO simultaneously detects _pose objects_ and _keypoint objects_ and fuses the detections to predict human poses:\n\n![alt text](./res/kapao_inference.gif)\n\nWhen not using test-time augmentation (TTA), KAPAO is much faster and more accurate than \nprevious single-stage methods like \n[DEKR](https://github.com/HRNet/DEKR), \n[HigherHRNet](https://github.com/HRNet/HigherHRNet-Human-Pose-Estimation),\n[HigherHRNet + SWAHR](https://github.com/greatlog/SWAHR-HumanPose), and\n[CenterGroup](https://github.com/dvl-tum/center-group):\n\n![alt text](./res/accuracy_latency.png)\n\nThis repository contains the official PyTorch implementation for the paper: \u003cbr\u003e\nRethinking Keypoint 
Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation.\n\nOur code was forked from ultralytics/yolov5 at commit [5487451](https://github.com/ultralytics/yolov5/tree/5487451).\n\n### Setup\n1. If you haven't already, [install Anaconda or Miniconda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html).\n2. Create a new conda environment with Python 3.6: `$ conda create -n kapao python=3.6`.\n3. Activate the environment: `$ conda activate kapao`\n4. Clone this repo: `$ git clone https://github.com/wmcnally/kapao.git`\n5. Install the dependencies: `$ cd kapao \u0026\u0026 pip install -r requirements.txt`\n6. Download the trained models: `$ python data/scripts/download_models.py`\n\n## Inference Demos\n\n**Note:** FPS calculations include **all processing** (i.e., including image loading, resizing, inference, plotting / tracking, etc.).\nSee script arguments for inference options.\n\n---\n\n#### Static Image\n\nTo generate the four images in the GIF above:\n1. `$ python demos/image.py --bbox`\n2. `$ python demos/image.py --bbox --pose --face --no-kp-dets`\n3. `$ python demos/image.py --bbox --pose --face --no-kp-dets --kp-bbox`\n4. `$ python demos/image.py --pose --face`\n\n#### Shuffling Video\nKAPAO runs fastest on low resolution video with few people in the frame. This demo runs KAPAO-S on a single-person 480p dance video using an input size of 1024. 
\nThe inference speed is **~9.5 FPS** on our CPU, and **~60 FPS** on our TITAN Xp.\n\n**CPU inference:**\u003cbr\u003e\n![alt text](./res/yBZ0Y2t0ceo_480p_kapao_s_coco_cpu.gif)\u003cbr\u003e\n\nTo display the results in real-time: \u003cbr\u003e \n`$ python demos/video.py --face --display`\n\nTo create the GIF above:\u003cbr\u003e\n`$ python demos/video.py --face --device cpu --gif`\n\n**CPU specs:**\u003cbr\u003e\nIntel Core i7-8700K\u003cbr\u003e\n16GB DDR4 3000MHz\u003cbr\u003e\nSamsung 970 Pro M.2 NVMe SSD\u003cbr\u003e\n\n---\n\n#### Flash Mob Video\nThis demo runs KAPAO-S on a 720p flash mob video using an input size of 1280.\n\n**GPU inference:**\u003cbr\u003e\n![alt text](./res/2DiQUX11YaY_720p_kapao_s_coco_gpu.gif)\u003cbr\u003e\n\nTo display the results in real-time: \u003cbr\u003e \n`$ python demos/video.py --yt-id 2DiQUX11YaY --tag 136 --imgsz 1280 --color 255 0 255 --start 188 --end 196 --display`\n\nTo create the GIF above:\u003cbr\u003e\n`$ python demos/video.py --yt-id 2DiQUX11YaY --tag 136 --imgsz 1280 --color 255 0 255 --start 188 --end 196 --gif`\n\n---\n\n#### Red Light Green Light\nThis demo runs KAPAO-L on a 480p clip from the TV show _Squid Game_ using an input size of 1024.\nThe plotted poses constitute keypoint objects only.\n\n**GPU inference:**\u003cbr\u003e\n![alt text](./res/nrchfeybHmw_480p_kapao_l_coco_gpu.gif)\u003cbr\u003e\n\nTo display the results in real-time:\u003cbr\u003e\n`$ python demos/video.py --yt-id nrchfeybHmw --imgsz 1024 --weights kapao_l_coco.pt --conf-thres-kp 0.01 --kp-obj --face --start 56 --end 72 --display`\n\nTo create the GIF above:\u003cbr\u003e\n`$ python demos/video.py --yt-id nrchfeybHmw --imgsz 1024 --weights kapao_l_coco.pt --conf-thres-kp 0.01 --kp-obj --face --start 56 --end 72 --gif`\n\n---\n\n#### Squash Video\nThis demo runs KAPAO-S on a 1080p slow motion squash video. 
It uses a simple player tracking algorithm based on the frame-to-frame pose differences.\n\n**GPU inference:**\u003cbr\u003e\n![alt text](./res/squash_inference_kapao_s_coco.gif)\u003cbr\u003e\n\nTo display the inference results in real-time: \u003cbr\u003e \n`$ python demos/squash.py --display --fps`\n\nTo create the GIF above:\u003cbr\u003e\n`$ python demos/squash.py --start 42 --end 50 --gif --fps`\n\n---\n\n#### Depth Video\nPose objects generalize well and can even be detected in depth video. \nHere KAPAO-S was run on a depth video from a [fencing action recognition dataset](https://ieeexplore.ieee.org/abstract/document/8076041?casa_token=Zvm7dLIr1rYAAAAA:KrqtVl3NXrJZn05Eb4KGMio-18VPHc3uyDJZSiNJyI7f7oHQ5V2iwB7bK4mCJCmN83NrRl4P). \n\n![alt text](./res/2016-01-04_21-33-35_Depth_kapao_s_coco_gpu.gif)\u003cbr\u003e\n\nThe depth video above can be downloaded directly from [here](https://drive.google.com/file/d/1n4so5WN6snyCYxeUk4xX1glADqQuitXP/view?usp=sharing).\nTo create the GIF above:\u003cbr\u003e\n`$ python demos/video.py -p 2016-01-04_21-33-35_Depth.avi --face --start 0 --end -1 --gif --gif-size 480 360`\n\n---\n\n#### Web Demo\nA web demo was integrated into [Hugging Face Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio) (credit to [@AK391](https://github.com/AK391)). 
\nIt uses KAPAO-S to run CPU inference on short video clips.\n\n\n## COCO Experiments\nDownload the COCO dataset:  `$ sh data/scripts/get_coco_kp.sh`\n\n### Validation (without TTA)\n- KAPAO-S (63.0 AP): `$ python val.py --rect`\n- KAPAO-M (68.5 AP): `$ python val.py --rect --weights kapao_m_coco.pt`\n- KAPAO-L (70.6 AP): `$ python val.py --rect --weights kapao_l_coco.pt`\n\n### Validation (with TTA)\n- KAPAO-S (64.3 AP): `$ python val.py --scales 0.8 1 1.2 --flips -1 3 -1`\n- KAPAO-M (69.6 AP): `$ python val.py --weights kapao_m_coco.pt \\ `\u003cbr\u003e\n`--scales 0.8 1 1.2 --flips -1 3 -1` \n- KAPAO-L (71.6 AP): `$ python val.py --weights kapao_l_coco.pt \\ `\u003cbr\u003e\n`--scales 0.8 1 1.2 --flips -1 3 -1` \n\n### Testing\n- KAPAO-S (63.8 AP): `$ python val.py --scales 0.8 1 1.2 --flips -1 3 -1 --task test` \n- KAPAO-M (68.8 AP): `$ python val.py --weights kapao_m_coco.pt \\ `\u003cbr\u003e\n`--scales 0.8 1 1.2 --flips -1 3 -1 --task test` \n- KAPAO-L (70.3 AP): `$ python val.py --weights kapao_l_coco.pt \\ `\u003cbr\u003e\n`--scales 0.8 1 1.2 --flips -1 3 -1 --task test` \n\n\n### Training\nThe following commands were used to train the KAPAO models on 4 V100s with 32GB memory each.\n\nKAPAO-S:\n```\npython -m torch.distributed.launch --nproc_per_node 4 train.py \\\n--img 1280 \\\n--batch 128 \\\n--epochs 500 \\\n--data data/coco-kp.yaml \\\n--hyp data/hyps/hyp.kp-p6.yaml \\\n--val-scales 1 \\\n--val-flips -1 \\\n--weights yolov5s6.pt \\\n--project runs/s_e500 \\\n--name train \\\n--workers 128\n```\n\nKAPAO-M:\n```\npython train.py \\\n--img 1280 \\\n--batch 72 \\\n--epochs 500 \\\n--data data/coco-kp.yaml \\\n--hyp data/hyps/hyp.kp-p6.yaml \\\n--val-scales 1 \\\n--val-flips -1 \\\n--weights yolov5m6.pt \\\n--project runs/m_e500 \\\n--name train \\\n--workers 128\n```\n\nKAPAO-L:\n```\npython train.py \\\n--img 1280 \\\n--batch 48 \\\n--epochs 500 \\\n--data data/coco-kp.yaml \\\n--hyp data/hyps/hyp.kp-p6.yaml \\\n--val-scales 1 \\\n--val-flips -1 
\\\n--weights yolov5l6.pt \\\n--project runs/l_e500 \\\n--name train \\\n--workers 128\n```\n\n**Note:** [DDP](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html) is usually recommended but we found training was less stable for KAPAO-M/L using DDP. We are investigating this issue.\n\n## CrowdPose Experiments\n- Install the [CrowdPose API](https://github.com/Jeff-sjtu/CrowdPose/tree/master/crowdpose-api) to your conda environment: \u003cbr\u003e\n`$ cd .. \u0026\u0026 git clone https://github.com/Jeff-sjtu/CrowdPose.git` \u003cbr\u003e\n`$ cd CrowdPose/crowdpose-api/PythonAPI \u0026\u0026 sh install.sh \u0026\u0026 cd ../../../kapao`\n- Download the CrowdPose dataset:  `$ sh data/scripts/get_crowdpose.sh`\n\n### Testing\n- KAPAO-S (63.8 AP): `$ python val.py --data crowdpose.yaml \\ `\u003cbr\u003e\n`--weights kapao_s_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1` \n- KAPAO-M (67.1 AP): `$ python val.py --data crowdpose.yaml \\ `\u003cbr\u003e\n`--weights kapao_m_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1`\n- KAPAO-L (68.9 AP): `$ python val.py --data crowdpose.yaml \\ `\u003cbr\u003e\n`--weights kapao_l_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1`\n\n### Training\nThe following commands were used to train the KAPAO models on 4 V100s with 32GB memory each. \nTraining was performed on the `trainval` split with no validation. 
\nThe test results above were generated using the last model checkpoint.\n\nKAPAO-S:\n```\npython -m torch.distributed.launch --nproc_per_node 4 train.py \\\n--img 1280 \\\n--batch 128 \\\n--epochs 300 \\\n--data data/crowdpose.yaml \\\n--hyp data/hyps/hyp.kp-p6.yaml \\\n--val-scales 1 \\\n--val-flips -1 \\\n--weights yolov5s6.pt \\\n--project runs/cp_s_e300 \\\n--name train \\\n--workers 128 \\\n--noval\n```\nKAPAO-M:\n```\npython train.py \\\n--img 1280 \\\n--batch 72 \\\n--epochs 300 \\\n--data data/crowdpose.yaml \\\n--hyp data/hyps/hyp.kp-p6.yaml \\\n--val-scales 1 \\\n--val-flips -1 \\\n--weights yolov5m6.pt \\\n--project runs/cp_m_e300 \\\n--name train \\\n--workers 128 \\\n--noval\n```\nKAPAO-L:\n```\npython train.py \\\n--img 1280 \\\n--batch 48 \\\n--epochs 300 \\\n--data data/crowdpose.yaml \\\n--hyp data/hyps/hyp.kp-p6.yaml \\\n--val-scales 1 \\\n--val-flips -1 \\\n--weights yolov5l6.pt \\\n--project runs/cp_l_e300 \\\n--name train \\\n--workers 128 \\\n--noval\n```\n\n## Acknowledgements\nThis work was supported in part by Compute Canada, the Canada Research Chairs Program, \nthe Natural Sciences and Engineering Research Council of Canada, \na Microsoft Azure Grant, and an NVIDIA Hardware Grant.\n\nIf you find this repo helpful in your research, please cite our paper:\n```\n@article{mcnally2021kapao,\n  title={Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation},\n  author={McNally, William and Vats, Kanav and Wong, Alexander and McPhee, John},\n  journal={arXiv preprint arXiv:2111.08557},\n  year={2021}\n}\n```\nPlease also consider citing our previous works:\n```\n@inproceedings{mcnally2021deepdarts,\n  title={DeepDarts: Modeling Keypoints as Objects for Automatic Scorekeeping in Darts using a Single Camera},\n  author={McNally, William and Walters, Pascale and Vats, Kanav and Wong, Alexander and McPhee, John},\n  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision 
and Pattern Recognition},\n  pages={4547--4556},\n  year={2021}\n}\n\n@article{mcnally2021evopose2d,\n  title={EvoPose2D: Pushing the Boundaries of 2D Human Pose Estimation Using Accelerated Neuroevolution With Weight Transfer},\n  author={McNally, William and Vats, Kanav and Wong, Alexander and McPhee, John},\n  journal={IEEE Access},\n  volume={9},\n  pages={139403--139414},\n  year={2021},\n  publisher={IEEE}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwmcnally%2Fkapao","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwmcnally%2Fkapao","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwmcnally%2Fkapao/lists"}