{"id":27961720,"url":"https://github.com/cmu-perceptual-computing-lab/caffe_rtpose","last_synced_at":"2025-05-07T19:11:07.468Z","repository":{"id":68066661,"uuid":"69473500","full_name":"CMU-Perceptual-Computing-Lab/caffe_rtpose","owner":"CMU-Perceptual-Computing-Lab","description":"Realtime C++ code for multi-person pose estimation","archived":false,"fork":false,"pushed_at":"2017-07-18T16:45:12.000Z","size":37207,"stargazers_count":356,"open_issues_count":4,"forks_count":207,"subscribers_count":44,"default_branch":"master","last_synced_at":"2024-07-31T22:44:13.479Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CMU-Perceptual-Computing-Lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-09-28T14:48:17.000Z","updated_at":"2024-03-27T14:09:10.000Z","dependencies_parsed_at":"2023-03-24T22:44:49.861Z","dependency_job_id":null,"html_url":"https://github.com/CMU-Perceptual-Computing-Lab/caffe_rtpose","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CMU-Perceptual-Computing-Lab%2Fcaffe_rtpose","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CMU-Perceptual-Computing-Lab%2Fcaffe_rtpose/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CMU-Perceptual-Computing-Lab%2Fcaffe_rtpose/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
CMU-Perceptual-Computing-Lab%2Fcaffe_rtpose/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CMU-Perceptual-Computing-Lab","download_url":"https://codeload.github.com/CMU-Perceptual-Computing-Lab/caffe_rtpose/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252940934,"owners_count":21828769,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-07T19:11:05.756Z","updated_at":"2025-05-07T19:11:07.434Z","avatar_url":"https://github.com/CMU-Perceptual-Computing-Lab.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"Realtime Multiperson Pose Estimation\n====================================\n## New version released as library!!!\n### Includes hands and face keypoints, Windows version, and it is faster!\n### https://github.com/CMU-Perceptual-Computing-Lab/openpose\n### This repository is no longer maintained and will eventually be closed. Please move to OpenPose!\n\n\n\n## Introduction\nC++ code repo for the ECCV 2016 demo, \"Realtime Multiperson Pose Estimation\", Zhe Cao, Shih-En Wei, Tomas Simon, Yaser Sheikh. Thanks to Ginés Hidalgo Martínez for restructuring the code. \n\nThe [full project repo](https://github.com/ZheC/Multi-Person-Pose-Estimation) includes Matlab and Python versions, and training code.\n\nThis project is under the terms of the [license](LICENSE).\n\n## Quick Start\n1. Required: CUDA \u0026 cuDNN installed on your machine.\n2. If you have installed OpenCV 2.4 in your system, go to step 3. 
If you are using OpenCV 3, uncomment the line `# OPENCV_VERSION := 3` in the file `Makefile.config.Ubuntu14.example` (for Ubuntu 14) and/or `Makefile.config.Ubuntu16.example` (for Ubuntu 15 or 16). In addition, OpenCV 3 does not incorporate the `opencv_contrib` module by default. Assuming you have manually installed it and you need to use it, append `opencv_contrib` at the end of the line `LIBRARIES += opencv_core opencv_highgui opencv_imgproc` in the `Makefile` file.\n3. Build `caffe` \u0026 `rtpose.bin` and download the required Caffe models (script tested on Ubuntu 14.04 \u0026 16.04; it uses all the available cores in your machine):\n```\nchmod u+x install_caffe_and_cpm.sh\n./install_caffe_and_cpm.sh\n```\n\n## Running on a video:\n```\n./build/examples/rtpose/rtpose.bin --video video_file.mp4\n```\n\n## Running on your webcam:\n```\n./build/examples/rtpose/rtpose.bin\n```\n\n## Important options:\n`--help` \u003c--- It displays all the available options.\n\n`--video input.mp4` \u003c--- Input video. If omitted, will use webcam.\n\n`--camera #` \u003c--- Choose webcam number (default: 0).\n\n`--image_dir path_to_images/` \u003c--- Run on all jpg, png, or bmp images in `path_to_images/`. If omitted, will use webcam.\n\n`--write_frames path/`  \u003c--- Render images with this prefix: path/frame%06d.jpg\n\n`--write_json path/`  \u003c--- Output JSON file with joints with this prefix: path/frame%06d.json\n\n`--no_frame_drops` \u003c--- Don't drop frames. Important for producing offline results.\n\n`--no_display` \u003c--- Don't open a display window. Useful if there's no X server.\n\n`--num_gpu 4` \u003c--- Parallelize over this number of GPUs. Default is 1.\n\n`--num_scales 3 --scale_gap 0.15`  \u003c--- Use 3 scales: 1, (1-0.15), (1-0.15*2). 
Default is one scale=1.\n\n(HD)\n`--net_resolution 656x368 --resolution 1280x720` (These are the default values.)\n\n(VGA)\n`--net_resolution 496x368 --resolution 640x480`\n\n`--logtostderr` \u003c--- Log messages to standard error.\n\n## Example:\nRun on a video `vid.mp4`, render image frames as `output/frame%06d.jpg` and output JSON files as `output/frame%06d.json`, using 3 scales (1.00, 0.85, and 0.70), parallelized over 2 GPUs:\n```\n./build/examples/rtpose/rtpose.bin --video vid.mp4 --num_gpu 2 --no_frame_drops --write_frames output/ --write_json output/ --num_scales 3 --scale_gap 0.15\n```\n\n## Output format:\nEach JSON file has a `bodies` array of objects, where each object has an array `joints` containing the joint locations and detection confidence formatted as `x1,y1,c1,x2,y2,c2,...`, where `c` is the confidence in [0,1].\n\n```\n{\n\"version\":0.1,\n\"bodies\":[\n{\"joints\":[1114.15,160.396,0.846207,...]},\n{\"joints\":[...]},\n]\n}\n```\n\nwhere the joint order of the COCO parts is: (see src/rtpose/modelDescriptorFactory.cpp )\n```\n\tpart2name {\n\t\t{0,  \"Nose\"},\n\t\t{1,  \"Neck\"},\n\t\t{2,  \"RShoulder\"},\n\t\t{3,  \"RElbow\"},\n\t\t{4,  \"RWrist\"},\n\t\t{5,  \"LShoulder\"},\n\t\t{6,  \"LElbow\"},\n\t\t{7,  \"LWrist\"},\n\t\t{8,  \"RHip\"},\n\t\t{9,  \"RKnee\"},\n\t\t{10, \"RAnkle\"},\n\t\t{11, \"LHip\"},\n\t\t{12, \"LKnee\"},\n\t\t{13, \"LAnkle\"},\n\t\t{14, \"REye\"},\n\t\t{15, \"LEye\"},\n\t\t{16, \"REar\"},\n\t\t{17, \"LEar\"},\n\t\t{18, \"Bkg\"},\n\t}\n```\n\n## Custom Caffe:\nWe modified and added several Caffe files in `include/caffe` and `src/caffe`. In case you want to use your own Caffe distribution, these are the files we added and modified:\n\n1. Added folders in `include/caffe` and `src/caffe`: `include/caffe/cpm` and `src/caffe/cpm`.\n2. Modified files in `include/caffe` (search for `// CPM extra code:` to find the modified code): `data_transformer.hpp`.\n3. 
Modified files in `src/caffe` (search for `// CPM extra code:` to find the modified code): `data_transformer.cpp`, `proto/caffe.proto` and `util/blocking_queue.cpp`.\n4. Replaced files: `README.md`.\n5. Added files: `install_caffe_and_cpm.sh`, `Makefile.config.Ubuntu14.example` (extracted from `Makefile.config.example`) and `Makefile.config.Ubuntu16.example` (extracted from `Makefile.config.example`).\n6. Other added folders: `model/`, `examples/rtpose`, `/include/rtpose` and `/src/rtpose`.\n7. Other modified files: `Makefile`.\n8. Optional - deleted Caffe files and folders (only to save space): `Makefile.config.example`, `data/`, `examples/` (do not delete `examples/rtpose`) and `models/`.\n\n\n## Custom Caffe layers:\nWe created a few Caffe layers (located in `include/caffe/cpm/layers` and `src/caffe/cpm/layers`):\n\n1. ImResizeLayer: Only used for testing (backward pass not implemented). This layer performs 2-D resize over the 4-D data. I.e., given a 4-D input of size (`num` x `channels` x `height_input` x `width_input`), the layer returns a 4-D output of size (`num` x `channels` x `height_output` x `width_output`). It is independently applied to each dimension of `num` and `channels`. Its parameters are:\n\t1. `factor`: Scaling factor with respect to the input width and height. `factor` is the alternative to the pair of variables [`target_spatial_width`, `target_spatial_height`]. If `factor != 0`, the latter are ignored.\n\t2. `scale_gap` and `start_scale`: These parameters are related and used for doing scale search in testing mode. If `start_scale = 1` (default), the CNN input patch size is the net resolution (set with `--net_resolution`). `scale_gap` is used to calculate the difference between consecutive scales. These parameters are related to the flag `--num_scales`. 
For instance, using `--start_scale 1 --num_scales 3 --scale_gap 0.1` means using 3 scales: 1, 1-0.1, 1-2*0.1; hence the different patch sizes correspond to the net resolution multiplied by these scale values.\n\t3. `target_spatial_height`: Alternative to `factor`. It sets the output height. Ignored if `factor != 0`.\n\t4. `target_spatial_width`: Alternative to `factor`. It sets the output width. Ignored if `factor != 0`.\n2. NmsLayer: Only used for testing (backward pass not implemented). This layer performs 3-D Non-Maximum Suppression over the 4-D data. I.e., given a 4-D input of size (`num` x `channels` x `height` x `width`), it returns a 4-D output of size (`num` x `num_parts` x `max_peaks+1` x `3`). It is independently applied to each dimension of `num`. The second dimension corresponds to the number of limbs (`num_parts`). The third dimension indicates the maximum number of peaks to be analyzed (`max_peaks+1`). Finally, the last one corresponds to the `x`, `y` and `score` values (`3`). Its parameters are:\n\t1. `max_peaks`: The number of peaks to be considered. The last `total_peaks` - `max_peaks` peaks are discarded.\n\t2. `num_parts`: The number of limbs to detect (e.g. 15 for MPI and 18 for COCO).\n\t3. 
`threshold`: Any input value smaller than this threshold is set to 0.\n\n\n## Citation\nPlease cite the paper in your publications if it helps your research:\n\n\n\n    @article{cao2016realtime,\n\t  title={Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},\n\t  author={Zhe Cao and Tomas Simon and Shih-En Wei and Yaser Sheikh},\n\t  journal={arXiv preprint arXiv:1611.08050},\n\t  year={2016}\n\t  }\n\n    @inproceedings{wei2016cpm,\n      author = {Shih-En Wei and Varun Ramakrishna and Takeo Kanade and Yaser Sheikh},\n      booktitle = {CVPR},\n      title = {Convolutional pose machines},\n      year = {2016}\n      }\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcmu-perceptual-computing-lab%2Fcaffe_rtpose","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcmu-perceptual-computing-lab%2Fcaffe_rtpose","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcmu-perceptual-computing-lab%2Fcaffe_rtpose/lists"}