Realtime Multiperson Pose Estimation
====================================
## New version released as a library!!!
### Includes hand and face keypoints, a Windows version, and it is faster!
### https://github.com/CMU-Perceptual-Computing-Lab/openpose
### This repository is no longer maintained and will eventually be closed. Please move to OpenPose!

## Introduction
C++ code repository for the ECCV 2016 demo, "Realtime Multiperson Pose Estimation", by Zhe Cao, Shih-En Wei, Tomas Simon, and Yaser Sheikh. Thanks to Ginés Hidalgo Martínez for restructuring the code.

The [full project repo](https://github.com/ZheC/Multi-Person-Pose-Estimation) includes the MATLAB and Python versions, as well as the training code.

This project is released under the terms of the [license](LICENSE).

## Quick Start
1. Required: CUDA & cuDNN installed on your machine.
2. If you have OpenCV 2.4 installed on your system, go to step 3. If you are using OpenCV 3, uncomment the line `# OPENCV_VERSION := 3` in the file `Makefile.config.Ubuntu14.example` (for Ubuntu 14) and/or `Makefile.config.Ubuntu16.example` (for Ubuntu 15 or 16). In addition, OpenCV 3 does not include the `opencv_contrib` module by default. If you have installed it manually and need to use it, append `opencv_contrib` to the end of the line `LIBRARIES += opencv_core opencv_highgui opencv_imgproc` in the `Makefile`.
3. Build `caffe` & `rtpose.bin`, and download the required Caffe models (the script has been tested on Ubuntu 14.04 & 16.04; it uses all the available cores on your machine):
```
chmod u+x install_caffe_and_cpm.sh
./install_caffe_and_cpm.sh
```

## Running on a video:
```
./build/examples/rtpose/rtpose.bin --video video_file.mp4
```

## Running on your webcam:
```
./build/examples/rtpose/rtpose.bin
```

## Important options:
`--help` <--- Displays all the available options.

`--video input.mp4` <--- Input video. If omitted, will use webcam.

`--camera #` <--- Choose webcam number (default: 0).

`--image_dir path_to_images/` <--- Run on all jpg, png, or bmp images in `path_to_images/`. If omitted, will use webcam.

`--write_frames path/` <--- Write rendered images with this prefix: `path/frame%06d.jpg`

`--write_json path/` <--- Write JSON files with the body joints with this prefix: `path/frame%06d.json`

`--no_frame_drops` <--- Don't drop frames. Important for generating offline results.

`--no_display` <--- Don't open a display window. Useful if there's no X server.

`--num_gpu 4` <--- Parallelize over this number of GPUs. Default is 1.

`--num_scales 3 --scale_gap 0.15` <--- Use 3 scales: 1, (1-0.15), (1-0.15*2). The default is a single scale of 1.

`--net_resolution 656x368 --resolution 1280x720` <--- HD input (these are the default values).

`--net_resolution 496x368 --resolution 640x480` <--- VGA input.

`--logtostderr` <--- Log messages to standard error.

## Example:
Run on a video `vid.mp4` without dropping frames, render image frames as `output/frame%06d.jpg`, write JSON files as `output/frame%06d.json`, and use 3 scales (1.00, 0.85, and 0.70) parallelized over 2 GPUs:
```
./build/examples/rtpose/rtpose.bin --video vid.mp4 --num_gpu 2 --no_frame_drops --write_frames output/ --write_json output/ --num_scales 3 --scale_gap 0.15
```
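
For reference, the scale search is plain arithmetic: scale `i` equals `start_scale - i * scale_gap`, and each scale multiplies the net resolution to obtain the CNN input patch size. A minimal Python sketch under that reading (the exact rounding used by `rtpose.bin` is an assumption here):

```python
# Sketch of the multi-scale arithmetic described above. Assumptions: a start
# scale of 1.0 and simple rounding (rtpose.bin's exact rounding may differ).
def search_scales(num_scales=3, scale_gap=0.15, start_scale=1.0):
    return [start_scale - i * scale_gap for i in range(num_scales)]

net_width, net_height = 656, 368  # --net_resolution 656x368 (the default)
for scale in search_scales():
    print(f"scale {scale:.2f} -> input patch "
          f"{round(net_width * scale)}x{round(net_height * scale)}")
# scale 1.00 -> input patch 656x368
# scale 0.85 -> input patch 558x313
# scale 0.70 -> input patch 459x258
```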

## Output format:
Each JSON file has a `bodies` array of objects, where each object has an array `joints` containing the joint locations and detection confidences, formatted as `x1,y1,c1,x2,y2,c2,...`, where each `c` is a confidence in the range [0,1].

```
{
  "version": 0.1,
  "bodies": [
    {"joints": [1114.15, 160.396, 0.846207, ...]},
    {"joints": [...]}
  ]
}
```

The joint order of the COCO parts (see `src/rtpose/modelDescriptorFactory.cpp`) is:
```
part2name {
{0, "Nose"},
{1, "Neck"},
{2, "RShoulder"},
{3, "RElbow"},
{4, "RWrist"},
{5, "LShoulder"},
{6, "LElbow"},
{7, "LWrist"},
{8, "RHip"},
{9, "RKnee"},
{10, "RAnkle"},
{11, "LHip"},
{12, "LKnee"},
{13, "LAnkle"},
{14, "REye"},
{15, "LEye"},
{16, "REar"},
{17, "LEar"},
{18, "Bkg"},
}
```
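
To post-process the results, each frame's JSON can be read back and the flat `joints` array grouped into `(x, y, confidence)` triples, indexed by the part names above. A minimal Python sketch, assuming the COCO model (18 triples per person) and the file name pattern from the `--write_json` example above:

```python
import json

# COCO part names in the joint order listed above ("Bkg" is the background
# channel and is assumed not to appear in the per-person output).
COCO_PARTS = ["Nose", "Neck", "RShoulder", "RElbow", "RWrist",
              "LShoulder", "LElbow", "LWrist", "RHip", "RKnee",
              "RAnkle", "LHip", "LKnee", "LAnkle", "REye",
              "LEye", "REar", "LEar"]

with open("output/frame000000.json") as f:
    frame = json.load(f)

for person_id, body in enumerate(frame["bodies"]):
    joints = body["joints"]  # flat list: x1, y1, c1, x2, y2, c2, ...
    triples = zip(*[iter(joints)] * 3)  # group into (x, y, c)
    for name, (x, y, c) in zip(COCO_PARTS, triples):
        if c > 0:  # a confidence of 0 likely means the part was not detected
            print(f"person {person_id}: {name} ({x:.1f}, {y:.1f}), conf {c:.2f}")
```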

## Custom Caffe:
We modified and added several Caffe files in `include/caffe` and `src/caffe`. If you want to use your own Caffe distribution, these are the files we added or modified:

1. Added folders in `include/caffe` and `src/caffe`: `include/caffe/cpm` and `src/caffe/cpm`.
2. Modified files in `include/caffe` (search for `// CPM extra code:` to find the modified code): `data_transformer.hpp`.
3. Modified files in `src/caffe` (search for `// CPM extra code:` to find the modified code): `data_transformer.cpp`, `proto/caffe.proto` and `util/blocking_queue.cpp`.
4. Replaced files: `README.md`.
5. Added files: `install_caffe_and_cpm.sh`, `Makefile.config.Ubuntu14.example` (extracted from `Makefile.config.example`) and `Makefile.config.Ubuntu16.example` (extracted from `Makefile.config.example`).
6. Other added folders: `model/`, `examples/rtpose`, `include/rtpose` and `src/rtpose`.
7. Other modified files: `Makefile`.
8. Optional - deleted Caffe files and folders (only to save space): `Makefile.config.example`, `data/`, `examples/` (do not delete `examples/rtpose`) and `models/`.

## Custom Caffe layers:
We created a few Caffe layers (located in `include/caffe/cpm/layers` and `src/caffe/cpm/layers`):

1. ImResizeLayer: Only used for testing (backward pass not implemented). This layer performs a 2-D resize over the 4-D data. I.e., given a 4-D input of size (`num` x `channels` x `height_input` x `width_input`), the layer returns a 4-D output of size (`num` x `channels` x `height_output` x `width_output`). It is applied independently along the `num` and `channels` dimensions. Its parameters are:
    1. `factor`: Scaling factor with respect to the input width and height. `factor` is the alternative to the pair of variables [`target_spatial_width`, `target_spatial_height`]. If `factor != 0`, the latter are ignored.
    2. `scale_gap` and `start_scale`: These related parameters are used for scale search in testing mode. If `start_scale = 1` (default), the CNN input patch size is the net resolution (set with `--net_resolution`). `scale_gap` sets the scale difference between consecutive scales. These parameters are related to the flag `--num_scales`. For instance, `--start_scale 1 --num_scales 3 --scale_gap 0.1` means using 3 scales: 1, 1-0.1, and 1-2*0.1, so the different patch sizes correspond to the net resolution multiplied by these scale values.
    3. `target_spatial_height`: Alternative to `factor`. It sets the output height. Ignored if `factor != 0`.
    4. `target_spatial_width`: Alternative to `factor`. It sets the output width. Ignored if `factor != 0`.
2. NmsLayer: Only used for testing (backward pass not implemented). This layer performs 3-D Non-Maximum Suppression over the 4-D data (a rough sketch of the per-channel operation follows this list). I.e., given a 4-D input of size (`num` x `channels` x `height` x `width`), it returns a 4-D output of size (`num` x `num_parts` x `max_peaks+1` x `3`). It is applied independently along the `num` dimension. The second dimension corresponds to the number of body parts (`num_parts`). The third dimension indicates the maximum number of peaks to be analyzed (`max_peaks+1`). Finally, the last dimension holds the `x`, `y` and `score` values (`3`). Its parameters are:
    1. `max_peaks`: The number of peaks to be considered; any peaks beyond the first `max_peaks` are discarded.
    2. `num_parts`: The number of body parts to detect (e.g. 15 for MPI and 18 for COCO).
    3. `threshold`: Any input value smaller than this threshold is set to 0.
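
As a rough illustration of what NmsLayer computes per heatmap channel (this is a reading of the description above, not the actual CUDA implementation): threshold the map, keep local maxima, and emit up to `max_peaks` rows of `(x, y, score)`. A NumPy sketch:

```python
import numpy as np

def nms_channel(heatmap, threshold=0.05, max_peaks=64):
    """Rough sketch of the per-channel NMS described above.

    The default values here are illustrative only. The real layer packs the
    results of all channels into a (num x num_parts x max_peaks+1 x 3) blob;
    what the extra row stores (e.g. the peak count) is an assumption.
    """
    h = np.where(heatmap >= threshold, heatmap, 0.0)  # values below threshold -> 0
    peaks = []
    for y in range(1, h.shape[0] - 1):
        for x in range(1, h.shape[1] - 1):
            v = h[y, x]
            # keep local maxima over the 4-neighborhood
            if v > 0 and v >= h[y - 1, x] and v >= h[y + 1, x] \
                    and v >= h[y, x - 1] and v >= h[y, x + 1]:
                peaks.append((x, y, v))
    peaks.sort(key=lambda p: p[2], reverse=True)
    return peaks[:max_peaks]  # peaks beyond the first max_peaks are discarded
```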

## Citation
Please cite these papers in your publications if this work helps your research:

```
@article{cao2016realtime,
  title={Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},
  author={Zhe Cao and Tomas Simon and Shih-En Wei and Yaser Sheikh},
  journal={arXiv preprint arXiv:1611.08050},
  year={2016}
}

@inproceedings{wei2016cpm,
  author = {Shih-En Wei and Varun Ramakrishna and Takeo Kanade and Yaser Sheikh},
  booktitle = {CVPR},
  title = {Convolutional pose machines},
  year = {2016}
}
```