https://github.com/henriktrom/pose_inference

A high-performance, multi-threaded C++ pipeline for real-time multi-camera keypoint detection.

cpp keypoint-detection multi-threading open-source real-time research-software rtmpose

Host: GitHub
URL: https://github.com/henriktrom/pose_inference
Owner: HenrikTrom
License: cc0-1.0
Created: 2025-05-27T13:14:52.000Z
Default Branch: main
Last Pushed: 2025-08-05T14:50:03.000Z
Last Synced: 2025-09-05T11:21:24.969Z
Topics: cpp, keypoint-detection, multi-threading, open-source, real-time, research-software, rtmpose
Language: C++
Homepage:
Size: 6.01 MB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: Readme.md
- License: LICENSE
- Citation: Citation.cff

# 🧍‍♂️ Pose-Inference

[![DOI](https://zenodo.org/badge/991307682.svg)](https://zenodo.org/badge/latestdoi/991307682)

A high-performance, multi-threaded C++ pipeline for **real-time multi-camera keypoint detection**.

Developed as part of [my PhD thesis](todo-thesis-link), this module enables **3D human pose estimation** from bounding box proposals generated by my [detection pipeline](https://github.com/HenrikTrom/detection-inference).

This module supports deployment in robotic systems for real-time tracking and perception and is part of my **ROS/ROS2** [real-time 3D tracker](https://github.com/HenrikTrom/real-time-3D-tracking) and its [docker-implementation](https://github.com/HenrikTrom/ROSTrack-RT-3D).

![System Setup](content/4cams.gif)

## 🧪 Test results

* Intel(R) Xeon(R) W-2145 CPU @ 3.70GHz, NVIDIA 2080 Super, Ubuntu 20.04, CUDA 11.8, TensorRT 8.6.1.6, OpenCV 4.10.0 with RTMPose and `BATCH_SIZE` of 5 -> **Preprocess: ~1ms, NN inference: ~4ms, Postprocess: ~1ms (1000 samples)**
* AMD Ryzen 9 7900X3D CPU @ 4.40GHz, NVIDIA 4070 Super, Ubuntu 20.04, CUDA 12.4, TensorRT 10.9.0.34, OpenCV 4.10.0 with YOLOv8 and `BATCH_SIZE` of 5 -> **Preprocess: <1ms, NN inference: ~2ms, Postprocess: <1ms (1000 samples)**
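
The per-stage figures above are averages over 1000 samples. As a rough illustration of how such an average can be taken, here is a minimal sketch using `std::chrono`. The helper `mean_stage_ms` is hypothetical and not part of this repository; the benchmark executable may measure time differently.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper (not part of this repository): calls `stage`
// `n_samples` times and returns the mean wall-clock time per call in ms.
// If `stage` launches asynchronous GPU work, it must synchronize before
// returning; otherwise only the launch overhead is measured.
inline double mean_stage_ms(const std::function<void()>& stage,
                            std::size_t n_samples = 1000) {
    using Clock = std::chrono::steady_clock;
    if (n_samples == 0) return 0.0;
    const auto start = Clock::now();
    for (std::size_t i = 0; i < n_samples; ++i) {
        stage();  // e.g. preprocess, inference, or postprocess for one batch
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count() / static_cast<double>(n_samples);
}
```

Wrapping each stage in its own call, for example `mean_stage_ms([&] { /* run one stage */ })`, gives one number per stage in the same form as the results listed above.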

## 📑 Citation

If you use this software, please cite it using the GitHub **“Cite this repository”** button at the top right of the repository page.

## Environment

This repository is designed to run inside the Docker 🐳 container provided here:
[OpenCV-TRT-DEV](https://github.com/HenrikTrom/Docker-OpenCV-TensorRT-Dev)

It includes all necessary dependencies (CUDA, cuDNN, OpenCV, TensorRT, CMake).

### Prerequisites

In addition to the libraries installed in the container, this project relies on:

- 📦 [tensorrt-cpp-api (fork)](https://github.com/HenrikTrom/tensorrt-cpp-api)
*(Originally by [cyrusbehr](https://github.com/cyrusbehr/tensorrt-cpp-api))*
- 🧵 [cpp-utils](https://github.com/HenrikTrom/cpp_utils)
*(Handles multithreading, JSON config parsing, and utility tools)*

#### Environment Variables

Set the required variables (usually done via `.env` or your shell):

```bash
OPENCV_VERSION=4.10.0 # Your installed OpenCV version
N_CAMERAS=5 # Optional: sets system-wide batch size
```

> If `N_CAMERAS` is not set, CMake will default to a batch size of **5**.

Use the `trt.sh` script in `./scripts` to convert your .onnx model to a fixed batch size.

#### Notes

* The batch size is treated as a **hardware constraint**, defined by the number of connected cameras.
* You can change the default batch size in `CMakeLists.txt` to fit your system.
* Although this repo is optimized for YOLOv8 models, you can modify the post-processing stage to support **any ONNX-compatible detection model**.
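
To illustrate the notes above, the following sketch shows one way a batch size fixed at build time could be consumed on the C++ side. It assumes the build passes the camera count as a compile definition named `BATCH_SIZE` (the name used in the test results); the actual macro name and mechanism in this repository's `CMakeLists.txt` may differ, and `is_full_batch` is a hypothetical helper.

```cpp
#include <cstddef>
#include <vector>

// Assumption: the build system defines BATCH_SIZE from N_CAMERAS,
// e.g. via -DBATCH_SIZE=<N_CAMERAS>. The macro name is illustrative.
#ifndef BATCH_SIZE
#define BATCH_SIZE 5  // mirrors the documented default of 5 cameras
#endif

constexpr std::size_t kBatchSize = BATCH_SIZE;
static_assert(kBatchSize > 0, "batch size must be at least one camera");

// Hypothetical check: true when exactly one frame per camera is present,
// i.e. the batch matches the fixed size the engine was built for.
template <typename Frame>
bool is_full_batch(const std::vector<Frame>& frames) {
    return frames.size() == kBatchSize;
}
```

Treating the batch size as a compile-time constant lets a mismatch between the camera count and the model's batch dimension be caught before any inference call is made.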

### Installation

Run the build and installation script:

```bash
sudo ./build_install.sh
```

This will configure the build system, compile the inference pipeline, and generate the binaries.

---

### Usage

Before using the pipeline, ensure the following:

#### Environment Variables

These should be defined in your `.env` file or shell environment:

```bash
OPENCV_VERSION=4.10.0 # Your installed OpenCV version
N_CAMERAS=5 # Optional: sets batch size (defaults to 5)
```

> If `N_CAMERAS` is not set, the system assumes a default of **5** cameras.

---

#### 🧠 Model Requirements

This repo is designed for trained [RTMPose](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmpose) models exported as `.onnx`.
The model must be exported with a **fixed batch size** matching your multi-camera setup.
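
For readers who need to adapt the post-processing stage: RTMPose models typically emit two 1-D score vectors per keypoint (SimCC), one along x and one along y. The sketch below shows how one keypoint is commonly decoded from them. It is not taken from this repository's code; the `Keypoint` struct, the function name, and the default `split_ratio` of 2.0 are assumptions to check against your exported model.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical result type; coordinates are in model-input pixels.
struct Keypoint {
    float x;
    float y;
    float score;
};

// Decodes one keypoint from its SimCC score vectors.
// `split_ratio` is the number of bins per input pixel (assumed 2.0 here).
inline Keypoint decode_simcc(const std::vector<float>& simcc_x,
                             const std::vector<float>& simcc_y,
                             float split_ratio = 2.0f) {
    if (simcc_x.empty() || simcc_y.empty()) return {0.f, 0.f, 0.f};
    const auto best_x = std::max_element(simcc_x.begin(), simcc_x.end());
    const auto best_y = std::max_element(simcc_y.begin(), simcc_y.end());
    const auto bin_x = std::distance(simcc_x.begin(), best_x);
    const auto bin_y = std::distance(simcc_y.begin(), best_y);
    // The argmax bin gives the location; the weaker of the two peaks is
    // used as a conservative confidence score.
    return {static_cast<float>(bin_x) / split_ratio,
            static_cast<float>(bin_y) / split_ratio,
            std::min(*best_x, *best_y)};
}
```

The decoded coordinates still refer to the cropped, resized model input, so they have to be mapped back through the bounding box to obtain image coordinates.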

Adapt the configuration files in the `cfg/` folder to reflect your system and model setup.

You can change the default batch size in `CMakeLists.txt` if needed.

---

## Executables

### Benchmark

After configuring your setup:

```bash
./build/inference_benchmark
```

This runs the inference pipeline, processes multi-camera input, and saves images with overlaid bounding boxes and labels to the `inputs/` folder.

### Video Inference Export

This executable iterates over a directory of synchronized `.mp4` videos and saves the results for each video to a `.json` file.

The example below assumes the `.mp4` videos are in an arbitrary directory, here `./test`:

```bash
./build/video_inference_export test
```

### BBox Overlay

This executable iterates over a directory of synchronized `.mp4` videos and the inference results exported for them by `./build/video_inference_export`. It generates new `.mp4` videos with the detections drawn in, plus a tiled video similar to the GIF at the top of this README.

The example below assumes the `.mp4` videos and `.json` files are in an arbitrary directory, here `./test`:

```bash
./build/bbox_overlay test
```
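
For the tiled video, the camera views have to be arranged in a grid. The sketch below shows one simple layout rule, the smallest near-square grid that holds all views. This is an assumption for illustration only; `TileGrid` and `tile_grid` are hypothetical names, and the layout used by `bbox_overlay` may differ.

```cpp
#include <cstddef>

// Hypothetical grid description: number of tile columns and rows.
struct TileGrid {
    std::size_t cols;
    std::size_t rows;
};

// Assumed layout rule: the smallest near-square grid with room for n_views.
inline TileGrid tile_grid(std::size_t n_views) {
    if (n_views == 0) return {0, 0};
    std::size_t cols = 1;
    while (cols * cols < n_views) ++cols;                  // ceil(sqrt(n_views))
    const std::size_t rows = (n_views + cols - 1) / cols;  // ceiling division
    return {cols, rows};
}
```

With this rule, four cameras give a 2 x 2 grid and five cameras give a 3 x 2 grid with one empty tile.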

---

## 📷 Applications

This inference module is optimized for:

* 3D multi-camera human pose estimation
* Online tracking and interaction
* Real-time robotics perception pipelines
