https://github.com/henriktrom/pose_inference
A high-performance, multi-threaded C++ pipeline for real-time multi-camera keypoint detection.
cpp keypoint-detection multi-threading open-source real-time research-software rtmpose
- Host: GitHub
- URL: https://github.com/henriktrom/pose_inference
- Owner: HenrikTrom
- License: cc0-1.0
- Created: 2025-05-27T13:14:52.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-08-05T14:50:03.000Z (10 months ago)
- Last Synced: 2025-09-05T11:21:24.969Z (9 months ago)
- Topics: cpp, keypoint-detection, multi-threading, open-source, real-time, research-software, rtmpose
- Language: C++
- Homepage:
- Size: 6.01 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
- License: LICENSE
- Citation: Citation.cff
README
# 🧍‍♂️ Pose-Inference
[DOI](https://zenodo.org/badge/latestdoi/991307682)
A high-performance, multi-threaded C++ pipeline for **real-time multi-camera keypoint detection**.
Developed as part of [my PhD thesis](todo-thesis-link), this module enables **3D human pose estimation** from bounding box proposals generated by my [detection pipeline](https://github.com/HenrikTrom/detection-inference).
This module supports deployment in robotic systems for real-time tracking and perception and is part of my **ROS/ROS2** [real-time 3D tracker](https://github.com/HenrikTrom/real-time-3D-tracking) and its [docker-implementation](https://github.com/HenrikTrom/ROSTrack-RT-3D).

## 🧪 Test results
* Intel(R) Xeon(R) W-2145 CPU @ 3.70GHz, NVIDIA RTX 2080 Super, Ubuntu 20.04, CUDA 11.8, TensorRT 8.6.1.6, OpenCV 4.10.0 with RTMPose and `BATCH_SIZE` of 5 -> **Preprocess: ~1 ms, NN inference: ~4 ms, Postprocess: ~1 ms (1000 samples)**
* AMD Ryzen 9 7900X3D CPU @ 4.40GHz, NVIDIA RTX 4070 Super, Ubuntu 20.04, CUDA 12.4, TensorRT 10.9.0.34, OpenCV 4.10.0 with YOLOv8 and `BATCH_SIZE` of 5 -> **Preprocess: <1 ms, NN inference: ~2 ms, Postprocess: <1 ms (1000 samples)**
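
The figures above are per-stage latencies over 1000 samples. As a rough illustration of how such a number can be obtained, here is a minimal sketch that averages the wall-clock time of one stage with `std::chrono`. The helper `mean_latency_ms` is hypothetical and not part of this repository; for a GPU stage, the callable would also have to block until the device has finished, otherwise only the launch overhead is measured.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper (not part of this repository): runs `stage` `n_samples`
// times and returns the mean wall-clock duration of one call in milliseconds.
double mean_latency_ms(const std::function<void()>& stage, std::size_t n_samples) {
    if (n_samples == 0) {
        return 0.0;  // nothing measured
    }
    using clock = std::chrono::steady_clock;  // monotonic, suited for intervals
    std::chrono::duration<double, std::milli> total{0.0};
    for (std::size_t i = 0; i < n_samples; ++i) {
        const auto start = clock::now();
        stage();
        total += clock::now() - start;  // accumulate elapsed time in ms
    }
    return total.count() / static_cast<double>(n_samples);
}
```

A call such as `mean_latency_ms(preprocess_once, 1000)` would then report the mean preprocessing time, where `preprocess_once` stands for whatever callable wraps a single stage invocation.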
## 📑 Citation
If you use this software, please use the GitHub **“Cite this repository”** button at the top right of the repository page.
## Environment
This repository is designed to run inside the Docker 🐳 container provided here:
[OpenCV-TRT-DEV](https://github.com/HenrikTrom/Docker-OpenCV-TensorRT-Dev)
It includes all necessary dependencies (CUDA, cuDNN, OpenCV, TensorRT, CMake).
### Prerequisites
In addition to the libraries installed in the container, this project relies on:
- 📦 [tensorrt-cpp-api (fork)](https://github.com/HenrikTrom/tensorrt-cpp-api)
*(Originally by [cyrusbehr](https://github.com/cyrusbehr/tensorrt-cpp-api))*
- 🧵 [cpp-utils](https://github.com/HenrikTrom/cpp_utils)
*(Handles multithreading, JSON config parsing, and utility tools)*
#### Environment Variables
Set the required variables (usually done via `.env` or your shell):
```bash
OPENCV_VERSION=4.10.0 # Your installed OpenCV version
N_CAMERAS=5 # Optional: sets system-wide batch size
```
> If `N_CAMERAS` is not set, CMake will default to a batch size of **5**.

Use the `trt.sh` script in `./scripts` to convert your `.onnx` model to a fixed batch size.
#### Notes
* The batch size is treated as a **hardware constraint**, defined by the number of connected cameras.
* You can change the default batch size in `CMakeLists.txt` to fit your system.
* Although this repo is optimized for YOLOv8 models, you can modify the post-processing stage to support **any ONNX-compatible detection model**.
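
To illustrate what treating the batch size as a hardware constraint can look like in code, the sketch below fixes the batch size at compile time and rejects any input that does not contain exactly one frame per camera. It assumes the build forwards the camera count as a compile definition named `N_CAMERAS`; how this repository actually passes the value is not shown here, and `Frame`, `kBatchSize`, and `make_batch` are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Assumption: the build passes the camera count as a compile definition,
// e.g. -DN_CAMERAS=5. Fall back to 5, the default stated above.
#ifndef N_CAMERAS
#define N_CAMERAS 5
#endif

constexpr std::size_t kBatchSize = N_CAMERAS;
static_assert(kBatchSize > 0, "at least one camera is required");

// Hypothetical stand-in for one camera image.
struct Frame {
    int camera_id = -1;
};

// Packs exactly one frame per camera into a fixed-size batch.
// Throws std::invalid_argument if the frame count does not match kBatchSize.
std::array<Frame, kBatchSize> make_batch(const std::vector<Frame>& frames) {
    if (frames.size() != kBatchSize) {
        throw std::invalid_argument("expected exactly one frame per camera");
    }
    std::array<Frame, kBatchSize> batch{};
    std::copy(frames.begin(), frames.end(), batch.begin());
    return batch;
}
```

Because the size is part of the type, a mismatch between the number of cameras and the batch size the model was exported with surfaces early instead of at inference time.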
### Installation
Run the build and installation script:
```bash
sudo ./build_install.sh
```
This will configure the build system, compile the inference pipeline, and generate the binaries.
---
### Usage
Before using the pipeline, ensure the following:
#### Environment Variables
These should be defined in your `.env` file or shell environment:
```bash
OPENCV_VERSION=4.10.0 # Your installed OpenCV version
N_CAMERAS=5 # Optional: sets batch size (defaults to 5)
```
> If `N_CAMERAS` is not set, the system assumes a default of **5** cameras.
---
#### 🧠 Model Requirements
This repo is designed for trained [RTMPose](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmpose) models exported as `.onnx`.
The model must be exported with a **fixed batch size** matching your multi-camera setup.
Adapt the configuration files in the `cfg/` folder to reflect your system and model setup.
You can change the default batch size in `CMakeLists.txt` if needed.
---
## Executables
### Benchmark
After configuring your setup:
```bash
./build/inference_benchmark
```
This runs the inference pipeline, processes multi-camera input, and saves images with overlaid bounding boxes and labels to the `inputs/` folder.
### Video Inference Export
This executable iterates over a directory of synchronized `.mp4` videos and saves the result for each video in a `.json` file.
This example assumes `.mp4` videos in an arbitrary `./test` directory:
```bash
./build/video_inference_export test
```
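
For readers who want to script around this step, the following sketch shows one way to enumerate the `.mp4` files of a directory in a stable order using `std::filesystem` (C++17). Both helpers are hypothetical and not taken from this repository; in particular, naming each result after its video with a `.json` extension is an assumption, not the executable's documented behavior.

```cpp
#include <algorithm>
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: returns all regular files ending in ".mp4" inside
// `dir`, sorted by path. Returns an empty list if `dir` is not a directory.
std::vector<fs::path> list_mp4_files(const fs::path& dir) {
    std::vector<fs::path> videos;
    if (!fs::is_directory(dir)) {
        return videos;
    }
    for (const auto& entry : fs::directory_iterator(dir)) {
        if (entry.is_regular_file() && entry.path().extension() == ".mp4") {
            videos.push_back(entry.path());
        }
    }
    // directory_iterator order is unspecified, so sort for reproducibility.
    std::sort(videos.begin(), videos.end());
    return videos;
}

// Assumed naming scheme: the result file sits next to the video and shares
// its stem, e.g. test/cam0.mp4 -> test/cam0.json.
fs::path result_path_for(const fs::path& video) {
    fs::path out = video;
    out.replace_extension(".json");
    return out;
}
```

Sorting matters for synchronized recordings, since a fixed file order keeps the mapping between video and camera index stable across runs.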
### BBox Overlay
This executable iterates over a directory of synchronized `.mp4` videos and exported inference results (from `./build/video_inference_export`). It generates new `.mp4` videos with detections and a tiled video similar to the .gif in this README.
This example assumes `.mp4` videos and `.json` files in an arbitrary `./test` directory:
```bash
./build/bbox_overlay test
```
---
## 📷 Applications
This inference module is optimized for:
* 3D multi-camera human pose estimation
* Online tracking and interaction
* Real-time robotics perception pipelines