# tiny inference engine
Welcome to **tiny inference engine**!
This repository contains a small Client/Server system to perform Machine Learning inference on high-load systems.
The system makes it possible to scale CPU & GPU resources while serving multiple clients from a single server instance: it is mainly suited to distributed scenarios and demanding applications.

## Server
The server application is written in modern, cross-platform C++ and runs on the three major platforms: Windows and Linux with both CPU and GPU support, and macOS with CPU support only.
It is also possible to run the server in a Docker container.

## Client
The client is available as a library so that it can be consumed by any application.
It is written in C++ and can run on Windows, Linux and macOS.
See the examples for more details.
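
As a rough illustration, consuming the client library might look like the sketch below. Every name in it (the `tie/client.hpp` header, `tie::InferenceClient`, `infer`, the port, the model name) is a hypothetical placeholder, not the actual API; refer to the examples folder for the real interface.

```cpp
// Hypothetical usage sketch only: all identifiers below are illustrative,
// not the project's real API. See the examples folder for actual usage.
#include <iostream>
#include <vector>

#include <tie/client.hpp>  // hypothetical header

int main()
{
    // Connect to a running server instance over gRPC (address is an assumption).
    tie::InferenceClient client{"localhost:50051"};

    // Preprocessing happens client-side: shape the tensor for the target model.
    std::vector<float> input(1 * 3 * 224 * 224, 0.0f);

    // Issue the inference call through the client library.
    const auto output = client.infer("my_model", input);

    std::cout << "received " << output.size() << " output values\n";
}
```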

### Communication Protocol: gRPC & HTTP
The inter-process communication layer uses gRPC, since it provides better performance than plain HTTP.
For this reason, inference calls can be made only through the client library.
The server, however, also exposes some HTTP endpoints that provide metrics and allow dynamic tuning at runtime, so that other tools (e.g. curl) can be used to interact with it.
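
For context, a gRPC client typically wraps a channel plus a stub generated from the service's .proto files. The snippet below uses the standard grpc++ API to open a channel; the endpoint, and the assumption that the client library is structured this way, are illustrative, not project specifics.

```cpp
// Generic illustration of how a gRPC client connects to a server.
#include <grpcpp/grpcpp.h>
#include <memory>

int main()
{
    // Create an insecure channel to the server's gRPC endpoint (address assumed).
    std::shared_ptr<grpc::Channel> channel =
        grpc::CreateChannel("localhost:50051", grpc::InsecureChannelCredentials());

    // A service stub generated from the project's .proto files would be
    // constructed from this channel and used to issue inference RPCs.
    return channel ? 0 : 1;
}
```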

### Machine Learning Backend: ONNX Runtime
Currently, ONNX Runtime is the only supported server backend.
The client is completely decoupled from the server backend: client applications must apply their own preprocessing in order to send properly formatted requests for the model deployed on the server side.
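
For instance, a client feeding an image-classification model might convert raw pixels into the tensor layout the model expects. The sketch below assumes a model taking a normalized 3x224x224 float tensor in NCHW layout with ImageNet-style statistics; the actual shape and normalization depend entirely on the model deployed on the server.

```cpp
// Client-side preprocessing sketch: interleaved HWC uint8 RGB -> planar CHW
// float tensor, scaled to [0,1] and normalized per channel.
#include <cstdint>
#include <vector>

std::vector<float> preprocess(const std::vector<std::uint8_t>& rgb_pixels,
                              int width, int height)
{
    // ImageNet-style per-channel mean/stddev; adjust for the deployed model.
    constexpr float mean[3] = {0.485f, 0.456f, 0.406f};
    constexpr float stddev[3] = {0.229f, 0.224f, 0.225f};

    std::vector<float> tensor(3 * width * height);
    for (int c = 0; c < 3; ++c)
    {
        for (int i = 0; i < width * height; ++i)
        {
            // Pixel i, channel c sits at index i*3+c in the interleaved buffer.
            const float value = rgb_pixels[i * 3 + c] / 255.0f;
            tensor[c * width * height + i] = (value - mean[c]) / stddev[c];
        }
    }
    return tensor;
}
```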

---

## Requirements
- Python3 (> 3.8)
- Conan (> 1.44)
- CMake (> 3.16)
- Ninja (> 1.9)
- C++17 compiler (see the OS-specific instructions below)

### Ubuntu 20.04
- GCC (> 9.3.0)
- Clang (> 11.0.0)

```console
apt install python3.8 python3-pip cmake ninja-build
pip install conan
```

### Windows 10
- Visual Studio 2019
```console
pip install conan
```

### macOS
- Apple Clang (> 11.0.0)
- GCC (> 9.3.0)
```console
pipx install conan
```

## Build
```console
git clone https://github.com/StefanoLusardi/tiny_inference_engine
cd tiny_inference_engine
mkdir -p build && cd build
cmake -G Ninja -D CMAKE_BUILD_TYPE=Release ..
cmake --build . --config Release
cmake --install . --prefix ../install/
```

## Unit Tests
```console
cmake -G Ninja -D CMAKE_BUILD_TYPE=Release -D TIE_BUILD_CLIENT_UNIT_TESTS=ON -D TIE_BUILD_SERVER_UNIT_TESTS=ON ..
cmake --build . --config Release
ctest
```

## Examples
```console
cmake -G Ninja -D CMAKE_BUILD_TYPE=Release -D TIE_BUILD_CLIENT_EXAMPLES=ON ..
cmake --build . --config Release
cmake --install . --prefix ../install/
```