https://github.com/0xk1h0/onnx_grpc
Run ONNX model with gRPC & docker
- Host: GitHub
- URL: https://github.com/0xk1h0/onnx_grpc
- Owner: 0xk1h0
- License: MIT
- Created: 2023-06-11T05:26:00.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-12-30T22:02:51.000Z (over 1 year ago)
- Last Synced: 2025-01-23T06:26:26.722Z (5 months ago)
- Topics: docker, grpc, grpc-python, onnx
- Language: Jupyter Notebook
- Homepage:
- Size: 104 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project:
- Distributed System Topics

README
### ECE5654_41

### Serving Machine Learning Model using ONNX with gRPC!
---
[Demo video](https://youtu.be/naYlCwZX6Mg)

---
- Environment: MacBook Pro / Kernel Version 22.2.0 T6000 arm64 / Python 3.11.2

### 1. About ONNX
ONNX stands for Open Neural Network Exchange, an open ecosystem that lets AI developers choose the right tools as their projects evolve. ONNX provides an open-source format for AI models, covering both deep learning and traditional ML. Most deep learning libraries can export models through ONNX and convert them to TensorFlow in order to use TensorFlow Serving, but what about traditional machine learning such as tree-based algorithms? A tree-based model from XGBoost or scikit-learn can be converted to ONNX, but it still cannot be converted into a deep learning library's format (maybe in the future).

### 2. Requirements
- numpy
- scikit-learn
- xgboost
- onnxmltools
- onnx
- grpcio
- onnxruntime

Based on those requirements, we need ONNX Runtime to run ONNX inference. ONNX is compatible with several other runtimes, such as GraphPipe from Oracle (which uses FlatBuffers) or NVIDIA's TensorRT, and many more, but what I will discuss here is ONNX Runtime, which comes from the ONNX project itself.
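As a quick sanity check, ONNX Runtime can run the exported model locally before we bring up the server. A minimal sketch, assuming the model file `model.onnx` produced by the training step shown below:

```python
import numpy as np
import onnxruntime as ort

# Load the exported model (file name assumed) and score one dummy
# flattened 8x8 digits sample.
sess = ort.InferenceSession("model.onnx")
input_name = sess.get_inputs()[0].name
outputs = sess.run(None, {input_name: np.zeros((1, 64), dtype=np.float32)})
print(outputs)  # typically [predicted labels, per-class scores]
```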
Real-world machine learning systems need more than a single inference; we need low latency for online or mini-batch inference. In this tutorial I use the digits dataset and XGBoost to build the model.
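A minimal sketch of what `train.ipynb` might do, assuming the digits dataset from scikit-learn; the input name `"input"` and the output file name are assumptions, not confirmed by the repo:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
from onnxmltools import convert_xgboost
from onnxmltools.convert.common.data_types import FloatTensorType

# Train a small XGBoost classifier on the 8x8 digits dataset.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = XGBClassifier(n_estimators=50)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))

# Convert to ONNX; the 64-feature input shape matches the digits data,
# and "input" is an assumed input name.
onnx_model = convert_xgboost(
    model, initial_types=[("input", FloatTensorType([None, 64]))]
)
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
```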
The first step in serving the model is to start the Docker container for the ONNX Runtime Server: port 9001 is used for HTTP and 50051 for gRPC, while model_path is the model's path inside the container.
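A sketch of what `run_ort_server.sh` might contain, assuming the ONNX Runtime Server image (`mcr.microsoft.com/onnxruntime/server`, since archived); the mount path and model file name are assumptions:

```sh
#!/bin/sh
# Serve model.onnx from the current directory.
# Host port 9001 -> container HTTP port 8001; 50051 -> gRPC.
docker run -it --rm \
  -v "$(pwd)":/models \
  -p 9001:8001 -p 50051:50051 \
  mcr.microsoft.com/onnxruntime/server \
  --model_path /models/model.onnx
```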
gRPC is available on localhost at port 50051. To call service methods, we first need to create a stub: we instantiate the PredictionServiceStub class from the prediction_service_pb2_grpc module generated from prediction_service.proto.
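A minimal sketch of creating the stub, assuming `prediction_service_pb2_grpc` has already been generated from the server's `prediction_service.proto`:

```python
import grpc
import prediction_service_pb2_grpc  # generated from prediction_service.proto

# Open an insecure channel to the local gRPC endpoint and build the stub.
channel = grpc.insecure_channel("localhost:50051")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
```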
For RPC methods that return a single response ("response-unary" methods), gRPC Python supports both synchronous (blocking) and asynchronous (non-blocking) control flow semantics. A synchronous call to the simple RPC Predict is nearly as straightforward as calling a local method: the RPC call waits for the server to respond, and will either return a response or raise an exception.
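A sketch of a blocking Predict call, assuming `predict_pb2` and `onnx_ml_pb2` were generated from the server's protos; the input name `"input"` and output name `"output_label"` depend on the exported model and are assumptions:

```python
import numpy as np
import grpc
import onnx_ml_pb2   # generated from onnx-ml.proto (TensorProto)
import predict_pb2   # generated from predict.proto (PredictRequest/Response)

# Pack one flattened 8x8 digits sample as a float TensorProto.
tensor = onnx_ml_pb2.TensorProto()
tensor.dims.extend([1, 64])
tensor.data_type = 1  # 1 = FLOAT
tensor.raw_data = np.zeros((1, 64), dtype=np.float32).tobytes()

request = predict_pb2.PredictRequest()
request.inputs["input"].CopyFrom(tensor)          # input name assumed
request.output_filter.append("output_label")      # output name assumed

try:
    # Blocking call: waits for the server, returns a response or raises.
    response = stub.Predict(request, timeout=10.0)
    print(response.outputs)
except grpc.RpcError as err:
    print("Predict failed:", err.code(), err.details())
```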
### How to use
0. `git clone` this repository.
1. Run `train.ipynb` to generate the ONNX-format model file.
2. Run `run_ort_server.sh`
   - Make sure you already have Docker installed.
3. Run `grpc_client_example.py`
   - It returns the inference results (label & score).