https://github.com/dnth/deployment-frameworks

Last synced: 5 months ago
JSON representation

Host: GitHub
URL: https://github.com/dnth/deployment-frameworks
Owner: dnth
Created: 2024-11-19T04:05:23.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-11-28T06:26:08.000Z (over 1 year ago)
Last Synced: 2025-11-11T11:32:27.588Z (8 months ago)
Language: Python
Size: 24.4 KB
Stars: 3
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# Model Deployment Frameworks
This repo provides a simple example of how to deploy a model using different frameworks.

## Triton Inference Server
Export the model into the `model_repository` directory.

```bash
cd triton_inference_server/image_classification_resnet50
```

```bash
python export_tv_model.py
```

Create a config.pbtxt file in the `model_repository/resnet50/` directory with the following content:

```pbtxt
name: "resnet50"
backend: "pytorch"
max_batch_size: 8
input [
{
name: "input__0"
data_type: TYPE_FP32
dims: [ 3, 224, 224 ]
}
]
output [
{
name: "output__0"
data_type: TYPE_FP32
dims: [ 1000 ]
}
]
instance_group [
{
count: 1
kind: KIND_GPU
}
]
```

Run the Triton Inference Server with the ResNet50 model.

```bash
cd triton_inference_server/image_classification_resnet50
```

```bash
docker run --gpus all --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:23.01-py3 tritonserver --model-repository=/models
```

Query the model.

```bash
cd triton_inference_server/image_classification_resnet50
```
With a single image inference.
```bash
python single_inference.py
```
Or batch inference.
```bash
python batch_inference.py
```

## Ray Serve
Launch the Ray application.

```bash
cd ray_serve
```

```bash
python deploy.py
```

Query the model.

```bash
cd ray_serve
```

```bash
python query.py
```

## Ray + Triton

WIP

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/dnth/deployment-frameworks

Awesome Lists containing this project

README