Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tensorchord/inference-benchmark
Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)
- Host: GitHub
- URL: https://github.com/tensorchord/inference-benchmark
- Owner: tensorchord
- Created: 2023-06-12T09:39:42.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-06-28T11:28:22.000Z (over 1 year ago)
- Last Synced: 2024-02-25T12:34:19.449Z (10 months ago)
- Topics: benchmark, inference-server, llm, stable-diffusion, whisper
- Language: Python
- Homepage:
- Size: 46.9 KB
- Stars: 23
- Watchers: 6
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.md
README
# Inference Benchmark
Maximize the potential of your models with the inference benchmark tool.
# What is it
Inference Benchmark provides a standard way to measure the performance of inference workloads, and a tool to evaluate and optimize them.
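A typical inference benchmark measures per-request latency and overall throughput against a serving endpoint. The sketch below is illustrative only (not this repository's actual harness): it times repeated calls to a stand-in `infer` callable and reports p50/p95 latency and requests per second.

```python
import time


def benchmark(infer, num_requests=100):
    """Time repeated calls to `infer` and report p50/p95 latency
    (milliseconds) plus overall throughput (requests/second)."""
    latencies = []
    start = time.perf_counter()
    for _ in range(num_requests):
        t0 = time.perf_counter()
        infer()  # in practice: an HTTP request to the serving endpoint
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "p50_ms": latencies[int(0.50 * (num_requests - 1))] * 1000,
        "p95_ms": latencies[int(0.95 * (num_requests - 1))] * 1000,
        "throughput_rps": num_requests / elapsed,
    }


# Example: a stand-in "model" that sleeps ~1 ms per call.
stats = benchmark(lambda: time.sleep(0.001), num_requests=50)
```

In a real run, `infer` would issue requests to the framework under test, and the client would typically send them concurrently to exercise the server's batching.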
# Results
## Bert
We benchmarked [pytriton (triton-inference-server)](https://github.com/triton-inference-server/pytriton) and [mosec](https://github.com/mosecorg/mosec) with BERT. We enabled dynamic batching for both frameworks, with a max batch size of 32 and a max wait time of 10 ms. Please check out the [result](./benchmark/results/bert.md) for more details.
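Dynamic batching means the server groups incoming requests into a batch until either the batch is full or a wait deadline expires, trading a little latency for much higher GPU utilization. As a rough illustration of the policy used above (not either framework's implementation), this sketch groups request arrival timestamps under a max batch size of 32 and a max wait of 10 ms:

```python
def form_batches(arrival_times, max_batch_size=32, max_wait=0.010):
    """Group sorted request arrival timestamps (seconds) into batches.
    A batch is dispatched once it holds max_batch_size requests, or when
    a new request arrives more than max_wait after the batch's first one."""
    batches = []
    current = []
    for t in arrival_times:
        if current and (len(current) >= max_batch_size
                        or t - current[0] > max_wait):
            batches.append(current)
            current = []
        current.append(t)
    if current:
        batches.append(current)
    return batches


# Three requests arrive within 2 ms, a fourth 20 ms later:
# the first three share a batch, the straggler gets its own.
batches = form_batches([0.000, 0.001, 0.002, 0.020])
```

The max wait time bounds how long the first request in a batch can be delayed, which is why both frameworks were configured identically before comparing their results.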
![DistilBert](./benchmark/results/distilbert_serving_benchmark.png)
More [results with different models on different serving frameworks](https://github.com/tensorchord/inference-benchmark/issues/7) are coming soon.