https://github.com/tensorchord/inference-benchmark

Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)

benchmark inference-server llm stable-diffusion whisper

# Inference Benchmark

Maximize the potential of your models with the inference benchmark tool.



# What is it

Inference benchmark provides a standard way to measure the performance of inference workloads, and a tool for evaluating and optimizing those workloads.
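As a rough sketch of what such a benchmark measures (not this repository's actual implementation), the core loop is: fire concurrent requests at the serving endpoint, record per-request latency, and summarize latency percentiles and throughput. Here the model call is simulated with a sleep; in a real run it would be a request to the server under test.

```python
import random
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def fake_infer(payload: str) -> str:
    # Stand-in for a real call to the inference server under test.
    time.sleep(random.uniform(0.005, 0.02))  # simulated 5-20 ms model latency
    return payload.upper()

def run_benchmark(num_requests: int = 100, concurrency: int = 8) -> dict:
    latencies = []

    def one_request(i: int) -> None:
        start = time.perf_counter()
        fake_infer(f"request-{i}")
        latencies.append(time.perf_counter() - start)

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(one_request, range(num_requests)))
    wall = time.perf_counter() - wall_start

    latencies.sort()
    return {
        "p50_ms": statistics.median(latencies) * 1000,
        "p99_ms": latencies[max(0, int(len(latencies) * 0.99) - 1)] * 1000,
        "throughput_rps": num_requests / wall,
    }

if __name__ == "__main__":
    print(run_benchmark())
```

With concurrency greater than one, throughput can exceed the inverse of the mean latency, which is exactly what batching-capable servers exploit.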

# Results

## Bert

We benchmarked [pytriton (triton-inference-server)](https://github.com/triton-inference-server/pytriton) and [mosec](https://github.com/mosecorg/mosec) with BERT. We enabled dynamic batching for both frameworks, with a maximum batch size of 32 and a maximum wait time of 10 ms. Please check out the [results](./benchmark/results/bert.md) for more details.

![DistilBert](./benchmark/results/distilbert_serving_benchmark.png)
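Dynamic batching, as configured above, groups requests that arrive within a short window so the model sees larger batches. A minimal illustration of the policy (max batch size 32, max wait 10 ms) using a plain queue, not the actual API of either framework:

```python
import queue
import time

MAX_BATCH_SIZE = 32   # matches the benchmark configuration above
MAX_WAIT_S = 0.010    # 10 ms maximum wait for the batch to fill

def collect_batch(requests: queue.Queue,
                  max_batch: int = MAX_BATCH_SIZE,
                  max_wait: float = MAX_WAIT_S) -> list:
    """Block for the first request, then gather more until the batch
    is full or the wait deadline passes, whichever comes first."""
    batch = [requests.get()]  # wait indefinitely for the first item
    deadline = time.monotonic() + max_wait
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break
    return batch

if __name__ == "__main__":
    q = queue.Queue()
    for i in range(40):
        q.put(i)
    print(len(collect_batch(q)))  # 32: the batch fills before the deadline
```

The trade-off this policy tunes is latency versus throughput: a longer wait window yields larger batches (better GPU utilization) at the cost of added queueing delay for early arrivals.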

More [results with different models on different serving frameworks](https://github.com/tensorchord/inference-benchmark/issues/7) are coming soon.