Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tensorchord/inference-benchmark
Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)
- Host: GitHub
- URL: https://github.com/tensorchord/inference-benchmark
- Owner: tensorchord
- Created: 2023-06-12T09:39:42.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-06-28T11:28:22.000Z (over 1 year ago)
- Last Synced: 2024-02-25T12:34:19.449Z (10 months ago)
- Topics: benchmark, inference-server, llm, stable-diffusion, whisper
- Language: Python
- Homepage:
- Size: 46.9 KB
- Stars: 23
- Watchers: 6
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.md
README
# Inference Benchmark
Maximize the potential of your models with the inference benchmark tool.
# What is it
Inference Benchmark provides a standard way to measure the performance of inference workloads, and a tool to evaluate and optimize them.
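A typical inference benchmark measures per-request latency and overall throughput against a serving endpoint. The sketch below is illustrative only (not this repository's actual harness): it times repeated calls to a stand-in `infer` callable and reports p50/p95 latency and requests per second.

```python
import time


def benchmark(infer, num_requests=100):
    """Time repeated calls to `infer` and report p50/p95 latency
    (milliseconds) plus overall throughput (requests/second)."""
    latencies = []
    start = time.perf_counter()
    for _ in range(num_requests):
        t0 = time.perf_counter()
        infer()  # in practice: an HTTP request to the serving endpoint
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "p50_ms": latencies[int(0.50 * (num_requests - 1))] * 1000,
        "p95_ms": latencies[int(0.95 * (num_requests - 1))] * 1000,
        "throughput_rps": num_requests / elapsed,
    }


# Example: a stand-in "model" that sleeps ~1 ms per call.
stats = benchmark(lambda: time.sleep(0.001), num_requests=50)
```

In a real run, `infer` would issue requests to the framework under test, and the client would typically send them concurrently to exercise the server's batching.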
# Results
## Bert
We benchmarked [pytriton (triton-inference-server)](https://github.com/triton-inference-server/pytriton) and [mosec](https://github.com/mosecorg/mosec) with BERT. We enabled dynamic batching for both frameworks, with a max batch size of 32 and a max wait time of 10 ms. Please check out the [result](./benchmark/results/bert.md) for more details.
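Dynamic batching means the server groups incoming requests into a batch until either the batch is full or a wait deadline expires, trading a little latency for much higher GPU utilization. As a rough illustration of the policy used above (not either framework's implementation), this sketch groups request arrival timestamps under a max batch size of 32 and a max wait of 10 ms:

```python
def form_batches(arrival_times, max_batch_size=32, max_wait=0.010):
    """Group sorted request arrival timestamps (seconds) into batches.
    A batch is dispatched once it holds max_batch_size requests, or when
    a new request arrives more than max_wait after the batch's first one."""
    batches = []
    current = []
    for t in arrival_times:
        if current and (len(current) >= max_batch_size
                        or t - current[0] > max_wait):
            batches.append(current)
            current = []
        current.append(t)
    if current:
        batches.append(current)
    return batches


# Three requests arrive within 2 ms, a fourth 20 ms later:
# the first three share a batch, the straggler gets its own.
batches = form_batches([0.000, 0.001, 0.002, 0.020])
```

The max wait time bounds how long the first request in a batch can be delayed, which is why both frameworks were configured identically before comparing their results.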
![DistilBert](./benchmark/results/distilbert_serving_benchmark.png)
More [results with different models on different serving frameworks](https://github.com/tensorchord/inference-benchmark/issues/7) are coming soon.