https://github.com/orion-zhen/llm-throughput-eval

evaluate llm's generation speed via API
https://github.com/orion-zhen/llm-throughput-eval

llm llm-evaluation throughput throughput-performance

Last synced: 4 months ago
JSON representation

evaluate llm's generation speed via API

Host: GitHub
URL: https://github.com/orion-zhen/llm-throughput-eval
Owner: Orion-zhen
License: gpl-3.0
Created: 2024-10-09T07:04:57.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-05-10T15:05:07.000Z (about 1 year ago)
Last Synced: 2025-06-22T09:38:46.790Z (about 1 year ago)
Topics: llm, llm-evaluation, throughput, throughput-performance
Language: Python
Homepage:
Size: 35.2 KB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

# llm-throughput-eval

Evaluate llm's generation speed via API

## Quick Start

### Get the repository

Clone:

```shell
git clone https://github.com/Orion-zhen/llm-throughput-eval.git
```

Install dependencies:

```shell
pip install -r requirements.txt
```

### Go

```shell
python evel.py -m -u -n -c -t
```

Example:

```shell
python eval.py -m "qwq:32b"
```

This will send 16 requests with 4 concurrency to local ollama API (should be ) with model qwq:32b

Full arguments:

```shell
usage: eval.py [-h] [--concurrency CONCURRENCY] [--requests REQUESTS] [--url URL] [--model MODEL]

Async HTTP Benchmark Tool

options:
-h, --help show this help message and exit
--concurrency, -c CONCURRENCY
Maximum concurrent requests.
--requests, -n REQUESTS
Total number of requests to send.
--url, -u URL Base URL for the API (e.g., http://localhost:8000). '/v1/chat/completions' will be appended.
--model, -m MODEL Model name to use in the request payload.
--token, -t TOKEN Bearer token for API authentication.
```

## Credits

The code is inspired by [this article](https://blog.csdn.net/arkohut/article/details/139076652)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/orion-zhen/llm-throughput-eval

Awesome Lists containing this project

README