Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tcapelle/bench_llms
A simple benchmark of LLM speed
- Host: GitHub
- URL: https://github.com/tcapelle/bench_llms
- Owner: tcapelle
- Created: 2024-08-21T13:21:49.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-08-27T16:34:08.000Z (3 months ago)
- Last Synced: 2024-08-28T16:30:40.306Z (3 months ago)
- Language: Python
- Size: 595 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# LLM Benchmark Tool
![](./compare.png)

This tool benchmarks the performance of various Large Language Models (LLMs) across different providers.
## Features
- Supports multiple LLM providers (Octo, Groq, Fireworks, Cerebras, Together)
- Measures real-time performance, prompt tokens per second, and completion tokens per second (see the timing sketch after this list)
- Configurable number of runs and custom prompts
- Uses Weave for evaluation and scoring
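The tokens-per-second figures come down to timing a chat-completion call and dividing by the token counts the API reports. The snippet below is a minimal sketch of that idea rather than the repository's actual code: the `measure_speed` helper is hypothetical, the Groq base URL is only illustrative, and the real script likely separates prompt and completion timing more carefully (e.g. via streaming).

```python
import os
import time

import openai

# Illustrative OpenAI-compatible client; any provider endpoint and key can be swapped in.
client = openai.OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=os.environ.get("GROQ_API_KEY"),
)

def measure_speed(model: str, prompt: str) -> dict:
    """Time a single completion and derive rough tokens-per-second figures."""
    start = time.perf_counter()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    elapsed = time.perf_counter() - start
    usage = response.usage
    return {
        "total_time_s": elapsed,
        "prompt_tokens_per_s": usage.prompt_tokens / elapsed,
        "completion_tokens_per_s": usage.completion_tokens / elapsed,
    }
```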
## Setup

1. Clone the repository:
```
git clone https://github.com/tcapelle/bench_llms.git
cd bench_llms
```

2. Install the required dependencies:
```
pip install -r requirements.txt
```

3. Set up environment variables for API keys:
```
export OCTO_API_KEY=your_octo_api_key
export GROQ_API_KEY=your_groq_api_key
export FIREWORKS_API_KEY=your_fireworks_api_key
export CEREBRAS_API_KEY=your_cerebras_api_key
export TOGETHER_API_KEY=your_together_api_key
```

## Usage
Run the benchmark with default settings:
```
python benchmark.py
```

Customize the benchmark:
```
python benchmark.py --provider fireworks --prompt "Explain quantum computing" --n 5
```
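The flags in the example above imply a small command-line interface. Below is a sketch of how such an entry point could be wired with `argparse`; the flag names match the usage examples, but the defaults and the `run_benchmark` stub are assumptions, not the repository's exact implementation.

```python
import argparse

def run_benchmark(provider: str, prompt: str, n_runs: int) -> None:
    # Placeholder body; the real script calls the provider and records timings.
    print(f"Benchmarking {provider!r} with {n_runs} run(s) on prompt: {prompt!r}")

def main() -> None:
    parser = argparse.ArgumentParser(description="Benchmark LLM speed across providers.")
    parser.add_argument("--provider", default="groq", help="Provider key defined in providers.py")
    parser.add_argument("--prompt", default="Explain quantum computing", help="Prompt sent to the model")
    parser.add_argument("--n", type=int, default=5, help="Number of benchmark runs")
    args = parser.parse_args()
    run_benchmark(args.provider, args.prompt, args.n)

if __name__ == "__main__":
    main()
```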
## Project Structure

- `benchmark.py`: Main script for running the benchmark
- `providers.py`: Definitions of LLM providers (a rough sketch of this file's shape follows this list)
- `requirements.txt`: List of Python dependencies
- `README.md`: This file
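For orientation before the next section: each entry in `providers.py` pairs an OpenAI-compatible client with a model name. The sketch below is an assumption about the file's rough shape, inferred from the snippet in the next section; the actual class definition, field names, and model choices may differ.

```python
import os
from dataclasses import dataclass

import openai

@dataclass
class Provider:
    """Pairs an OpenAI-compatible client with the model name to benchmark."""
    client: openai.OpenAI
    model: str

# Example entry; the base URL and model name are illustrative.
PROVIDERS = {
    "groq": Provider(
        client=openai.OpenAI(
            base_url="https://api.groq.com/openai/v1",
            api_key=os.environ.get("GROQ_API_KEY"),
        ),
        model="llama3-70b-8192",
    ),
}
```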
## Adding a New Provider

To add a new LLM provider to the benchmark tool, follow these steps:
1. Open the `providers.py` file.
2. Add a new entry to the `PROVIDERS` dictionary with the following format:
```python
"provider_name": Provider(
client=openai.OpenAI(
base_url="https://api.provider.com/v1",
api_key=os.environ.get("PROVIDER_API_KEY")
),
model="provider-model-name"
)
```

Replace `provider_name`, `https://api.provider.com/v1`, `PROVIDER_API_KEY`, and `provider-model-name` with the appropriate values for the new provider.
3. Add the new provider's API key to your environment variables:
```
export PROVIDER_API_KEY=your_provider_api_key
```

4. Update the README to include the new provider in the list of supported providers and add the new environment variable to the setup instructions.
5. Test the new provider by running the benchmark with the new provider name:
```
python benchmark.py --provider provider_name
```

Remember to ensure that the new provider's API is compatible with the OpenAI client format used in this tool. If it's not, you may need to modify the `call_llama` function in `benchmark.py` to handle any differences in the API structure.
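As a rough illustration of that last point, a provider that is not OpenAI-compatible can be wrapped behind a small adapter that returns the fields the benchmark needs. Everything below is hypothetical: the `/generate` endpoint, the payload, and the response fields stand in for whatever the real provider actually exposes.

```python
import time

import requests

def raw_chat(base_url: str, api_key: str, model: str, prompt: str) -> dict:
    """Hypothetical non-OpenAI provider call; endpoint and payload are made up."""
    resp = requests.post(
        f"{base_url}/generate",
        headers={"Authorization": f"Bearer {api_key}"},
        json={"model": model, "input": prompt},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()

def call_with_timing(base_url: str, api_key: str, model: str, prompt: str) -> dict:
    """Adapt the vendor response into the fields the benchmark expects."""
    start = time.perf_counter()
    data = raw_chat(base_url, api_key, model, prompt)
    elapsed = time.perf_counter() - start
    return {
        "text": data.get("output", ""),
        "prompt_tokens": data.get("usage", {}).get("input_tokens", 0),
        "completion_tokens": data.get("usage", {}).get("output_tokens", 0),
        "latency_s": elapsed,
    }
```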
## Visualizing Results
You can visualize the results with Weights & Biases [Weave](https://wandb.github.io/weave/):
[![](./compare.png)](https://wandb.ai/capecape/benchmark_llama_70b/weave/compare-evaluations?evaluationCallIds=%5B%220191753f-5423-7003-b885-2a50dceb993b%22%2C%220191753e-cab1-75d3-8282-94dc5628d3a9%22%2C%220191753e-283e-7033-b0ac-86e646195f1e%22%2C%220191753d-640c-7053-86bd-72eb54949d07%22%5D)
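For reference, getting runs into Weave in the first place usually amounts to initializing a project and decorating the calls you want traced. The project name and stub function below are placeholders, not the ones behind the linked comparison.

```python
import weave

# Placeholder project name; the linked dashboard lives under a different project.
weave.init("llm-benchmark")

@weave.op()
def benchmark_call(provider_name: str, prompt: str) -> dict:
    # Stub body: the real function would call the provider and time the request.
    return {"provider": provider_name, "prompt_chars": len(prompt)}

# Every invocation is traced and can be inspected or compared in the Weave UI.
benchmark_call("groq", "Explain quantum computing")
```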
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License.