https://github.com/bykologlu/ollama-benchmark-cli

# 🎢 Ollama Benchmark CLI

A command-line tool for evaluating and comparing local LLMs using custom prompts and source code inputs. Ideal for developers testing model quality and performance in real-world scenarios.

## 🖼️ Screenshots

![Benchmark Screenshot](assets/screenshot-1.png)

![Benchmark Screenshot](assets/screenshot-2.png)

## 🚀 Features

- Multi-model benchmarking (Mistral, DeepSeek, Phi, etc.)

- Detailed results with export support

- Supports CSV, JSON, and TXT output, plus a multi-model comparison format

- Multi-language interface (English & Turkish)

- Docker-ready for easy deployment

## 🐳 Run with Docker

```bash
# Build the Docker image
docker build -t ollama-benchmark .
```

```bash
# Run the CLI tool
docker run -it --rm \
  -v $PWD/output:/root/output \
  -v $PWD/prompts.txt:/root/prompts.txt \
  ollama-benchmark
```
> Benchmark results will be saved inside the `output/` directory.

## 💻 Run without Docker
You can also run the CLI tool directly on your machine.
### On Windows:
```bash
go build -o ollama-benchmark.exe main.go
ollama-benchmark.exe
```
### On Linux:
```bash
go build -o ollama-benchmark main.go
./ollama-benchmark
```
Make sure the **prompts.txt** and **internal/i18n/lang.json** files are present in their expected locations.
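
For reference, the prompt file is plain text read line by line (see the CLI Logic Summary below), so a hypothetical `prompts.txt` might look like:

```text
Summarize this function in one sentence.
Find the bug in the following snippet.
Rewrite this loop in idiomatic Go.
```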

## 🚀 Parameters

You can pass options like this:
```bash
ollama-benchmark --model mistral --input code.py --lang en
```
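
Below is a minimal sketch of how such options might be wired up with Go's standard `flag` package (which accepts both `-model` and `--model`). The flag names mirror the example above, but the actual names and defaults in `main.go` may differ:

```go
package main

import (
	"flag"
	"fmt"
)

func main() {
	// Hypothetical flags; check main.go for the real ones.
	model := flag.String("model", "mistral", "model to benchmark")
	input := flag.String("input", "", "optional source file to feed into the prompts")
	lang := flag.String("lang", "en", "interface language (en or tr)")
	flag.Parse()

	fmt.Printf("model=%s input=%s lang=%s\n", *model, *input, *lang)
}
```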

## 📁 Project Structure

- `internal/` → Source code

- `prompts.txt` → Prompt input list

- `output/` → Benchmark output directory

- `main.go` → CLI entry point

## 🧑‍💻 Development

- Requires Go 1.21+
- Fully containerized with Docker
- Designed for extensibility and easy testing

## 🤝 Contributing

Feel free to open issues or pull requests for feature requests or improvements.

## 📜 License

MIT License

## 🧑‍💻 How to Use

When you run the CLI, you will be prompted to select a language and then choose between:
1) Quick Start
2) Custom Settings

### ⚡ 1. Quick Start Mode
The fastest way to run benchmarks using default values.
- Uses the default Ollama API URL: `http://localhost:11434`
- Loads prompts from `prompts.txt`
- Runs each model once (1 trial)
- Outputs results in `.txt` format
- Shows results in a table and logs performance

**How to run (after Docker build):**
```bash
docker run -it --rm \
  -v $PWD/output:/root/output \
  -v $PWD/prompts.txt:/root/prompts.txt \
  ollama-benchmark
```

### ⚙️ 2. Custom Settings Mode
Allows fine-tuning the benchmark process. You'll be prompted for:
- Ollama API URL
- Prompt file path (default: `prompts.txt`)
- Number of trials (default: 3)
- Output format (`csv`, `json`, or `txt`)
- Tokens-only view? (`y/n`)
- Model selection (single or all)

Each model will be tested multiple times and results will be saved accordingly.
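
To make the per-trial work concrete, here is a minimal sketch that sends one prompt to Ollama's `/api/generate` endpoint with `stream: false` and derives tokens/second from the `eval_count` and `eval_duration` fields in the response. The `runTrial` helper is hypothetical; the repo's **internal/benchmark** package may measure things differently:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// genResponse captures the timing fields Ollama returns for a non-streamed generation.
type genResponse struct {
	EvalCount    int   `json:"eval_count"`    // tokens generated
	EvalDuration int64 `json:"eval_duration"` // time spent generating, in nanoseconds
}

// runTrial runs a single prompt against one model and returns tokens/second.
func runTrial(apiURL, model, prompt string) (float64, error) {
	body, _ := json.Marshal(map[string]any{
		"model":  model,
		"prompt": prompt,
		"stream": false,
	})
	resp, err := http.Post(apiURL+"/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	var r genResponse
	if err := json.NewDecoder(resp.Body).Decode(&r); err != nil {
		return 0, err
	}
	return float64(r.EvalCount) / (float64(r.EvalDuration) / 1e9), nil
}

func main() {
	tps, err := runTrial("http://localhost:11434", "mistral", "Explain goroutines in one sentence.")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%.1f tokens/s\n", tps)
}
```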

### 📊 Output Files
- `benchmark_summary_result_` : summary of all benchmark results

- `benchmark_detail__` : per-model detailed results

- `benchmark_summary_comparison_` : multi-model comparison (written when more than one model is benchmarked)

- `performance.log` : internal performance log (appended on each run)

All outputs are saved inside the `output/` folder.

### 📌 CLI Logic Summary
- Language support: English & Turkish
- Interactive terminal input
- Benchmark logic: via **internal/benchmark**
- Model list fetched from the local Ollama API (see the sketch below)
- Prompt source: plain text (**prompts.txt**, line by line)
- Modular output system (**csv, json, txt**, comparison)
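
As a sketch of the model-list step: Ollama exposes installed models via `GET /api/tags`, which the tool can query instead of hard-coding names. This is illustrative, not the repo's actual code:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// tagsResponse mirrors the shape of Ollama's GET /api/tags reply.
type tagsResponse struct {
	Models []struct {
		Name string `json:"name"`
	} `json:"models"`
}

func main() {
	resp, err := http.Get("http://localhost:11434/api/tags")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var tags tagsResponse
	if err := json.NewDecoder(resp.Body).Decode(&tags); err != nil {
		panic(err)
	}
	for _, m := range tags.Models {
		fmt.Println(m.Name) // e.g. "mistral:latest"
	}
}
```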