https://github.com/goceleris/benchmarks

Official reproducible benchmark suite for comparing Go HTTP server throughput and latency. Tests production frameworks against theoretical maximum implementations using raw syscalls.

Host: GitHub
URL: https://github.com/goceleris/benchmarks
Owner: goceleris
License: apache-2.0
Created: 2026-01-02T12:57:35.000Z (6 months ago)
Default Branch: main
Last Pushed: 2026-03-25T17:20:35.000Z (3 months ago)
Last Synced: 2026-03-25T22:59:59.551Z (3 months ago)
Topics: benchmarks, docker, latency, load-testing, performance, throughput
Language: Go
Homepage: https://goceleris.dev/
Size: 25.5 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE


# Celeris Benchmarks

Reproducible HTTP server benchmarks on dedicated bare-metal hardware with 10GbE point-to-point networking. Compares production Go frameworks against theoretical maximum performance using raw Linux syscalls.

## Why This Exists

Most HTTP benchmarks run on shared VMs with noisy neighbors, variable network hops, and throttled I/O — making results unreliable and non-reproducible. This suite runs on dedicated bare-metal machines with direct 10GbE links, automated kernel tuning, and CPU pinning, so every release gets consistent, comparable numbers.

We measure three categories of servers:

- **Baseline**: Production Go frameworks (Gin, Fiber, Echo, Chi, Iris, Hertz, FastHTTP, stdlib)

- **Celeris**: The [Celeris](https://github.com/goceleris/celeris) HTTP engine with io_uring, epoll, and adaptive backends

- **Theoretical**: Raw epoll/io_uring implementations showing the syscall performance ceiling

## Hardware

Three dedicated Minisforum mini PCs connected via 10GbE point-to-point links:

| Machine | Role | CPU | Cores/Threads | RAM | Network |
|---------|------|-----|---------------|-----|---------|
| MS-A2 | Client (self-hosted runner) | AMD Ryzen 9 9955HX (Zen 5) | 16C/32T | 32 GB DDR5 | 10GbE SFP+ |
| MS-A2 | x86 Server | AMD Ryzen 7 7745HX (Zen 4) | 8C/16T | 32 GB DDR5 | 10GbE SFP+ |
| MS-R1 | ARM64 Server | CIX CP8180 | 12C/12T | 64 GB LPDDR5 | Dual 10GbE RJ45 (RTL8127) |

All machines run Debian 13 (Trixie) with kernel 6.12+ for full io_uring support. The client machine is the GitHub Actions self-hosted runner that orchestrates everything via SSH.

## Benchmark Types

### Standard Level (7 types, ~66 min per architecture)

| Type | Endpoint | What It Tests |
|------|----------|---------------|
| `simple` | `GET /` | Plain text: pure framework overhead |
| `json` | `GET /json` | JSON serialization |
| `path` | `GET /users/:id` | Path parameter extraction + routing |
| `body` | `POST /upload` | 2 KB request body read |
| `headers` | `GET /users/:id` | Realistic API headers (~850 bytes: JWT, cookies, tracing) |
| `json-64k` | `GET /json-64k` | 64 KB JSON response: I/O throughput, efficiency metric |
| `churn` | `GET /` | New TCP connection per request: tests `accept()`, `SO_REUSEPORT` |
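To illustrate what separates `churn` from the other types, the sketch below counts how many TCP connections a throwaway local server accepts with and without keep-alives. It is not the suite's load generator: `countNewConns` is a hypothetical helper, and disabling keep-alives on the client is only an assumption about how per-request connections could be forced.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// countNewConns sends n sequential GET requests to a local test server and
// returns how many TCP connections the server accepted while serving them.
// (Hypothetical helper for illustration; not part of the benchmark suite.)
func countNewConns(n int, disableKeepAlives bool) (int64, error) {
	var accepted atomic.Int64

	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "Hello, World!")
	}))
	// The server reports StateNew once for every accepted connection.
	srv.Config.ConnState = func(c net.Conn, s http.ConnState) {
		if s == http.StateNew {
			accepted.Add(1)
		}
	}
	srv.Start()
	defer srv.Close()

	client := &http.Client{Transport: &http.Transport{DisableKeepAlives: disableKeepAlives}}
	for i := 0; i < n; i++ {
		resp, err := client.Get(srv.URL + "/")
		if err != nil {
			return 0, err
		}
		// Drain and close the body so a keep-alive connection can be reused.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return accepted.Load(), nil
}

func main() {
	reused, err := countNewConns(20, false)
	if err != nil {
		panic(err)
	}
	churned, err := countNewConns(20, true)
	if err != nil {
		panic(err)
	}
	fmt.Printf("keep-alive: %d connections, churn: %d connections\n", reused, churned)
}
```

With keep-alives disabled every request pays for a fresh handshake and `accept()`, which is the cost the `churn` type is meant to expose.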

### Full Level (15 types, ~142 min per architecture)

Runs the seven standard types and adds a **concurrency sweep** of eight levels (7 + 8 = 15 types) that scales connections from 1 to 10,000 on the `simple` endpoint:

```
simple@1  simple@10  simple@50  simple@100  simple@500  simple@1000  simple@5000  simple@10000
```

This produces scaling curves that show where goroutine-based frameworks plateau and where event-loop servers keep climbing.
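The `endpoint@connections` naming used by the sweep can be generated mechanically. The following is a minimal sketch: `sweepLevels` and `sweepNames` are hypothetical names chosen for illustration and are not taken from `cmd/bench`.

```go
package main

import "fmt"

// sweepLevels mirrors the connection counts listed in the sweep above.
var sweepLevels = []int{1, 10, 50, 100, 500, 1000, 5000, 10000}

// sweepNames builds one benchmark type name per concurrency level,
// following the endpoint@connections convention.
// (Hypothetical helper; the runner's real spec generation is not shown here.)
func sweepNames(endpoint string, levels []int) []string {
	names := make([]string, 0, len(levels))
	for _, n := range levels {
		names = append(names, fmt.Sprintf("%s@%d", endpoint, n))
	}
	return names
}

func main() {
	for _, name := range sweepNames("simple", sweepLevels) {
		fmt.Println(name)
	}
}
```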

## Servers Tested

### Production Frameworks (Baseline)

| Server | Protocols | Framework |
|--------|-----------|-----------|
| stdhttp | H1, H2C, Hybrid | Go stdlib `net/http` |
| gin | H1, H2C, Hybrid | [Gin](https://github.com/gin-gonic/gin) |
| echo | H1, H2C, Hybrid | [Echo](https://github.com/labstack/echo) |
| chi | H1, H2C, Hybrid | [Chi](https://github.com/go-chi/chi) |
| iris | H1, H2C, Hybrid | [Iris](https://github.com/kataras/iris) |
| hertz | H1, H2C, Hybrid | [Hertz](https://github.com/cloudwego/hertz) |
| fiber | H1 | [Fiber](https://github.com/gofiber/fiber) (fasthttp-based) |
| fasthttp | H1 | [FastHTTP](https://github.com/valyala/fasthttp) |

### Celeris

| Server | Protocols | Engine |
|--------|-----------|--------|
| celeris-iouring | H1, H2C, Hybrid | io_uring (Linux 5.10+) |
| celeris-epoll | H1, H2C, Hybrid | epoll (Linux 2.6+) |
| celeris-adaptive | H1, H2C, Hybrid | Runtime engine selection |

Each engine runs with three resource profiles: `latency`, `throughput`, and `balanced`.

### Theoretical Maximum

| Server | Protocols | Implementation |
|--------|-----------|----------------|
| epoll | H1, H2C, Hybrid | Raw epoll with SO_REUSEPORT, SIMD header parsing, zero-alloc response path |
| iouring | H1, H2C, Hybrid | io_uring with SQPOLL, multishot accept, linked SQEs |

## Dashboard & Results

Results are published to [goceleris/docs](https://github.com/goceleris/docs) as dashboard-format JSON (schema v4.0), keyed by Celeris version:

- `results/latest/{arch}.json` — most recent run

- `results/{version}/{arch}.json` — per-version archive
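As a small illustration of the layout above, the sketch below expands the two path templates for a given version and architecture. `resultPaths` is a hypothetical helper, and the example values `v1.0.0` and `arm64` are placeholders rather than names confirmed by the repository.

```go
package main

import (
	"fmt"
	"path"
)

// resultPaths expands the two templates from the list above:
// results/latest/{arch}.json and results/{version}/{arch}.json.
// (Hypothetical helper for illustration.)
func resultPaths(version, arch string) (latest, archived string) {
	file := arch + ".json"
	return path.Join("results", "latest", file), path.Join("results", version, file)
}

func main() {
	// Placeholder version and architecture strings.
	latest, archived := resultPaths("v1.0.0", "arm64")
	fmt.Println(latest)
	fmt.Println(archived)
}
```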

Dashboard data includes:

- **RPS and latency percentiles** (P50, P75, P90, P99, P999, P9999) per server per benchmark type

- **Concurrency scaling curves** — RPS at each concurrency level (full level only)

- **Efficiency metric** — RPS / Server CPU% per server, normalizing across core counts

- **System metrics** — server CPU, memory RSS, GC pauses (Go servers only)

- **Timeseries** — per-second RPS and P99 latency snapshots

## Running Benchmarks

Benchmarks are designed to run through GitHub Actions workflows. The self-hosted runner on the client machine handles everything: it connects to the servers over SSH, deploys binaries, tunes kernels, runs the benchmarks, and collects results.

### Via GitHub Actions (Primary Method)

- **Release benchmarks**: Triggered automatically on every release, or manually via the `benchmark.yml` workflow dispatch. Releases run at `full` level (includes the concurrency sweep).

- **PR benchmarks**: Add the `benchmark` label to a pull request. Runs at `standard` level.

### Local Development

For local development and testing (not full benchmarks):

```bash
# Build server and bench binaries
mage build

# Run a quick local smoke test (5s per server, localhost)
mage benchmarkQuick
```

## CI/CD

| Workflow | Trigger | Level | Timeout |
|----------|---------|-------|---------|
| `benchmark.yml` | Release (auto) or manual dispatch | `full` on release, configurable on manual | 480 min |
| `benchmark-pr.yml` | PR with `benchmark` label | `standard` | 240 min |

Both workflows SSH to the bare-metal servers, deploy the server binary, run benchmarks, and collect results. Release runs also publish to the docs repository and trigger a site rebuild.

## Project Structure

```
cmd/bench/          Benchmark runner CLI (specs, runner, checkpoint)
cmd/server/         Server binary (all implementations + control daemon)
servers/
  baseline/         Production frameworks (gin, echo, chi, iris, etc.)
  celeris/          Celeris HTTP engine
  theoretical/      Raw epoll/iouring implementations
  common/           Shared types, payload generators, SIMD helpers
internal/
  dashboard/        Dashboard JSON format (schema v4.0)
  metrics/          Prometheus metrics definitions
  version/          Version info
config/
  hosts.json        Machine addresses and hardware metadata
```

## Contributing

### Requirements

- **Go 1.24+**: [Download](https://go.dev/dl/)

- **Mage**: `go install github.com/magefile/mage@latest`

### Development

```bash
mage check    # deps + lint + vet + build
mage test     # run tests
mage fmt      # format code
```

### Adding a Server

1. Create a package under `servers/baseline/` (or `servers/theoretical/`)

2. Implement all benchmark endpoints: `GET /`, `GET /json`, `GET /json-1k`, `GET /json-64k`, `GET /users/:id`, `POST /upload`

3. Register the server type in `cmd/server/main.go`

4. Add to the server list in `cmd/bench/main.go`

### Adding a Benchmark Type

1. Add the endpoint to all server implementations

2. Add a `BenchmarkSpec` entry in `cmd/bench/main.go`

3. Update dashboard format if new fields are needed (`internal/dashboard/format.go`)

## License

Apache 2.0
