https://github.com/goceleris/benchmarks
Official reproducible benchmark suite for comparing Go HTTP server throughput and latency. Tests production frameworks against theoretical maximum implementations using raw syscalls.
- Host: GitHub
- URL: https://github.com/goceleris/benchmarks
- Owner: goceleris
- License: apache-2.0
- Created: 2026-01-02T12:57:35.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2026-03-25T17:20:35.000Z (2 months ago)
- Last Synced: 2026-03-25T22:59:59.551Z (2 months ago)
- Topics: benchmarks, docker, latency, load-testing, performance, throughput
- Language: Go
- Homepage: https://goceleris.dev/
- Size: 25.5 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
# Celeris Benchmarks
Reproducible HTTP server benchmarks on dedicated bare-metal hardware with 10GbE point-to-point networking. Compares production Go frameworks against theoretical-maximum servers built directly on raw Linux syscalls.
## Why This Exists
Most HTTP benchmarks run on shared VMs with noisy neighbors, variable network hops, and throttled I/O — making results unreliable and non-reproducible. This suite runs on dedicated bare-metal machines with direct 10GbE links, automated kernel tuning, and CPU pinning, so every release gets consistent, comparable numbers.
We measure three categories of servers:
- **Baseline**: Production Go frameworks (Gin, Fiber, Echo, Chi, Iris, Hertz, FastHTTP, stdlib)
- **Celeris**: The [Celeris](https://github.com/goceleris/celeris) HTTP engine with io_uring, epoll, and adaptive backends
- **Theoretical**: Raw epoll/io_uring implementations showing the syscall performance ceiling
## Hardware
Three dedicated Minisforum mini PCs connected via 10GbE point-to-point links:
| Machine | Role | CPU | Cores/Threads | RAM | Network |
|---------|------|-----|---------------|-----|---------|
| MS-A2 | Client (self-hosted runner) | AMD Ryzen 9 9955HX (Zen 5) | 16C/32T | 32 GB DDR5 | 10GbE SFP+ |
| MS-A2 | x86 Server | AMD Ryzen 7 7745HX (Zen 4) | 8C/16T | 32 GB DDR5 | 10GbE SFP+ |
| MS-R1 | ARM64 Server | CIX CP8180 | 12C/12T | 64 GB LPDDR5 | Dual 10GbE RJ45 (RTL8127) |
All machines run Debian 13 (Trixie) with kernel 6.12+ for full io_uring support. The client machine is the GitHub Actions self-hosted runner that orchestrates everything via SSH.
## Benchmark Types
### Standard Level (7 types, ~66 min per architecture)
| Type | Endpoint | What It Tests |
|------|----------|---------------|
| `simple` | `GET /` | Plain text — pure framework overhead |
| `json` | `GET /json` | JSON serialization |
| `path` | `GET /users/:id` | Path parameter extraction + routing |
| `body` | `POST /upload` | 2 KB request body read |
| `headers` | `GET /users/:id` | Realistic API headers (~850 bytes: JWT, cookies, tracing) |
| `json-64k` | `GET /json-64k` | 64 KB JSON response — I/O throughput, efficiency metric |
| `churn` | `GET /` | New TCP connection per request — tests `accept()`, `SO_REUSEPORT` |
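To make the difference between `churn` and the keep-alive types concrete, here is a minimal sketch using only the Go standard library. It is not the suite's load generator (which this README does not show); `countConns` is a hypothetical helper that counts how many TCP connections a throwaway local server accepts for the same number of requests, with and without connection reuse.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// countConns (hypothetical helper) sends n sequential GET requests to a
// throwaway local server and returns how many TCP connections it accepted.
func countConns(n int, disableKeepAlives bool) (int64, error) {
	var accepted atomic.Int64

	srv := httptest.NewUnstartedServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			io.WriteString(w, "Hello, World!")
		}))
	// ConnState reports StateNew once for every accepted connection.
	srv.Config.ConnState = func(_ net.Conn, s http.ConnState) {
		if s == http.StateNew {
			accepted.Add(1)
		}
	}
	srv.Start()
	defer srv.Close()

	client := &http.Client{
		Transport: &http.Transport{DisableKeepAlives: disableKeepAlives},
	}
	for i := 0; i < n; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return 0, err
		}
		// Drain and close the body so a keep-alive connection can be reused.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return accepted.Load(), nil
}

func main() {
	for _, churn := range []bool{false, true} {
		conns, err := countConns(5, churn)
		if err != nil {
			panic(err)
		}
		fmt.Printf("churn=%v: 5 requests used %d connection(s)\n", churn, conns)
	}
}
```

With keep-alives disabled, every request pays for a fresh TCP handshake and an `accept()` on the server, which is the cost the `churn` type is meant to isolate.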
### Full Level (15 types, ~142 min per architecture)
Adds a **concurrency sweep** that scales connections from 1 to 10,000 on the `simple` endpoint:
```
simple@1 simple@10 simple@50 simple@100 simple@500 simple@1000 simple@5000 simple@10000
```
This produces scaling curves that show where goroutine-based frameworks plateau and where event-loop servers keep climbing.
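The type counts follow directly from the lists above: 7 standard types plus 8 sweep points gives 15. A small sketch of that arithmetic, assuming the `type@connections` naming shown in the sweep (the helper `sweepNames` and the variable `sweepLevels` are illustrative names, not identifiers from the repository):

```go
package main

import "fmt"

// sweepLevels are the connection counts listed in the sweep above.
var sweepLevels = []int{1, 10, 50, 100, 500, 1000, 5000, 10000}

// sweepNames (hypothetical helper) builds one "<type>@<connections>" label
// per concurrency level.
func sweepNames(benchType string, levels []int) []string {
	names := make([]string, 0, len(levels))
	for _, c := range levels {
		names = append(names, fmt.Sprintf("%s@%d", benchType, c))
	}
	return names
}

func main() {
	standard := []string{"simple", "json", "path", "body", "headers", "json-64k", "churn"}
	full := append(standard, sweepNames("simple", sweepLevels)...)
	fmt.Println(len(full), full)
}
```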
## Servers Tested
### Production Frameworks (Baseline)
| Server | Protocols | Framework |
|--------|-----------|-----------|
| stdhttp | H1, H2C, Hybrid | Go stdlib `net/http` |
| gin | H1, H2C, Hybrid | [Gin](https://github.com/gin-gonic/gin) |
| echo | H1, H2C, Hybrid | [Echo](https://github.com/labstack/echo) |
| chi | H1, H2C, Hybrid | [Chi](https://github.com/go-chi/chi) |
| iris | H1, H2C, Hybrid | [Iris](https://github.com/kataras/iris) |
| hertz | H1, H2C, Hybrid | [Hertz](https://github.com/cloudwego/hertz) |
| fiber | H1 | [Fiber](https://github.com/gofiber/fiber) (fasthttp-based) |
| fasthttp | H1 | [FastHTTP](https://github.com/valyala/fasthttp) |
### Celeris
| Server | Protocols | Engine |
|--------|-----------|--------|
| celeris-iouring | H1, H2C, Hybrid | io_uring (Linux 5.10+) |
| celeris-epoll | H1, H2C, Hybrid | epoll (Linux 2.6+) |
| celeris-adaptive | H1, H2C, Hybrid | Runtime engine selection |
Each engine runs with three resource profiles: `latency`, `throughput`, and `balanced`.
### Theoretical Maximum
| Server | Protocols | Implementation |
|--------|-----------|----------------|
| epoll | H1, H2C, Hybrid | Raw epoll with SO_REUSEPORT, SIMD header parsing, zero-alloc response path |
| iouring | H1, H2C, Hybrid | io_uring with SQPOLL, multishot accept, linked SQEs |
## Dashboard & Results
Results are published to [goceleris/docs](https://github.com/goceleris/docs) as dashboard-format JSON (schema v4.0), keyed by Celeris version:
- `results/latest/{arch}.json` — most recent run
- `results/{version}/{arch}.json` — per-version archive
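As a rough illustration of this layout, the sketch below expands the two path templates for a given version and architecture. The function name `resultPaths` and the example values `v1.2.3` and `arm64` are assumptions made for illustration; the README does not state which version or architecture strings the suite actually uses.

```go
package main

import (
	"fmt"
	"path"
)

// resultPaths (hypothetical helper) expands the two templates above:
// results/latest/{arch}.json and results/{version}/{arch}.json.
func resultPaths(version, arch string) (latest, archived string) {
	file := arch + ".json"
	return path.Join("results", "latest", file), path.Join("results", version, file)
}

func main() {
	// "v1.2.3" and "arm64" are placeholder values.
	latest, archived := resultPaths("v1.2.3", "arm64")
	fmt.Println(latest)
	fmt.Println(archived)
}
```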
Dashboard data includes:
- **RPS and latency percentiles** (P50, P75, P90, P99, P999, P9999) per server per benchmark type
- **Concurrency scaling curves** — RPS at each concurrency level (full level only)
- **Efficiency metric** — RPS / Server CPU% per server, normalizing across core counts
- **System metrics** — server CPU, memory RSS, GC pauses (Go servers only)
- **Timeseries** — per-second RPS and P99 latency snapshots
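The efficiency metric and the latency percentiles can be sketched as follows. This is an illustration under stated assumptions, not the suite's implementation: it assumes CPU% is reported so that 100 means one fully busy core (so 400 is four busy cores), and it uses the nearest-rank method for percentiles, which may differ from what the benchmark runner does. All function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// efficiency returns requests per second per percentage point of server CPU.
// Assumption: cpuPercent uses 100 for one fully busy core.
func efficiency(rps, cpuPercent float64) float64 {
	if cpuPercent <= 0 {
		return 0
	}
	return rps / cpuPercent
}

// percentile returns the nearest-rank p-th percentile (0 < p <= 100).
func percentile(samples []time.Duration, p float64) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// millis returns synthetic latencies of 1ms, 2ms, ..., n ms for the demo.
func millis(n int) []time.Duration {
	out := make([]time.Duration, n)
	for i := range out {
		out[i] = time.Duration(i+1) * time.Millisecond
	}
	return out
}

func main() {
	lat := millis(100)
	fmt.Println("efficiency:", efficiency(200000, 400))
	fmt.Println("P50:", percentile(lat, 50), "P99:", percentile(lat, 99))
}
```

Dividing by CPU% rather than by core count lets a server that idles most of its cores score well even on a machine with many cores, which is how the metric stays comparable across the 8-core and 12-core servers.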
## Running Benchmarks
Benchmarks are designed to run through GitHub Actions workflows. The self-hosted runner on the client machine handles everything: SSH into servers, deploy binaries, tune kernels, run benchmarks, collect results.
### Via GitHub Actions (Primary Method)
- **Release benchmarks**: Trigger automatically on every release, or manually via the `benchmark.yml` workflow dispatch. Releases run at `full` level (includes concurrency sweep).
- **PR benchmarks**: Add the `benchmark` label to a pull request. Runs at `standard` level.
### Local Development
For local development and testing (not full benchmarks):
```bash
# Build server and bench binaries
mage build
# Run a quick local smoke test (5s per server, localhost)
mage benchmarkQuick
```
## CI/CD
| Workflow | Trigger | Level | Timeout |
|----------|---------|-------|---------|
| `benchmark.yml` | Release (auto) or manual dispatch | `full` on release, configurable on manual | 480 min |
| `benchmark-pr.yml` | PR with `benchmark` label | `standard` | 240 min |
Both workflows SSH to the bare-metal servers, deploy the server binary, run benchmarks, and collect results. Release runs also publish to the docs repository and trigger a site rebuild.
## Project Structure
```
cmd/bench/          Benchmark runner CLI (specs, runner, checkpoint)
cmd/server/         Server binary (all implementations + control daemon)
servers/
  baseline/         Production frameworks (gin, echo, chi, iris, etc.)
  celeris/          Celeris HTTP engine
  theoretical/      Raw epoll/iouring implementations
  common/           Shared types, payload generators, SIMD helpers
internal/
  dashboard/        Dashboard JSON format (schema v4.0)
  metrics/          Prometheus metrics definitions
  version/          Version info
config/
  hosts.json        Machine addresses and hardware metadata
```
## Contributing
### Requirements
- **Go 1.24+**: [Download](https://go.dev/dl/)
- **Mage**: `go install github.com/magefile/mage@latest`
### Development
```bash
mage check # deps + lint + vet + build
mage test # run tests
mage fmt # format code
```
### Adding a Server
1. Create a package under `servers/baseline/` (or `servers/theoretical/`)
2. Implement all benchmark endpoints: `GET /`, `GET /json`, `GET /json-1k`, `GET /json-64k`, `GET /users/:id`, `POST /upload`
3. Register the server type in `cmd/server/main.go`
4. Add to the server list in `cmd/bench/main.go`
### Adding a Benchmark Type
1. Add the endpoint to all server implementations
2. Add a `BenchmarkSpec` entry in `cmd/bench/main.go`
3. Update dashboard format if new fields are needed (`internal/dashboard/format.go`)
## License
Apache 2.0