https://github.com/nethermindeth/gas-benchmarks

Gas benchmark research repository

# Gas Benchmarks

This repository contains scripts to run benchmarks across multiple clients.
Follow the instructions below to run the benchmarks locally.

## Prerequisites

Make sure you have the following installed on your system:

- Python 3.10
- Docker
- Docker Compose
- .NET 8.0.x
- `make` (for running make commands)
- `git lfs`, `zip` (additional tools)

## Setup

1. **Clone the repository:**

```sh
git clone https://github.com/nethermindeth/gas-benchmarks.git
cd gas-benchmarks
```

2. **Install Python dependencies:**

```sh
pip install -r requirements.txt
```

2.1. **Install additional tools:**

```sh
git lfs install
git lfs pull
sudo apt install zip
```

3. **Prepare Kute dependencies (specific to Nethermind):**

```sh
make prepare_tools
```

4. **Create a results directory:**

```sh
mkdir -p results
```

## Running the Benchmarks

### Script: Run all

To run the whole pipeline, use the `run.sh` script.

```sh
bash run.sh -t "testsPath" -w "warmupFilePath" -c "client1,client2" -r runNumber -i "image1,image2"
```

Example run:
```shell
bash run.sh -t "eest_tests/" -w "warmup-tests" -c "nethermind,geth,reth" -r 8
```

Flags:
- `-t`: path to the directory where the tests are located.
- `-w`: path to the warmup file.
- `-c`: clients to benchmark, separated by commas.
- `-r`: number of benchmark iterations (a numeric value).
- `-i`: images to use, separated by commas and listed in the same order as the clients. Use `default` for any client whose image you do not want to override.
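
The flags above can also be assembled programmatically, which helps when the client list comes from another tool. The following is a minimal sketch, not part of the repository: `build_run_command` is a hypothetical helper, and it assumes that passing `default` once per client is equivalent to omitting `-i`.

```python
import shlex


def build_run_command(tests_path, warmup_path, clients, runs, images=None):
    """Assemble the argument list for run.sh from the documented flags."""
    if runs < 1:
        raise ValueError("runs must be a positive integer")
    # Assumption: one "default" entry per client stands in for "no override".
    images = images or ["default"] * len(clients)
    if len(images) != len(clients):
        raise ValueError("images must match clients one-to-one")
    return [
        "bash", "run.sh",
        "-t", tests_path,
        "-w", warmup_path,
        "-c", ",".join(clients),
        "-r", str(runs),
        "-i", ",".join(images),
    ]


cmd = build_run_command("eest_tests/", "warmup-tests", ["nethermind", "geth", "reth"], 8)
# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))
```

Returning a list rather than a single string keeps the command safe to hand to `subprocess.run` without shell quoting issues.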

Now you're ready to run the benchmarks locally!

## Populating the PostgreSQL Database with Benchmark Data

After running benchmarks and generating report files, you can populate a PostgreSQL database with the results for further analysis. This process involves two main scripts: `generate_postgres_schema.py` to set up the database table, and `fill_postgres_db.py` to load the data.

### 1. Setting up the Database Schema

The `generate_postgres_schema.py` script creates the necessary table in your PostgreSQL database to store the benchmark data.

**Usage:**

```sh
python generate_postgres_schema.py \
--db-host <host> \
--db-port <port> \
--db-user <user> \
--db-name <database> \
--table-name <table> \
--log-level <level>
```

- You will be prompted to enter the password for the specified database user.
- `--table-name`: Defaults to `benchmark_data`.
- `--log-level`: Defaults to `INFO`.

**Example:**

```sh
python generate_postgres_schema.py \
--db-host localhost \
--db-port 5432 \
--db-user myuser \
--db-name benchmarks \
--table-name gas_benchmark_results
```

This will create a table named `gas_benchmark_results` (if it doesn't already exist) in the `benchmarks` database.

### 2. Populating the Database with Benchmark Data

Once the schema is set up, use `fill_postgres_db.py` to parse the benchmark report files (generated by `run.sh` or other means) and insert the data into the PostgreSQL table.

**Usage:**

```sh
python fill_postgres_db.py \
--reports-dir <reports-dir> \
--db-host <host> \
--db-port <port> \
--db-user <user> \
--db-password <password> \
--db-name <database> \
--table-name <table> \
--log-level <level>
```

- `--reports-dir`: Path to the directory containing the benchmark output files (e.g., `output_*.csv`, `raw_results_*.csv`, and `index.html` or `computer_specs.txt`).
- `--db-password`: The password for the database user.
- `--table-name`: Should match the table name used with `generate_postgres_schema.py`. Defaults to `benchmark_data`.
- `--log-level`: Defaults to `INFO`.

**Example:**

```sh
python fill_postgres_db.py \
--reports-dir ./results/my_benchmark_run_01 \
--db-host localhost \
--db-port 5432 \
--db-user myuser \
--db-password "securepassword123" \
--db-name benchmarks \
--table-name gas_benchmark_results
```

This script will scan the specified reports directory, parse the client benchmark data and computer specifications, and insert individual run records into the `gas_benchmark_results` table.
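
Before loading data, it can be useful to check that a reports directory actually contains the expected files. The sketch below is illustrative only and does not reproduce the logic of `fill_postgres_db.py`: `find_report_files` is a hypothetical helper that groups files by the name patterns listed for `--reports-dir`, and the file names in the demo are made up.

```python
import tempfile
from pathlib import Path


def find_report_files(reports_dir):
    """Group the files in a reports directory by the documented name patterns."""
    root = Path(reports_dir)
    # Computer specs may come from either of these two files.
    specs = [p for p in (root / "index.html", root / "computer_specs.txt") if p.is_file()]
    return {
        "output": sorted(root.glob("output_*.csv")),
        "raw_results": sorted(root.glob("raw_results_*.csv")),
        "specs": specs,
    }


# Demo against a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("output_geth.csv", "raw_results_geth.csv", "computer_specs.txt"):
        (Path(tmp) / name).touch()
    found = find_report_files(tmp)
    print({kind: [p.name for p in paths] for kind, paths in found.items()})
```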

## Continuous Metrics Posting

The `run_and_post_metrics.sh` script runs the benchmarks and posts metrics continuously in an infinite loop.
On each iteration it updates the local repository with `git pull`, runs the benchmark tests, populates the PostgreSQL database, and cleans up the reports directory.

**Usage:**

```sh
./run_and_post_metrics.sh --table-name gas_limit_benchmarks --db-user nethermind --db-host perfnet.core.nethermind.dev --db-password "MyPass" [--warmup warmup/warmup-1000bl-16wi-24tx.txt]
```

**Parameters:**

- `--table-name`: The database table name where benchmark data will be inserted.
- `--db-user`: The database user.
- `--db-host`: The database host.
- `--db-password`: The database password.
- `--warmup`: (Optional) The warmup file to use. Defaults to `warmup/warmup-1000bl-16wi-24tx.txt`.

**Examples:**

Using the default warmup file:
```sh
./run_and_post_metrics.sh --table-name gas_limit_benchmarks --db-user nethermind --db-host perfnet.core.nethermind.dev --db-password "MyPass"
```

Using a custom warmup file and running in the background:
```sh
nohup ./run_and_post_metrics.sh --table-name gas_limit_benchmarks --db-user nethermind --db-host perfnet.core.nethermind.dev --db-password "MyPass" --warmup "warmup/custom_warmup.txt" &
```

Preventing the creation of `nohup.out` (to save disk space):
```sh
nohup ./run_and_post_metrics.sh --table-name gas_limit_benchmarks --db-user nethermind --db-host perfnet.core.nethermind.dev --db-password "MyPass" --warmup "warmup/custom_warmup.txt" > /dev/null 2>&1 &
```

## Integration with EELS

### Run the tests generated by the EELS framework

By default, gas-benchmarks replays Execution Layer Spec (EELS) payloads with the Kute runner.

You can use the [EELS repository](https://github.com/ethereum/execution-specs/tree/main/tests/benchmark) to produce new benchmark definitions and run them with the following steps:

1. Place a genesis file inside `scripts/genesisfiles/YOUR_CLIENT`, for example:
- `scripts/genesisfiles/geth/chainspec.json`
- `scripts/genesisfiles/geth/zkevmgenesis.json`
2. Capture EELS tests (writes payload files into `eest_tests/` by default):

```sh
python3 capture_eest_tests.py -o eest_tests -x 1M,worst_bytecode
```

- The capture script downloads the latest `benchmark@v*` release from the `execution-specs` repository. Ensure your new tests are included in a published benchmark release (or pass `--release-tag benchmark@vX.Y.Z`) before attempting to capture them.

3. Generate warmup payloads that match the captured tests:

```sh
python3 make_warmup_tests.py -s eest_tests/ -d warmup-tests -g scripts/genesisfiles/geth/zkevmgenesis.json
```

4. Execute the tests: run `run.sh` as described in "Running the Benchmarks", using `eest_tests/` as the test path and the warmup file generated in step 3:

```sh
bash run.sh -t "eest_tests/" -w "warmup-tests" -c "client1,client2" -r runNumber -i "image1,image2"
```

All documentation now references `eest_tests/` as the canonical dataset; legacy `tests/` or `tests-vm/` folders are no longer shipped.

To author new EELS benchmarks, follow the guidance in the `execution-specs` repository under `tests/benchmark/`.

### Execute the tests directly from the EELS repository

1. Clone the EELS repository:

```sh
git clone https://github.com/ethereum/execution-specs.git
cd execution-specs
```

2. Create a virtual environment and install the dependencies:

```sh
uv sync --all-extras
```

3. Use [execute remote](https://github.com/ethereum/execution-specs/blob/main/docs/running_tests/execute/remote.md) to run the tests:

```sh
uv run execute remote -v --fork=Prague --rpc-seed-key=ACCOUNT --rpc-chain-id=1 --rpc-endpoint=http://127.0.0.1:8545 tests -- -m benchmark -n 1
```

Note: you will need an account with some ETH (native tokens) to run the tests. Specify the private key using `--rpc-seed-key`. Also, `--rpc-endpoint` should point to the node you want to test against (it may be a remote node).

### EELS stateful tests generator

The EELS stateful generator creates deterministic, reproducible execution payloads by running Execution Layer Spec tests against a local Nethermind node through a proxy. It consists of two cooperating tools:

- **`eest_stateful_generator.py`** - Main orchestrator that boots a Nethermind node from snapshot, runs EELS through a proxy, and captures payloads
- **`mitm_addon.py`** - mitmproxy addon that intercepts transactions and produces engine payloads grouped by test metadata

Basic usage:
```sh
python eest_stateful_generator.py \
--chain mainnet \
--fork Prague \
--rpc-seed-key YOUR_PRIVATE_KEY \
--rpc-address YOUR_ADDRESS \
--test-path tests
```

The tool writes `mitm_config.json` automatically (edit it if you need to tweak defaults). A generated example looks like:
```json
{
  "rpc_direct": "http://127.0.0.1:8545",
  "engine_url": "http://127.0.0.1:8551",
  "jwt_hex_path": "engine-jwt/jwt.hex",
  "finalized_block": "0x..."
}
```
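
If you do need to tweak the generated values, editing the file through a small script avoids typos in key names. This is a minimal sketch under stated assumptions: `override_config` is a hypothetical helper, the key names are taken from the example above, and the replacement port `9545` is purely illustrative.

```python
import json


def override_config(config_text, **overrides):
    """Return the config JSON text with the given top-level keys replaced."""
    config = json.loads(config_text)
    # Reject keys that are not already present, to catch misspellings early.
    unknown = set(overrides) - set(config)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    config.update(overrides)
    return json.dumps(config, indent=2)


# Config text in the shape of the generated example (placeholder values kept).
generated = """
{
  "rpc_direct": "http://127.0.0.1:8545",
  "engine_url": "http://127.0.0.1:8551",
  "jwt_hex_path": "engine-jwt/jwt.hex",
  "finalized_block": "0x..."
}
"""

print(override_config(generated, rpc_direct="http://127.0.0.1:9545"))
```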

**Key notes:**

- The generator automatically handles node setup, health checks, JWT authentication, and cleanup.
- It uses OverlayFS for ephemeral node state, so the base snapshot is never mutated and can be downloaded once and reused.
- It requires a custom Nethermind image: `nethermindeth/nethermind:gp-hacked`.
- It produces deterministic engine payloads without triggering reorgs.

**Additional options:**

- `--chain`: Target network (mainnet, sepolia, holesky, goerli or ethereum (perfnet))
- `--no-snapshot`: Skip snapshot download
- `--keep`: Preserve containers and logs after completion
- `--refresh-snapshot`: Force re-download of snapshot

Important: This feature is still in development (see the [PR description](https://github.com/NethermindEth/gas-benchmarks/pull/57)).

#### Opcode Tracing

The generator supports opcode tracing to capture which opcodes are executed during each test. This is useful for analyzing gas consumption patterns and test coverage.

**Generate opcode trace JSON:**

```sh
python eest_stateful_generator.py \
--data-dir scripts/nethermind/execution-data \
--genesis-path scripts/genesisfiles/nethermind/zkevmgenesis.json \
--nethermind-image nethermindeth/nethermind:gp-hacked \
--fork Prague \
--rpc-seed-key SOMEKEY \
--rpc-address SOMEADDRESS \
--payload-dir repricings_compute \
--eest-repo https://github.com/spencer-tb/execution-specs \
--eest-branch feat/fixed-opcode-count-updates \
--gas-bump-count 0 \
--trace-json \
--trace-json-output results.json
```

**Options:**

- `--trace-json`: Enable opcode tracing and generate JSON output mapping tests to opcode counts.
- `--trace-json-output`: Output path for the opcode trace results JSON.

**Output format:**

The generated JSON file contains a mapping of test names to their opcode execution counts:

```json
{
  "test_arithmetic.py__test_arithmetic[fork_Prague-benchmark_test-opcode_ADD-opcount_1.0K]": {
    "ADD": 1000,
    "PUSH1": 28,
    "JUMPDEST": 8,
    "STOP": 1,
    ...
  },
  ...
}
```
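
Because the output is a plain mapping of test names to opcode counts, it can be inspected without a database. The sketch below sums counts across tests; `total_opcode_counts` is a hypothetical helper, and the two test names in the inline sample are shortened stand-ins rather than real entries from `results.json`.

```python
import json
from collections import Counter


def total_opcode_counts(trace):
    """Sum opcode counts across all tests in a trace mapping."""
    totals = Counter()
    for opcodes in trace.values():
        # Counter.update adds the counts from each per-test mapping.
        totals.update(opcodes)
    return totals


# Tiny inline sample in the documented shape; real data would be loaded
# from the file passed to --trace-json-output.
sample = json.loads("""
{
  "test_a[opcode_ADD]": {"ADD": 1000, "PUSH1": 28, "STOP": 1},
  "test_b[opcode_MUL]": {"MUL": 500, "PUSH1": 12, "STOP": 1}
}
""")

totals = total_opcode_counts(sample)
print(totals.most_common(2))  # [('ADD', 1000), ('MUL', 500)]
```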

### Populating Test Metadata Database

After generating opcode trace results, you can populate a PostgreSQL database with the test metadata for Grafana visualization.

**Usage:**

```sh
python fill_tests_metadata_db.py \
--json-file results.json \
--db-host <host> \
--db-port <port> \
--db-user <user> \
--db-password <password> \
--db-name <database> \
--table-name <table>
```

**Parameters:**

- `--json-file`: Path to the JSON file containing opcode metrics (e.g., `results.json`).
- `--db-host`: PostgreSQL database host.
- `--db-port`: PostgreSQL database port (default: 5432).
- `--db-user`: PostgreSQL database user.
- `--db-password`: PostgreSQL database password.
- `--db-name`: PostgreSQL database name.
- `--table-name`: Name of the table to store metrics (default: `test_metadata`).
- `--clear-existing`: Optional flag to truncate existing data before importing.
- `--log-level`: Set logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL).

**Example:**

```sh
python fill_tests_metadata_db.py \
--json-file results_fixed.json \
--db-host localhost \
--db-port 5432 \
--db-user myuser \
--db-password "securepassword123" \
--db-name benchmarks \
--table-name gas_limit_benchmarks_test_metadata
```

**Database Schema:**

The script creates a table with the following structure:

```sql
CREATE TABLE test_metadata (
    id SERIAL PRIMARY KEY,
    test_name TEXT UNIQUE NOT NULL,
    opcodes JSONB NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
)
```

**Update Behavior:**

When running the script multiple times:

- Tests that exist in the database and the new JSON file will have their `opcodes` updated and `updated_at` timestamp refreshed.
- Tests that exist in the database but not in the new JSON file remain unchanged.
- New tests are inserted with both `created_at` and `updated_at` set to the current time.
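
The three rules above can be modelled in a few lines. This is an in-memory sketch only, with a plain dict standing in for the table; it is not how `fill_tests_metadata_db.py` is implemented, and `upsert_tests` is a hypothetical name.

```python
from datetime import datetime, timezone


def upsert_tests(table, new_results, now=None):
    """Apply the documented update rules to an in-memory stand-in for the table."""
    now = now or datetime.now(timezone.utc)
    for test_name, opcodes in new_results.items():
        row = table.get(test_name)
        if row is None:
            # New test: both timestamps start at the current time.
            table[test_name] = {"opcodes": opcodes, "created_at": now, "updated_at": now}
        else:
            # Existing test: replace opcodes and refresh only updated_at.
            row["opcodes"] = opcodes
            row["updated_at"] = now
    # Rows that are absent from new_results are deliberately left untouched.
    return table


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t1 = datetime(2024, 1, 2, tzinfo=timezone.utc)
table = upsert_tests({}, {"test_a": {"ADD": 1}, "test_b": {"MUL": 2}}, now=t0)
upsert_tests(table, {"test_a": {"ADD": 5}, "test_c": {"SLOAD": 3}}, now=t1)
print(sorted(table))
```

After the second call, `test_a` keeps its original `created_at` but has new opcodes, `test_b` is unchanged, and `test_c` is new.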

**Grafana Queries:**

Example queries for visualizing opcode data in Grafana:

```sql
-- Get all opcodes for a specific test
SELECT test_name, key as opcode, value::int as count
FROM gas_limit_benchmarks_test_metadata, jsonb_each_text(opcodes)
WHERE test_name LIKE '%SELFBALANCE%';

-- Top opcodes by total usage across all tests
SELECT key as opcode, SUM(value::int) as total_count
FROM gas_limit_benchmarks_test_metadata, jsonb_each_text(opcodes)
GROUP BY key
ORDER BY total_count DESC
LIMIT 20;

-- Compare specific opcodes across tests
SELECT test_name,
       opcodes->>'ADD' as "ADD",
       opcodes->>'MUL' as "MUL",
       opcodes->>'SLOAD' as "SLOAD"
FROM gas_limit_benchmarks_test_metadata
WHERE opcodes ? 'ADD'
ORDER BY (opcodes->>'ADD')::int DESC;
```

Contributing: see [CONTRIBUTING.md](CONTRIBUTING.md)

## Reusable GitHub Action: `gas-benchmark-action`

You can trigger gas-benchmarks from another repository using the composite action defined in this repo at `.github/actions/gas-benchmark-action/action.yml`.

### Usage from an external repository

Create a workflow in your repository, for example `.github/workflows/gas-benchmarks.yml`:

```yaml
name: Run gas benchmarks

on:
  workflow_dispatch:
    inputs:
      runs:
        description: Number of benchmark runs
        required: false
        default: 1

jobs:
  gas-benchmarks:
    runs-on: ubuntu-latest # default runner; override to change VM

    steps:
      # 1. Checkout the repository with LFS enabled
      - name: Checkout gas-benchmarks repository
        uses: actions/checkout@v4
        with:
          repository: NethermindEth/gas-benchmarks
          ref: main # or specific branch/tag
          lfs: true # Critical for downloading genesis files correctly

      # 2. Run the local action from the checkout path
      - name: Run gas-benchmarks composite action
        uses: ./.github/actions/gas-benchmark-action
        with:
          # Benchmark configuration (all optional, with sensible defaults)
          testPath: eest_tests
          genesisFile: zkevmgenesis.json
          warmupFile: warmup/warmup-1000bl-16wi-24tx.txt
          clients: nethermind,geth,reth,besu,erigon,nimbus,ethrex
          runs: ${{ github.event.inputs.runs }}
          opcodeWarmupCount: '1'
          filter: ''
          images: '{"nethermind":"default","geth":"default","reth":"default","erigon":"default","besu":"default","nimbus":"default","ethrex":"default"}'

          # PostgreSQL target (no DB is provisioned by this workflow)
          # Leave empty to disable database posting
          postgresHost: perfnet.core.nethermind.dev
          postgresPort: '5432'
          postgresDbName: monitoring
          postgresTable: gas_benchmarks_ci

          # DB credentials passed from your repository secrets
          postgresUser: ${{ secrets.PERFNET_DB_USER }}
          postgresPassword: ${{ secrets.PERFNET_DB_PASSWORD }}
```

### Inputs

- **testPath** (string, default `eest_tests`): Path to benchmark tests relative to the `gas-benchmarks` repo.
- **genesisFile** (string, default `zkevmgenesis.json`): Genesis file name resolved under `scripts/genesisfiles/<client>/`.
- **warmupFile** (string, default `warmup/warmup-1000bl-16wi-24tx.txt`): Warmup payload file; set to empty string to disable warmup.
- **clients** (string, default `nethermind,geth,reth,besu,erigon,nimbus,ethrex`): Comma-separated client list.
- **runs** (string, default `'1'`): Number of benchmark iterations.
- **opcodeWarmupCount** (string, default `'1'`): Per-scenario opcode warmup loops.
- **filter** (string, default empty): Comma-separated case-insensitive substrings; only matching scenarios are executed.
- **images** (string, default JSON map of `default` tags): JSON map of client → image tag, same format as `multi-parallel.yml`.
- **txtReport** (string, default `'false'`): When set to `'true'`, also generates a TXT report via `report_txt.py`.
- **postgresHost** (string, optional, default empty): When non-empty, enables posting metrics to PostgreSQL.
- **postgresPort** (string, default `5432`): PostgreSQL port.
- **postgresDbName** (string, optional, default empty): PostgreSQL database name.
- **postgresTable** (string, optional, default empty): Target table name in the database.
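
To make the `filter` input concrete, here is one way its description could be read. This is a sketch, not the action's actual code: `matches_filter` is a hypothetical helper, and it assumes that a scenario runs when it matches any one of the substrings and that an empty filter selects everything.

```python
def matches_filter(scenario_name, filter_value):
    """Return True when the scenario should run under the given filter string."""
    needles = [part.strip().lower() for part in filter_value.split(",") if part.strip()]
    if not needles:
        # Assumption: an empty filter means no filtering at all.
        return True
    name = scenario_name.lower()
    # Assumption: substrings are combined with OR, not AND.
    return any(needle in name for needle in needles)


# Hypothetical scenario names used only for illustration.
scenarios = ["test_arithmetic_ADD", "test_storage_SLOAD", "test_bytecode_worst"]
print([s for s in scenarios if matches_filter(s, "add, sload")])
```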

### Secrets

- **postgresUser** (optional, default empty): PostgreSQL username, passed as a GitHub Actions secret.
- **postgresPassword** (optional, default empty): PostgreSQL password, passed as a GitHub Actions secret.

### Behavior and prerequisites

- **Single run per trigger**: The reusable workflow runs a single benchmark job per invocation (no parallel matrix), but can execute multiple clients in one `run.sh` call.
- **Preconditions**: The workflow clones `NethermindEth/gas-benchmarks`, installs Python dependencies from `requirements.txt`, runs `make prepare_tools`, and then calls `run.sh` with the provided inputs.
- **PostgreSQL (optional)**: If `postgresHost` is left empty, no database writes are attempted and only artifacts are produced. If any PostgreSQL-related input is provided, the action validates that `postgresHost`, `postgresDbName`, `postgresTable`, `postgresUser`, and `postgresPassword` are all non-empty before calling `fill_postgres_db.py`; otherwise, it fails fast with a clear error.
- **Artifacts**: The `reports/` and `results/` directories and the `reports.zip` archive are uploaded as a single `gas-benchmarks-outputs` artifact for further inspection in the caller repository.
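
The all-or-nothing PostgreSQL check described above can be sketched as follows. This is an illustration, not the action's implementation: `should_post_to_postgres` is a hypothetical helper, and it takes the stricter reading in which supplying any of the five inputs without the rest is an error. `postgresPort` is left out of the check because it has a default.

```python
POSTGRES_REQUIRED = (
    "postgresHost",
    "postgresDbName",
    "postgresTable",
    "postgresUser",
    "postgresPassword",
)


def should_post_to_postgres(inputs):
    """Return True to post, False to skip; raise if the settings are incomplete."""
    provided = {key: inputs.get(key, "") for key in POSTGRES_REQUIRED}
    if not any(provided.values()):
        # Nothing configured: produce artifacts only.
        return False
    missing = [key for key, value in provided.items() if not value]
    if missing:
        # Fail fast with a message that names every missing input.
        raise ValueError("missing PostgreSQL inputs: " + ", ".join(missing))
    return True


print(should_post_to_postgres({}))  # False
```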