https://github.com/nethermindeth/gas-benchmarks
Gas benchmark research repository
- Host: GitHub
- URL: https://github.com/nethermindeth/gas-benchmarks
- Owner: NethermindEth
- License: mit
- Created: 2024-02-22T12:34:37.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2026-04-06T22:28:15.000Z (2 months ago)
- Last Synced: 2026-04-07T00:27:08.789Z (2 months ago)
- Language: Python
- Size: 1.79 GB
- Stars: 21
- Watchers: 7
- Forks: 18
- Open Issues: 12
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.md
README
# Gas Benchmarks
This repository contains scripts to run benchmarks across multiple clients.
Follow the instructions below to run the benchmarks locally.
## Prerequisites
Make sure you have the following installed on your system:
- Python 3.10
- Docker
- Docker Compose
- .NET 8.0.x
- `make` (for running make commands)
- `git lfs`, `zip` (additional tools)
## Setup
1. **Clone the repository:**
```sh
git clone https://github.com/nethermindeth/gas-benchmarks.git
cd gas-benchmarks
```
2. **Install Python dependencies:**
```sh
pip install -r requirements.txt
```
2.1. **Install additional tools:**
```sh
git lfs install
git lfs pull
sudo apt install zip
```
3. **Prepare Kute dependencies (specific to Nethermind):**
```sh
make prepare_tools
```
4. **Create a results directory:**
```sh
mkdir -p results
```
## Running the Benchmarks
### Script: Run all
To run the whole pipeline, use the `run.sh` script.
```sh
bash run.sh -t "testsPath" -w "warmupFilePath" -c "client1,client2" -r runNumber -i "image1,image2"
```
Example run:
```shell
bash run.sh -t "eest_tests/" -w "warmup-tests" -c "nethermind,geth,reth" -r 8
```
Flags:
- `-t`: path to the directory that contains the tests.
- `-w`: path to the warmup file.
- `-c`: clients to benchmark, separated by commas.
- `-r`: number of benchmark iterations (a numeric value).
- `-i`: images to use, separated by commas and listed in the same order as the clients. Use `default` for a client whose image you do not want to override.
Now you're ready to run the benchmarks locally!
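If you drive `run.sh` from Python, the flags above can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_run_command` is a hypothetical helper, and the one-image-per-client check is an assumption based on the description of `-i`.

```python
import shlex


def build_run_command(tests_path, warmup_path, clients, runs, images=None):
    """Return the argv list for a `run.sh` invocation (hypothetical helper)."""
    if runs < 1:
        raise ValueError("runs must be a positive integer")
    argv = ["bash", "run.sh", "-t", tests_path, "-w", warmup_path,
            "-c", ",".join(clients), "-r", str(runs)]
    if images is not None:
        # Assumption: one image per client, in the same order as the clients.
        if len(images) != len(clients):
            raise ValueError("images must match clients one-to-one")
        argv += ["-i", ",".join(images)]
    return argv


command = build_run_command("eest_tests/", "warmup-tests",
                            ["nethermind", "geth", "reth"], 8)
# shlex.join renders the argv list as a shell-safe command line.
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root.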
## Populating the PostgreSQL Database with Benchmark Data
After running benchmarks and generating report files, you can populate a PostgreSQL database with the results for further analysis. This process involves two main scripts: `generate_postgres_schema.py` to set up the database table, and `fill_postgres_db.py` to load the data.
### 1. Setting up the Database Schema
The `generate_postgres_schema.py` script creates the necessary table in your PostgreSQL database to store the benchmark data.
**Usage:**
```sh
python generate_postgres_schema.py \
--db-host \
--db-port \
--db-user \
--db-name \
--table-name \
--log-level
```
- You will be prompted to enter the password for the specified database user.
- `--table-name`: Defaults to `benchmark_data`.
- `--log-level`: Defaults to `INFO`.
**Example:**
```sh
python generate_postgres_schema.py \
--db-host localhost \
--db-port 5432 \
--db-user myuser \
--db-name benchmarks \
--table-name gas_benchmark_results
```
This will create a table named `gas_benchmark_results` (if it doesn't already exist) in the `benchmarks` database.
### 2. Populating the Database with Benchmark Data
Once the schema is set up, use `fill_postgres_db.py` to parse the benchmark report files (generated by `run.sh` or other means) and insert the data into the PostgreSQL table.
**Usage:**
```sh
python fill_postgres_db.py \
--reports-dir \
--db-host \
--db-port \
--db-user \
--db-password \
--db-name \
--table-name \
--log-level
```
- `--reports-dir`: Path to the directory containing the benchmark output files (e.g., `output_*.csv`, `raw_results_*.csv`, and `index.html` or `computer_specs.txt`).
- `--db-password`: The password for the database user.
- `--table-name`: Should match the table name used with `generate_postgres_schema.py`. Defaults to `benchmark_data`.
- `--log-level`: Defaults to `INFO`.
**Example:**
```sh
python fill_postgres_db.py \
--reports-dir ./results/my_benchmark_run_01 \
--db-host localhost \
--db-port 5432 \
--db-user myuser \
--db-password "securepassword123" \
--db-name benchmarks \
--table-name gas_benchmark_results
```
This script will scan the specified reports directory, parse the client benchmark data and computer specifications, and insert individual run records into the `gas_benchmark_results` table.
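To check what a reports directory contains before loading it, the file patterns listed for `--reports-dir` can be matched with `pathlib`. This is a rough sketch under stated assumptions: `find_report_files` is a hypothetical helper, the file names in the demo are invented, and the real parser in `fill_postgres_db.py` may select files differently.

```python
import tempfile
from pathlib import Path


def find_report_files(reports_dir):
    """Group files in a reports directory by the patterns named above."""
    root = Path(reports_dir)
    return {
        "output": sorted(p.name for p in root.glob("output_*.csv")),
        "raw_results": sorted(p.name for p in root.glob("raw_results_*.csv")),
        # Computer specs are read from index.html or computer_specs.txt.
        "specs": [name for name in ("index.html", "computer_specs.txt")
                  if (root / name).is_file()],
    }


# Demo with made-up file names in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("output_geth.csv", "raw_results_geth.csv",
                 "computer_specs.txt", "notes.md"):
        (Path(tmp) / name).touch()
    found = find_report_files(tmp)
print(found)
```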
## Continuous Metrics Posting
The `run_and_post_metrics.sh` script runs the benchmarks and posts metrics continuously in an infinite loop.
This script updates the local repository with `git pull`, runs the benchmark tests, populates the PostgreSQL database, and cleans up the reports directory.
**Usage:**
```sh
./run_and_post_metrics.sh --table-name gas_limit_benchmarks --db-user nethermind --db-host perfnet.core.nethermind.dev --db-password "MyPass" [--warmup warmup/warmup-1000bl-16wi-24tx.txt]
```
**Parameters:**
- `--table-name`: The database table name where benchmark data will be inserted.
- `--db-user`: The database user.
- `--db-host`: The database host.
- `--db-password`: The database password.
- `--warmup`: (Optional) The warmup file to use. Defaults to `warmup/warmup-1000bl-16wi-24tx.txt`.
**Examples:**
Using the default warmup file:
```sh
./run_and_post_metrics.sh --table-name gas_limit_benchmarks --db-user nethermind --db-host perfnet.core.nethermind.dev --db-password "MyPass"
```
Using a custom warmup file and running in the background:
```sh
nohup ./run_and_post_metrics.sh --table-name gas_limit_benchmarks --db-user nethermind --db-host perfnet.core.nethermind.dev --db-password "MyPass" --warmup "warmup/custom_warmup.txt" &
```
Prevent creation of `nohup.out` (to save disk space):
```sh
nohup ./run_and_post_metrics.sh --table-name gas_limit_benchmarks --db-user nethermind --db-host perfnet.core.nethermind.dev --db-password "MyPass" --warmup "warmup/custom_warmup.txt" > /dev/null 2>&1 &
```
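The order of operations in the loop can be pictured with a small Python sketch. It is an illustration only, not the script's implementation: the step names mirror the description above, the steps merely record their names, and `max_cycles` is a hypothetical parameter added so the loop can be stopped in a demo.

```python
def run_cycles(steps, max_cycles=None):
    """Call every step in order, then repeat; max_cycles=None loops forever."""
    completed = 0
    while max_cycles is None or completed < max_cycles:
        for step in steps:
            step()
        completed += 1
    return completed


# Stand-in steps that only record their name; real steps would shell out.
log = []
steps = [lambda name=name: log.append(name)
         for name in ("git pull", "run benchmarks",
                      "fill database", "clean reports")]
cycles = run_cycles(steps, max_cycles=2)
print(cycles, log)
```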
## Integration with EELS
### Run the tests generated by the EELS framework
By default, gas-benchmarks replays Execution Layer Spec (EELS) payloads with the Kute runner.
You can use the [EELS repository](https://github.com/ethereum/execution-specs/tree/main/tests/benchmark) to produce new benchmark definitions and run them with the following steps:
1. Place a genesis file inside `scripts/genesisfiles/YOUR_CLIENT`, for example:
- `scripts/genesisfiles/geth/chainspec.json`
- `scripts/genesisfiles/geth/zkevmgenesis.json`
2. Capture EELS tests (writes payload files into `eest_tests/` by default):
```sh
python3 capture_eest_tests.py -o eest_tests -x 1M,worst_bytecode
```
- The capture script downloads the latest `benchmark@v*` release from the `execution-specs` repository. Ensure your new tests are included in a published benchmark release (or pass `--release-tag benchmark@vX.Y.Z`) before attempting to capture them.
3. Generate warmup payloads that match the captured tests:
```sh
python3 make_warmup_tests.py -s eest_tests/ -d warmup-tests -g scripts/genesisfiles/geth/zkevmgenesis.json
```
4. Execute the tests: run the pipeline as described in "Running the Benchmarks", using `eest_tests/` as the test path and the warmup file generated in step 3:
```sh
bash run.sh -t "eest_tests/" -w "warmup-tests" -c "client1,client2" -r runNumber -i "image1,image2"
```
All documentation now references `eest_tests/` as the canonical dataset; legacy `tests/` or `tests-vm/` folders are no longer shipped.
To author new EELS benchmarks, follow the guidance in the `execution-specs` repository under `tests/benchmark/`.
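The capture step picks the latest `benchmark@v*` release. To reason about which tag that would be from a list of tag names, a sketch like the following may help. It assumes tags have the form `benchmark@vX.Y.Z` with numeric components and that "latest" means the highest version; the capture script may instead rely on publication order. `latest_benchmark_tag` and the sample tags are hypothetical.

```python
import re

# Assumed tag shape: benchmark@vMAJOR.MINOR.PATCH
TAG_PATTERN = re.compile(r"^benchmark@v(\d+)\.(\d+)\.(\d+)$")


def latest_benchmark_tag(tags):
    """Return the tag with the highest numeric version, or None if none match."""
    versions = []
    for tag in tags:
        match = TAG_PATTERN.match(tag)
        if match:
            # Compare versions numerically so that 0.0.10 sorts after 0.0.9.
            versions.append((tuple(int(part) for part in match.groups()), tag))
    if not versions:
        return None
    return max(versions)[1]


tags = ["benchmark@v0.0.9", "v5.0.0", "benchmark@v0.0.10", "benchmark@v0.0.2"]
latest = latest_benchmark_tag(tags)
print(latest)
```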
### Execute the tests directly from the EELS repository
1. Clone the EELS repository:
```sh
git clone https://github.com/ethereum/execution-specs.git
cd execution-specs
```
2. Create a virtual environment and install the dependencies:
```sh
uv sync --all-extras
```
3. Use [execute remote](https://github.com/ethereum/execution-specs/blob/main/docs/running_tests/execute/remote.md) to run the tests:
```sh
uv run execute remote -v --fork=Prague --rpc-seed-key=ACCOUNT --rpc-chain-id=1 --rpc-endpoint=http://127.0.0.1:8545 tests -- -m benchmark -n 1
```
Note: you will need an account with some ETH (native tokens) to run the tests. Specify the private key using `--rpc-seed-key`. Also, `--rpc-endpoint` should point to the node you want to test against (it may be a remote node).
### EELS stateful tests generator
The EELS stateful generator creates deterministic, reproducible execution payloads by running Execution Layer Spec tests against a local Nethermind node through a proxy. It consists of two cooperating tools:
- **`eest_stateful_generator.py`** - Main orchestrator that boots a Nethermind node from snapshot, runs EELS through a proxy, and captures payloads
- **`mitm_addon.py`** - mitmproxy addon that intercepts transactions and produces engine payloads grouped by test metadata
Basic usage:
```sh
python eest_stateful_generator.py \
--chain mainnet \
--fork Prague \
--rpc-seed-key YOUR_PRIVATE_KEY \
--rpc-address YOUR_ADDRESS \
--test-path tests
```
The tool writes `mitm_config.json` automatically (edit it if you need to tweak defaults). A generated example looks like:
```json
{
  "rpc_direct": "http://127.0.0.1:8545",
  "engine_url": "http://127.0.0.1:8551",
  "jwt_hex_path": "engine-jwt/jwt.hex",
  "finalized_block": "0x..."
}
```
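If you edit `mitm_config.json` by hand, a quick check that the four keys from the example are still present can catch typos early. This is a minimal sketch, assuming those four keys are the ones that matter; `load_mitm_config` is a hypothetical helper, not something the generator provides.

```python
import json

# Keys taken from the generated example above.
REQUIRED_KEYS = ("rpc_direct", "engine_url", "jwt_hex_path", "finalized_block")


def load_mitm_config(text):
    """Parse mitm_config.json content and check the expected keys exist."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"mitm_config.json is missing: {', '.join(missing)}")
    return config


example = """{
  "rpc_direct": "http://127.0.0.1:8545",
  "engine_url": "http://127.0.0.1:8551",
  "jwt_hex_path": "engine-jwt/jwt.hex",
  "finalized_block": "0x..."
}"""
config = load_mitm_config(example)
print(config["engine_url"])
```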
**Key notes:**
- The generator automatically handles node setup, health checks, JWT authentication, and cleanup.
- It uses OverlayFS for ephemeral node state, so the base snapshot is never mutated and can be downloaded once and reused.
- Requires custom Nethermind image: `nethermindeth/nethermind:gp-hacked`
- Produces deterministic engine payloads without triggering reorgs
**Additional options:**
- `--chain`: Target network (mainnet, sepolia, holesky, goerli or ethereum (perfnet))
- `--no-snapshot`: Skip snapshot download
- `--keep`: Preserve containers and logs after completion
- `--refresh-snapshot`: Force re-download of snapshot
Important: This feature is still in development (see the [PR description](https://github.com/NethermindEth/gas-benchmarks/pull/57)).
#### Opcode Tracing
The generator supports opcode tracing to capture which opcodes are executed during each test. This is useful for analyzing gas consumption patterns and test coverage.
**Generate opcode trace JSON:**
```sh
python eest_stateful_generator.py \
--data-dir scripts/nethermind/execution-data \
--genesis-path scripts/genesisfiles/nethermind/zkevmgenesis.json \
--nethermind-image nethermindeth/nethermind:gp-hacked \
--fork Prague \
--rpc-seed-key SOMEKEY \
--rpc-address SOMEADDRESS \
--payload-dir repricings_compute \
--eest-repo https://github.com/spencer-tb/execution-specs \
--eest-branch feat/fixed-opcode-count-updates \
--gas-bump-count 0 \
--trace-json \
--trace-json-output results.json
```
**Options:**
- `--trace-json`: Enable opcode tracing and generate JSON output mapping tests to opcode counts.
- `--trace-json-output`: Output path for the opcode trace results JSON.
**Output format:**
The generated JSON file contains a mapping of test names to their opcode execution counts:
```json
{
  "test_arithmetic.py__test_arithmetic[fork_Prague-benchmark_test-opcode_ADD-opcount_1.0K]": {
    "ADD": 1000,
    "PUSH1": 28,
    "JUMPDEST": 8,
    "STOP": 1,
    ...
  },
  ...
}
```
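Because the output is a plain mapping of test names to opcode counts, it is easy to post-process. The sketch below sums counts across tests; the test names and numbers are made up for illustration, and in practice you would load the mapping with `json.load` from the file passed to `--trace-json-output`.

```python
from collections import Counter


def total_opcode_counts(trace):
    """Sum opcode counts over every test in a trace mapping."""
    totals = Counter()
    for opcode_counts in trace.values():
        # Counter.update adds the counts instead of replacing them.
        totals.update(opcode_counts)
    return totals


# Illustrative data in the same shape as the generated JSON.
trace = {
    "test_a": {"ADD": 1000, "PUSH1": 28, "STOP": 1},
    "test_b": {"MUL": 500, "PUSH1": 12, "STOP": 1},
}
totals = total_opcode_counts(trace)
print(totals.most_common(2))
```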
### Populating Test Metadata Database
After generating opcode trace results, you can populate a PostgreSQL database with the test metadata for Grafana visualization.
**Usage:**
```sh
python fill_tests_metadata_db.py \
--json-file results.json \
--db-host \
--db-port \
--db-user \
--db-password \
--db-name \
--table-name
```
**Parameters:**
- `--json-file`: Path to the JSON file containing opcode metrics (e.g., `results.json`).
- `--db-host`: PostgreSQL database host.
- `--db-port`: PostgreSQL database port (default: 5432).
- `--db-user`: PostgreSQL database user.
- `--db-password`: PostgreSQL database password.
- `--db-name`: PostgreSQL database name.
- `--table-name`: Name of the table to store metrics (default: `test_metadata`).
- `--clear-existing`: Optional flag to truncate existing data before importing.
- `--log-level`: Set logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL).
**Example:**
```sh
python fill_tests_metadata_db.py \
--json-file results_fixed.json \
--db-host localhost \
--db-port 5432 \
--db-user myuser \
--db-password "securepassword123" \
--db-name benchmarks \
--table-name gas_limit_benchmarks_test_metadata
```
**Database Schema:**
The script creates a table with the following structure:
```sql
CREATE TABLE test_metadata (
    id SERIAL PRIMARY KEY,
    test_name TEXT UNIQUE NOT NULL,
    opcodes JSONB NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);
```
**Update Behavior:**
When running the script multiple times:
- Tests that exist in the database and the new JSON file will have their `opcodes` updated and `updated_at` timestamp refreshed.
- Tests that exist in the database but not in the new JSON file remain unchanged.
- New tests are inserted with both `created_at` and `updated_at` set to the current time.
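The update behavior described above matches what an upsert keyed on `test_name` would produce. The sketch below reproduces it with SQLite's `ON CONFLICT ... DO UPDATE` (available since SQLite 3.24) as a stand-in for PostgreSQL, so it runs without a database server. It is an assumption about how such behavior can be obtained, not a copy of `fill_tests_metadata_db.py`; timestamps are passed in explicitly and opcodes are stored as JSON text to keep the demo deterministic.

```python
import json
import sqlite3

UPSERT = """
INSERT INTO test_metadata (test_name, opcodes, created_at, updated_at)
VALUES (?, ?, ?, ?)
ON CONFLICT(test_name) DO UPDATE SET
    opcodes = excluded.opcodes,
    updated_at = excluded.updated_at
"""


def import_trace(conn, trace, now):
    """Upsert every test in `trace`; rows absent from `trace` are untouched."""
    rows = [(name, json.dumps(counts), now, now)
            for name, counts in trace.items()]
    conn.executemany(UPSERT, rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_metadata ("
    "id INTEGER PRIMARY KEY, test_name TEXT UNIQUE NOT NULL, "
    "opcodes TEXT NOT NULL, created_at TEXT, updated_at TEXT)"
)
import_trace(conn, {"test_a": {"ADD": 1}, "test_b": {"MUL": 2}}, now="2024-01-01")
import_trace(conn, {"test_a": {"ADD": 5}, "test_c": {"SUB": 3}}, now="2024-02-01")
rows = conn.execute(
    "SELECT test_name, opcodes, created_at, updated_at "
    "FROM test_metadata ORDER BY test_name"
).fetchall()
for row in rows:
    print(row)
```

After the second import, `test_a` keeps its original `created_at` but has new `opcodes` and `updated_at`, `test_b` is unchanged, and `test_c` is newly inserted.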
**Grafana Queries:**
Example queries for visualizing opcode data in Grafana:
```sql
-- Get all opcodes for a specific test
SELECT test_name, key as opcode, value::int as count
FROM gas_limit_benchmarks_test_metadata, jsonb_each_text(opcodes)
WHERE test_name LIKE '%SELFBALANCE%';

-- Top opcodes by total usage across all tests
SELECT key as opcode, SUM(value::int) as total_count
FROM gas_limit_benchmarks_test_metadata, jsonb_each_text(opcodes)
GROUP BY key
ORDER BY total_count DESC
LIMIT 20;

-- Compare specific opcodes across tests
SELECT test_name,
       opcodes->>'ADD' as "ADD",
       opcodes->>'MUL' as "MUL",
       opcodes->>'SLOAD' as "SLOAD"
FROM gas_limit_benchmarks_test_metadata
WHERE opcodes ? 'ADD'
ORDER BY (opcodes->>'ADD')::int DESC;
```
Contributing: see [CONTRIBUTING.md](CONTRIBUTING.md)
## Reusable GitHub Action: `gas-benchmark-action`
You can trigger gas-benchmarks from another repository using the composite action defined in this repo at `.github/actions/gas-benchmark-action/action.yml`.
### Usage from an external repository
Create a workflow in your repository, for example `.github/workflows/gas-benchmarks.yml`:
```yaml
name: Run gas benchmarks
on:
  workflow_dispatch:
    inputs:
      runs:
        description: Number of benchmark runs
        required: false
        default: 1
jobs:
  gas-benchmarks:
    runs-on: ubuntu-latest # default runner; override to change VM
    steps:
      # 1. Checkout the repository with LFS enabled
      - name: Checkout gas-benchmarks repository
        uses: actions/checkout@v4
        with:
          repository: NethermindEth/gas-benchmarks
          ref: main # or specific branch/tag
          lfs: true # Critical for downloading genesis files correctly
      # 2. Run the local action from the checkout path
      - name: Run gas-benchmarks composite action
        uses: ./.github/actions/gas-benchmark-action
        with:
          # Benchmark configuration (all optional, with sensible defaults)
          testPath: eest_tests
          genesisFile: zkevmgenesis.json
          warmupFile: warmup/warmup-1000bl-16wi-24tx.txt
          clients: nethermind,geth,reth,besu,erigon,nimbus,ethrex
          runs: ${{ github.event.inputs.runs }}
          opcodeWarmupCount: '1'
          filter: ''
          images: '{"nethermind":"default","geth":"default","reth":"default","erigon":"default","besu":"default","nimbus":"default","ethrex":"default"}'
          # PostgreSQL target (no DB is provisioned by this workflow)
          # Leave empty to disable database posting
          postgresHost: perfnet.core.nethermind.dev
          postgresPort: '5432'
          postgresDbName: monitoring
          postgresTable: gas_benchmarks_ci
          # DB credentials passed from your repository secrets
          postgresUser: ${{ secrets.PERFNET_DB_USER }}
          postgresPassword: ${{ secrets.PERFNET_DB_PASSWORD }}
```
### Inputs
- **testPath** (string, default `eest_tests`): Path to benchmark tests relative to the `gas-benchmarks` repo.
- **genesisFile** (string, default `zkevmgenesis.json`): Genesis file name resolved under `scripts/genesisfiles//`.
- **warmupFile** (string, default `warmup/warmup-1000bl-16wi-24tx.txt`): Warmup payload file; set to empty string to disable warmup.
- **clients** (string, default `nethermind,geth,reth,besu,erigon,nimbus,ethrex`): Comma-separated client list.
- **runs** (string, default `'1'`): Number of benchmark iterations.
- **opcodeWarmupCount** (string, default `'1'`): Per-scenario opcode warmup loops.
- **filter** (string, default empty): Comma-separated case-insensitive substrings; only matching scenarios are executed.
- **images** (string, default JSON map of `default` tags): JSON map of client → image tag, same format as `multi-parallel.yml`.
- **txtReport** (string, default `'false'`): When set to `'true'`, also generates a TXT report via `report_txt.py`.
- **postgresHost** (string, optional, default empty): When non-empty, enables posting metrics to PostgreSQL.
- **postgresPort** (string, default `5432`): PostgreSQL port.
- **postgresDbName** (string, optional, default empty): PostgreSQL database name.
- **postgresTable** (string, optional, default empty): Target table name in the database.
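To preview how the `filter` and `images` inputs could be interpreted, here is a small Python sketch. It is not the action's code: `select_scenarios` and `image_for` are hypothetical helpers, the scenario names are invented, and two behaviors are assumptions (an empty filter selects every scenario, and a client missing from the JSON map falls back to `default`).

```python
import json


def select_scenarios(names, filter_value):
    """Keep names containing any comma-separated substring, ignoring case."""
    needles = [part.strip().lower()
               for part in filter_value.split(",") if part.strip()]
    if not needles:
        return list(names)  # assumption: an empty filter selects everything
    return [name for name in names
            if any(needle in name.lower() for needle in needles)]


def image_for(client, images_json):
    """Look up a client's image tag; assumed fallback is "default"."""
    return json.loads(images_json).get(client, "default")


names = ["test_arithmetic[opcode_ADD]", "test_storage[opcode_SLOAD]", "test_keccak"]
selected = select_scenarios(names, "add, KECCAK")
tag = image_for("geth", '{"nethermind":"default","geth":"latest"}')
print(selected, tag)
```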
### Secrets
- **postgresUser** (optional, default empty): PostgreSQL username, passed as a GitHub Actions secret.
- **postgresPassword** (optional, default empty): PostgreSQL password, passed as a GitHub Actions secret.
### Behavior and prerequisites
- **Single run per trigger**: The reusable workflow runs a single benchmark job per invocation (no parallel matrix), but can execute multiple clients in one `run.sh` call.
- **Preconditions**: The workflow clones `NethermindEth/gas-benchmarks`, installs Python dependencies from `requirements.txt`, runs `make prepare_tools`, and then calls `run.sh` with the provided inputs.
- **PostgreSQL (optional)**: If `postgresHost` is left empty, no database writes are attempted and only artifacts are produced. If any PostgreSQL-related input is provided, the action validates that `postgresHost`, `postgresDbName`, `postgresTable`, `postgresUser`, and `postgresPassword` are all non-empty before calling `fill_postgres_db.py`; otherwise, it fails fast with a clear error.
- **Artifacts**: The `reports/` and `results/` directories and the `reports.zip` archive are uploaded as a single `gas-benchmarks-outputs` artifact for further inspection in the caller repository.
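The all-or-nothing PostgreSQL check can be sketched in Python as follows. This is one possible reading of the rule, not the action's implementation: posting is treated as disabled when none of the five inputs is set, and any partial configuration raises an error. `should_post_to_postgres` and the sample values are hypothetical.

```python
POSTGRES_FIELDS = ("postgresHost", "postgresDbName", "postgresTable",
                   "postgresUser", "postgresPassword")


def should_post_to_postgres(inputs):
    """Return True to post, False to skip; raise on a partial configuration."""
    values = {name: inputs.get(name, "").strip() for name in POSTGRES_FIELDS}
    if not any(values.values()):
        return False  # nothing configured: only artifacts are produced
    missing = [name for name, value in values.items() if not value]
    if missing:
        # Fail fast with a message that names every missing input.
        raise ValueError("missing PostgreSQL inputs: " + ", ".join(missing))
    return True


full = {name: "value" for name in POSTGRES_FIELDS}
print(should_post_to_postgres({}), should_post_to_postgres(full))
```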