{"id":22343159,"url":"https://github.com/johnramsden/zbdbench","last_synced_at":"2025-03-26T09:43:08.757Z","repository":{"id":265909996,"uuid":"895852283","full_name":"johnramsden/zbdbench","owner":"johnramsden","description":null,"archived":false,"fork":false,"pushed_at":"2024-11-29T03:37:06.000Z","size":144,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-24T12:51:22.310Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/johnramsden.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-29T03:35:00.000Z","updated_at":"2024-11-29T03:37:17.000Z","dependencies_parsed_at":"2024-12-02T02:38:13.425Z","dependency_job_id":null,"html_url":"https://github.com/johnramsden/zbdbench","commit_stats":null,"previous_names":["johnramsden/zbdbench"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnramsden%2Fzbdbench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnramsden%2Fzbdbench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnramsden%2Fzbdbench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnramsden%2Fzbdbench/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/johnramsden","download_url":"https://codeload.github.com/johnramsden/zbdbench/tar.gz/refs/heads/mai
n","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245631336,"owners_count":20647181,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-04T08:15:01.561Z","updated_at":"2025-03-26T09:43:08.735Z","avatar_url":"https://github.com/johnramsden.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ZBDBench: Benchmark Suite for Zoned Block Devices\n\nZBDBench is a collection of benchmarks for zoned storage devices (Zoned Namespace (ZNS) SSDs and Shingled-Magnetic Recording (SMR) HDDs) that tests both the raw performance of the device, and runs standard benchmarks for applications such as RocksDB (dbbench) and MySQL (sysbench).\n\nCommunity\n---------\nFor help or questions about zbdbench usage (e.g. \"how do I do X?\") see [ZonedStorage.io](https://zonedstorage.io), our [Matrix](https://app.element.io/#/room/#zonedstorage-general:matrix.org) chat, or on [Slack](https://join.slack.com/t/zonedstorage/shared_invite/zt-uyfut5xe-nKajp9YRnEWqiD4X6RkTFw).\n\n\nTo report a bug, file a documentation issue, or submit a feature request, please open a GitHub issue.\n\nFor release announcements and other discussions, please subscribe to this repository or join us on Matrix.\n\nDependencies\n------------\n\nThe benchmark tool requires Python 3.4+. 
In addition to a working python\nenvironment, the script requires the following installed:\n\n - Linux kernel 5.9 or newer\n   - Check your loaded kernel version using:\n     `uname -a`\n\n - nvme-cli\n   - Ubuntu: `sudo apt-get install nvme-cli`\n   - Fedora: `sudo dnf -y install nvme-cli`\n\n - blkzone and blkdiscard (available through util-linux)\n   - Ubuntu: `sudo apt-get install util-linux`\n   - Fedora: `sudo dnf -y install util-linux-ng`\n   - CentOS: `sudo yum -y install util-linux-ng`\n\n - a valid container (podman) environment\n   - If you do not have a container environment installed, please see [this\n     link](https://podman.io/getting-started/installation)\n\n - installed containers:\n   - zfio - contains latest fio compiled with zone capacity support\n   - zrocksdb - contains rocksdb with zenfs built-in\n   - zzenfs - contains the zenfs tool to inspect the zenfs file-system\n\n   The containers can be installed with:\n     `cd recipes/docker; sudo ./build.sh`\n\n   The container installation can be verified by listing the image:\n     `sudo podman images`\n\n  - matplotlib, pandas and openpyxl for graph plotting\n    ```\n    sudo pip install matplotlib\n    sudo pip install pandas\n    sudo pip install openpyxl\n    ```\n\nGetting Started\n---------------\n\nThe run.py script runs a predefined benchmark on a block device.\n\nThe block device does not have to be zoned - the workloads will work\non both types of block devices.\n\nThe script performs a set of checks before running the benchmark, such as\nvalidating that it is about to write to a block device, not mounted, and ready.\n\nAfter the benchmark has run, the output is available in:\n\n    zbdbench_results/YYYYMMDDHHMMSS (date format is replaced with the current time)\n\nEach benchmark has a report function, which creates a csv file with the\nspecific output. 
See the section below for the csv format for each benchmark.\n\nTo execute the 'fio_zone_mixed' benchmark, run:\n\n    sudo ./run.py -d /dev/nvmeXnY -b fio_zone_mixed\n\nIf you have the latest fio installed, you may skip the container installation and\nrun the benchmarks using the system commands.\n\n    sudo ./run.py -d /dev/nvmeXnY -b fio_zone_mixed -c no\n\nTo list available benchmarks, run:\n\n    ./run.py -l\n\n## WARNING\n\nYou need to have read/write permissions to the device or file you are\ntargeting. Usually block devices are owned by the `root` user or the `disk` group. You\ncan either change ownership of the block device you are testing:\n\n    sudo chown myusername /dev/nvmeXnY\n\nor make it world writable:\n\n    sudo chmod o+rw /dev/nvmeXnY\n\nOr elevate the privileges when running `zbdbench`:\n\n    sudo ./run.py \u003cargs\u003e\n\nPlease be sure that you are familiar with the security implications of the\noption you choose. If you start a test on a different block device than the one\nyou intended, you may lose data and your system may fail to boot.\n\nCommand Options\n---------------\n\nList available benchmarks:\n\n    ./run.py -l\n\nRun a specific benchmark:\n\n    ./run.py -b benchmark -d /dev/nvmeXnY\n\nRun a fio_zone_xxx benchmark with the SPDK FIO plugin (io_uring zoned bdev) in a container environment:\n\n    ./run.py -b fio_zone_xxx --mq-deadline-scheduler -d /dev/nvmeXnY -s yes -c yes\n\nRun a fio_zone_xxx benchmark with the SPDK FIO plugin (io_uring zoned bdev) directly on the host system.\nZbdbench will check out and build SPDK (and FIO) in the directory provided with the --spdk-path option:\n\n    ./run.py -b fio_zone_xxx --mq-deadline-scheduler -d /dev/nvmeXnY -s yes -c no --spdk-path /dir/path\n\nRegenerate a report (and its plots):\n\n    ./run.py -b fio_zone_mixed -r zbdbench_results/YYYYMMDDHHMMSS\n\nRegenerate plots from an existing csv report:\n\n    ./run.py -b fio_zone_throughput_avg_lat -p zbdbench_results/YYYYMMDDHHMMSS/fio_zone_throughput_avg_lat.csv\n\nOverride the benchmark 
run with the none device scheduler:\n\n    ./run.py -b benchmark -d /dev/nvmeXnY --none-scheduler\n\nOverride the benchmark run with the mq-deadline device scheduler:\n\n    ./run.py -b benchmark -d /dev/nvmeXnY --mq-deadline-scheduler\n\nBenchmarks\n----------\n\n- All fio benchmarks set the none scheduler by default if the iodepth is 1.\n- When doing random fio workloads, the `norandommap` fio option is set.\n\nSPDK FIO plugin support:\n  - The following benchmarks have SPDK FIO plugin support\n     - fio_zone_write\n     - fio_zone_mixed\n     - fio_zone_throughput_avg_lat\n  - Adding SPDK FIO plugin support for a new benchmark\n     - See benchs/template.py for guidance\n\n### fio_steady_state_performance\n  - Puts the (conventional) drive into its steady state by completely filling it\n    and then overwriting it. This puts conventional block devices into the state\n    where the on-device garbage collection is working to free up space.\n\n  - (Random) Read and (Random) Write performance of the drive is subsequently measured.\n\n### fio_zone_write\n  - executes a fio workload that writes sequentially to 14 zones in parallel\n    while writing 6 times the capacity of the device.\n\n  - generated csv output (fio_zone_write.csv)\n    1. written_gb: gigabytes written (GB)\n    2. write_avg_mbs: average throughput (MB/s)\n\n### fio_zone_mixed\n  - executes a fio workload that first preconditions the block device to steady\n    state. Rate-limited writes are then issued while 4KB random reads\n    are issued in parallel. The average latency for the 4KB random read is\n    reported.\n\n  - generated csv output (fio_zone_mixed.csv)\n    1. write_avg_mbs_target: target write throughput (MB/s)\n    2. read_lat_avg_us: avg 4KB random read latency (us)\n    3. write_avg_mbs: write throughput (MB/s)\n    4. read_lat_us_avg_measured: avg 4KB random read latency (us)\n    5. 
clat_*_us: Latency percentiles\n\n    ** Note that (2) is only reported if write_avg_mbs_target and write_avg_mbs\n       are equal. When they are not equal, the reported average latency is\n       misleading, as the requested write throughput could not be\n       achieved.\n\n### fio_zone_throughput_avg_lat\n  - Executes all combinations of the following workloads and reports the throughput\n    and latency in the csv report (Note: 14 is a possible value for max_open_zones):\n      - Sequential read, random read, sequential write\n      - BS: 4K, 8K, 16K, 32K, 64K, 128K\n      - Sequential write and sequential read specific:\n        - Number of parallel jobs: 1, 2, 4, 8, 14, 16, 32, 64, 128 (skipping entries \u003e max_open_zones)\n        - QD: 1\n        - ioengine: psync\n      - Random read specific:\n        - QD: 1, 2, 4, 8, 14, 16, 32, 64, 128\n        - ioengine: io_uring\n\n    For reads the drive is prepared with a write. The ZBD is reset before each\n    run.\n\n  - Generated csv output file is fio_zone_throughput_avg_lat.csv\n    1. avg_lat_us: Average latency in µs for the specific run.\n    2. throughput_MiBs: Throughput in MiB/s for the specific run.\n    3. 
clat_p1_us - clat_p100us: completion latency percentiles in µs.\n\n  - Generates multiple graphs that plot the behavior of throughput and latency.\n\n### usenix_atc_2021_zns_eval\n  Executes RocksDB's db_bench according to the RocksDB evaluation section\n  (5.2 RocksDB) of the paper '[ZNS: Avoiding the Block Interface Tax for\n  Flash-based SSDs](https://www.pdl.cmu.edu/PDL-FTP/Storage/USENIX_ATC_2021_ZNS.pdf)'.\n\n  Different benchmarks are run depending on whether the specified drive is a\n  ZNS or a conventional device.\n  - For conventional devices the db_bench workload is run on the following\n    filesystems:\n        - xfs\n        - f2fs\n  - For ZNS devices the db_bench workload is run on the f2fs filesystem and\n    with the ZenFS RocksDB plugin without an additional filesystem.\n\n  Note: the tests are designed to run on 2TB devices.\n\n### sysbench\n  Executes a sysbench workload within a percona-server MyRocks installation.\n  For conventional devices, the default filesystem is xfs, whereas for\n  ZBD devices the benchmark is by default issued through ZenFS, the\n  RocksDB plugin which enables direct access to zoned storage.\n  If `-x btrfs` is supplied, the benchmark will run on zoned or\n  conventional devices with btrfs as the filesystem.\n\n  The benchmark will first bulk-load the drive with a database of about 800GB.\n  10 million `db-entries` correspond to ~2GB of capacity.\n  With `200.000.000 table-size * 20 tables = 4000M db-entries` the resulting\n  database size is 800GB.\n  After that, the following oltp workloads are run for 30 minutes each, in the\n  given order:\n  - oltp_update_index.lua\n  - oltp_update_non_index.lua\n  - oltp_delete.lua\n  - oltp_write_only.lua\n  - oltp_insert.lua\n  - oltp_read_write.lua\n  - oltp_read_only.lua\n\nAdvanced Data Analysis using SQLite\n----------------------\nBenchmarks can implement collection of their CSV report into a SQLite database.\nSee 
`data_collector/sqlite_data_collector.py`\n\nThe database file `data-collection.sqlite3` will be created/modified in the\ngiven output directory (by default `zbdbench_results`).\n\nThe database design is kept simple. Each ZBDBench benchmarking run\ncreates an entry in the `zbdbench_run` table, which collects general system\ninformation.\nEach ZBDBench run can generate multiple results that are collected in a\nbenchmark-specific table (e.g. `fio_zone_throughput_avg_lat`).\n\nTODO: Add graph for the database layout\n\nTo connect your SQLite DB with Excel, you need to install the\nMySQL ODBC connector https://dev.mysql.com/downloads/connector/odbc/ .\n\nOn macOS, also install iODBC http://www.iodbc.org/dataspace/doc/iodbc/wiki/iodbcWiki/Downloads .\nCopy /usr/local/mysql-connector-odbc-8.0.12-macos10.13-x86-64bit to\n/Library/ODBC and adjust /Library/ODBC/odbcinst.ini\nhttps://stackoverflow.com/questions/52896893/macos-connector-mysql-odbc-driver-could-not-be-loaded-in-excel-for-mac-2016 .\n\nIn the 'ODBC Data Source Administrator' a 'User DSN' needs to be created with the\nfollowing keywords and values:\n```\nSERVER \u003cIP\u003e\nNO_SCHEMA 1\n```\n\nWithin Excel in the 'Data' tab you can 'Get Data' 'From Database (Microsoft Query)' with\nthe specified 'User DSN' and the following query:\n\n```\nSELECT * FROM fio_zone_throughput_avg_lat INNER JOIN zbdbench_run ON fio_zone_throughput_avg_lat.zbdbench_run_id = zbdbench_run.id;\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohnramsden%2Fzbdbench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjohnramsden%2Fzbdbench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohnramsden%2Fzbdbench/lists"}