{"id":15032562,"url":"https://github.com/bytecodealliance/sightglass","last_synced_at":"2025-04-12T18:51:56.355Z","repository":{"id":37406764,"uuid":"169305958","full_name":"bytecodealliance/sightglass","owner":"bytecodealliance","description":"A benchmark suite and tool to compare different implementations of the same primitives.","archived":false,"fork":false,"pushed_at":"2025-02-27T16:58:41.000Z","size":30213,"stargazers_count":73,"open_issues_count":36,"forks_count":33,"subscribers_count":17,"default_branch":"main","last_synced_at":"2025-04-03T22:07:41.411Z","etag":null,"topics":["benchmark","benchmarking","rust","rust-lang"],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bytecodealliance.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE-APACHE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-02-05T20:14:50.000Z","updated_at":"2025-02-27T16:58:45.000Z","dependencies_parsed_at":"2023-07-13T23:16:55.325Z","dependency_job_id":"a40639bb-0f20-4216-b135-f51628f87474","html_url":"https://github.com/bytecodealliance/sightglass","commit_stats":{"total_commits":265,"total_committers":15,"mean_commits":"17.666666666666668","dds":0.5622641509433962,"last_synced_commit":"8bc0d50e8de8ebbdd41a6312713730f218eefcf3"},"previous_names":["fastly/sightglass"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bytecodealliance%2Fsightglass","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bytecodealliance%2Fsightglass/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bytecodealliance%2Fsightglass/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bytecodealliance%2Fsightglass/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bytecodealliance","download_url":"https://codeload.github.com/bytecodealliance/sightglass/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248618225,"owners_count":21134200,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","benchmarking","rust","rust-lang"],"created_at":"2024-09-24T20:18:44.067Z","updated_at":"2025-04-12T18:51:56.308Z","avatar_url":"https://github.com/bytecodealliance.png","language":"C","readme":"\u003cdiv align=\"center\"\u003e\n  \u003ch1\u003e\u003ccode\u003esightglass\u003c/code\u003e\u003c/h1\u003e\n\n  \u003cp\u003e\n    \u003cstrong\u003eA benchmarking suite and tooling for Wasmtime and Cranelift\u003c/strong\u003e\n  \u003c/p\u003e\n\n  \u003cstrong\u003eA \u003ca href=\"https://bytecodealliance.org/\"\u003eBytecode Alliance\u003c/a\u003e project\u003c/strong\u003e\n\n  \u003cp\u003e\n    \u003ca href=\"https://github.com/bytecodealliance/sightglass/actions?query=workflow%3ACI\"\u003e\u003cimg src=\"https://github.com/bytecodealliance/sightglass/workflows/sightglass/badge.svg\" alt=\"build status\" /\u003e\u003c/a\u003e\n    \u003ca href=\"https://bytecodealliance.zulipchat.com/#narrow/stream/217126-wasmtime\"\u003e\u003cimg src=\"https://img.shields.io/badge/zulip-join_chat-brightgreen.svg\" alt=\"zulip chat\" /\u003e\u003c/a\u003e\n    \u003cimg src=\"https://img.shields.io/badge/rustc-stable+-green.svg\" alt=\"supported rustc stable\" /\u003e\n  \u003c/p\u003e\n\n  \u003ch3\u003e\n    \u003ca href=\"https://github.com/bytecodealliance/sightglass/blob/main/CONTRIBUTING.md\"\u003eContributing\u003c/a\u003e\n    \u003cspan\u003e | \u003c/span\u003e\n    \u003ca href=\"https://bytecodealliance.zulipchat.com/#narrow/stream/217126-wasmtime\"\u003eChat\u003c/a\u003e\n  \u003c/h3\u003e\n\u003c/div\u003e\n\n## About\n\nThis repository contains benchmarking infrastructure for Wasmtime and Cranelift,\nas described in [this\nRFC](https://github.com/bytecodealliance/rfcs/blob/main/accepted/benchmark-suite.md).\nIn particular, it has\n\n* a benchmark suite of Wasm applications in `benchmarks/*`, and\n\n* a benchmark runner CLI tool to record, analyze, and display benchmark results\n  in `crates/cli/*`.\n\nWe plan to implement a server that periodically runs benchmarks as new commits\nare pushed to Wasmtime and display the history of those benchmark results,\nsimilar to Firefox's [Are We Fast Yet?](https://arewefastyet.com). However, this\nwork is not completed yet. See [issue\n93](https://github.com/bytecodealliance/sightglass/issues/93) for details.\n\nResults are always broken down by *phase* \u0026mdash; compilation vs. instantiation\nvs. execution \u0026mdash; for each program in the suite. This allows us to reason\nabout, for example, compiler performance separately from its generated code\nquality. How all this works together:\n\n- each __benchmark__ is compiled to a `benchmark.wasm` module that calls two host functions,\n  `bench.start` and `bench.end`, to notify Sightglass of the portion of the execution to measure\n  (see the [benchmarks README])\n- we build an __engine__ (e.g., Wasmtime) as a shared library that implements the [bench API]; the\n  Sightglass infrastructure uses this to measure each phase (see an [engine README])\n- the __`sightglass-cli`__ tool runs __benchmarks__ using the __engines__ and emits __measurements__\n  for each phase; this is configurable, e.g., by various measurement mechanisms, various output\n  formats, different aggregations, etc.\n\n[benchmarks README]: benchmarks/README.md\n[engine README]: engines/wasmtime/README.md\n\n## This is *NOT* a General-Purpose WebAssembly Benchmark Suite\n\nThis benchmark suite and tooling is specifically designed for Wasmtime and\nCranelift, as explained in [the benchmarking suite\nRFC](https://github.com/bytecodealliance/rfcs/blob/main/accepted/benchmark-suite.md#nongoal-creating-a-general-purpose-webassembly-benchmark-suite):\n\n\u003e It is also worth mentioning this explicit non-goal: we do not intend to\n\u003e develop a general-purpose WebAssembly benchmark suite, used to compare between\n\u003e different WebAssembly compilers and runtimes. We don't intend to trigger a\n\u003e WebAssembly benchmarking war, reminiscent of JavaScript benchmarking wars in\n\u003e Web browsers. Doing so would make the benchmark suite's design high stakes,\n\u003e because engineers would be incentivized to game the benchmarks, and would\n\u003e additionally impose cross-engine portability constraints on the benchmark\n\u003e runner. We only intend to compare the performance of various versions of\n\u003e Wasmtime and Cranelift, where we don't need the cross-engine portability in\n\u003e the benchmark runner, and where gaming the benchmarks isn't incentivized.\n\u003e\n\u003e Furthermore, general-purpose WebAssembly benchmarking must include WebAssembly\n\u003e on the Web. Doing that well requires including interactions with the rest of\n\u003e the Web browser: JavaScript, rendering, and the DOM. Building and integrating\n\u003e a full Web browser is overkill for our purposes, and represents significant\n\u003e additional complexity that we would prefer to avoid.\n\nEven if someone did manage to get other Wasm engines hooked into this\nbenchmarking infrastructure, comparing results across engines would likely be\ninvalid. The `wasmtime-bench-api` intentionally does things that will likely\nhurt its absolute performance numbers but which help us more easily get\nstatistically meaningful results, like randomizing the locations of heap\nallocations. Without taking great care to level the playing field with respect\nto these sorts of tweaks, as well as keeping an eye on all engine specific\nconfiguration options, you'll end up comparing apples and oranges.\n\n## Usage\n\nYou can always see all subcommands and options via\n\n```\ncargo run -- help\n```\n\nThere are flags to control how many different processes we spawn and take\nmeasurements from, how many iterations we perform for each process, etc...\n\nThat said, here are a couple of typical usage scenarios.\n\n### Building the Runtime Engine for Wasmtime\n```\n$ cd engines/wasmtime \u0026\u0026 rustc build.rs \u0026\u0026 ./build \u0026\u0026 cd ../../\n```\n\n### Running the Default Benchmark Suite\n\n```\n$ cargo run -- benchmark --engine engines/wasmtime/libengine.so\n```\n\nThis runs all benchmarks listed in [`default.suite`](benchmarks/default.suite).\nThe output will be a summary of each benchmark program's compilation,\ninstantiation, and execution times.\n\n### Running a Single Wasm Benchmark\n\n```\n$ cargo run -- benchmark --engine engines/wasmtime/libengine.so -- path/to/benchmark.wasm\n```\n\nAppend multiple `*.wasm` paths to the end of that command to run multiple\nbenchmarks.\n\n### Running All Benchmarks\n\n```\n$ cargo run -- benchmark --engine engines/wasmtime/libengine.so -- benchmarks/all.suite\n```\n\n`*.suite` files contain relative paths of a list of benchmarks to run. This is a\nconvenience for organizing benchmarks but is functionally equivalent to listing\nall `*.wasm` paths at the end of the `benchmark` command.\n\n### Comparing a Feature Branch to Main\n\nFirst, build `libwasmtime_bench_api.so` (or `.dylib` or `.dll` depending on your\nOS) for the latest `main` branch:\n\n```\n$ cd ~/wasmtime\n$ git checkout main\n$ cargo build --release -p wasmtime-bench-api\n$ cp target/release/libwasmtime_bench_api.so /tmp/wasmtime_main.so\n```\n\nThen, checkout your feature branch and build its `libwasmtime_bench_api.so`:\n\n```\n$ git checkout my-feature\n$ cargo build --release -p wasmtime-bench-api\n```\n\nFinally, run the benchmarks and supply both versions of\n`libwasmtime_bench_api.so` via repeated use of the `--engine` flag:\n\n```\n$ cd ~/sightglass\n$ cargo run -- \\\n    benchmark \\\n    --engine /tmp/wasmtime_main.so \\\n    --engine ~/wasmtime/target/release/libwasmtime_bench_api.so \\\n    -- \\\n    benchmarks/all.suite\n```\n\nThe output will show a comparison between the `main` branch's results and your\nfeature branch's results, giving you an effect size and confidence interval\n(i.e. \"we are 99% confident that `my-feature` is 1.32x to 1.37x faster than\n`main`\" or \"there is no statistically significant difference in performance\nbetween `my-feature` and `main`\") for each benchmark Wasm program in the suite.\n\nAs you make further changes to your `my-feature` branch, you can execute this\ncommand whenever you want new, updated benchmark results:\n\n```\n$ cargo build --manifest-path ~/wasmtime/Cargo.toml --release -p wasmtime-bench-api \u0026\u0026 \\\n    cargo run --manifest-path ~/sightglass/Cargo.toml -- \\\n      benchmark \\\n      --engine /tmp/wasmtime_main.so \\\n      --engine ~/wasmtime/target/release/libwasmtime_bench_api.so \\\n      -- \\\n      benchmarks/all.suite\n```\n\n### Collecting Different Kinds of Results\n\nSightglass comes enabled with several different kinds of measurement mechanisms\n\u0026mdash; a _measure_.  The default _measure_ is `cycles`, which simply measures\nthe elapsed duration of CPU cycles for each phase (e.g., using `RDTSC`). The\naccuracy of this _measure_ is documented [here](crates/recorder/README.md) but\nnote that measuring using CPU cycles alone can be problematic (e.g., CPU\nfrequency changes, context switches, etc.).\n\nSeveral _measures_ can be configured using the `--measure` option:\n- `cycles`: the number of CPU cycles elapsed\n- `perf-counters`: a selection of common `perf` counters (CPU cycles,\n  instructions retired, cache accesses, cache misses); only available on Linux\n- `vtune`: record each phase as a VTune task for analysis; see [this help\n  documentation](docs/vtune.md) for more details\n- `noop`: no measurement is performed\n\nFor example, run:\n\n```\n$ cargo run -- benchmark --measure perf-counters ...\n```\n\n### Getting Raw JSON or CSV Results\n\nIf you don't want the results to be summarized and displayed in a human-readable\nformat, you can get raw JSON or CSV via the `--raw` flag:\n\n```\n$ cargo run -- benchmark --raw --output-format csv -- benchmark.wasm\n```\n\nThen you can use your own R/Python/spreadsheets/etc. to analyze and visualize\nthe benchmark results.\n\n### Adding a New Benchmark\n\nAdd a Dockerfile under `benchmarks/\u003cyour benchmark\u003e` building a Wasm file that\nbrackets the work to measure with the `bench.start` and `bench.end` host calls.\nSee the [benchmarks README] for a fuller set of requirements and the\n[`build.sh`] script for building this file.\n\n[`build.sh`]: benchmarks/build.sh\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbytecodealliance%2Fsightglass","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbytecodealliance%2Fsightglass","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbytecodealliance%2Fsightglass/lists"}