{"id":19112128,"url":"https://github.com/ocaml-multicore/multicore-bench","last_synced_at":"2025-04-30T22:05:34.084Z","repository":{"id":215703439,"uuid":"739595290","full_name":"ocaml-multicore/multicore-bench","owner":"ocaml-multicore","description":"Framework for benchmarking on multiple cores on current-bench","archived":false,"fork":false,"pushed_at":"2025-01-21T16:09:38.000Z","size":924,"stargazers_count":13,"open_issues_count":1,"forks_count":1,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-03-30T20:23:16.962Z","etag":null,"topics":["work-in-progress"],"latest_commit_sha":null,"homepage":"","language":"OCaml","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"isc","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ocaml-multicore.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-06T00:36:04.000Z","updated_at":"2025-02-19T10:39:56.000Z","dependencies_parsed_at":"2024-01-06T01:48:25.043Z","dependency_job_id":"deb064b1-5c12-4ea4-a7c7-1378bb1b65dd","html_url":"https://github.com/ocaml-multicore/multicore-bench","commit_stats":null,"previous_names":["ocaml-multicore/multicore-bench"],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ocaml-multicore%2Fmulticore-bench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ocaml-multicore%2Fmulticore-bench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ocaml-multicore%2Fmulticore-bench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/ocaml-multicore%2Fmulticore-bench/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ocaml-multicore","download_url":"https://codeload.github.com/ocaml-multicore/multicore-bench/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249565250,"owners_count":21292427,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["work-in-progress"],"created_at":"2024-11-09T04:31:39.993Z","updated_at":"2025-04-18T23:33:25.822Z","avatar_url":"https://github.com/ocaml-multicore.png","language":"OCaml","funding_links":[],"categories":[],"sub_categories":[],"readme":"[API reference](https://ocaml-multicore.github.io/multicore-bench/doc/multicore-bench/Multicore_bench/index.html)\n\u0026middot;\n[Benchmarks](https://bench.ci.dev/ocaml-multicore/multicore-bench/branch/main?worker=pascal\u0026image=bench.Dockerfile)\n\n# Multicore-bench\n\nMulticore bench is a framework for writing multicore benchmark executables to\nrun locally on your computer and on\n[current-bench](https://github.com/ocurrent/current-bench).\n\nBenchmarking multicore algorithms tends to require a certain amount of setup,\nsuch as spawning domains, synchronizing them before work, timing the work,\ncollecting the times, and joining domains, that this framework tries to take\ncare of for you as conveniently as possible. Furthermore, benchmarking multicore\nalgorithms in OCaml also involves a number of pitfalls related to how the OCaml\nruntime works. 
For example, when only a single domain is running, several\noperations provided by the OCaml runtime use specialized implementations that\ntake advantage of the fact that there is only a single domain running. In most\ncases, when trying to benchmark multicore algorithms, you don't actually want to\nmeasure those specialized runtime implementations.\n\nThe design of multicore bench is considered **_experimental_**. We are planning\nto improve the design along with\n[current-bench](https://github.com/ocurrent/current-bench) in the future to\nallow a more useful benchmarking experience.\n\n## Crash course to [current-bench](https://github.com/ocurrent/current-bench)\n\nNote that, at the time of writing this,\n[current-bench](https://github.com/ocurrent/current-bench) is a work in progress\nand does not accept enrollment for community projects. However, assuming you\nhave access to it, to run multicore benchmarks with\n[current-bench](https://github.com/ocurrent/current-bench) a number of things\nneed to be set up:\n\n- You will need a [Makefile](Makefile) with a `bench` target at the root of the\n  project. The [current-bench](https://github.com/ocurrent/current-bench)\n  service will run your benchmarks through that.\n\n- You likely also want to have a [bench.Dockerfile](bench.Dockerfile) and\n  [.dockerignore](.dockerignore) at the root of the project. 
Make sure that the\n  Dockerfile is layered such that it will pick up opam updates when desired while\n  also avoiding unnecessary work during rebuilds.\n\n- You will also need the benchmarks, and that is where this framework may help.\n  You can find examples of multicore benchmarks from the\n  [Saturn](https://github.com/ocaml-multicore/saturn/tree/main/bench),\n  [Kcas](https://github.com/ocaml-multicore/kcas/tree/main/bench), and\n  [Picos](https://github.com/ocaml-multicore/picos/tree/main/bench) projects and\n  from the [bench](bench) directory of this repository.\n\nFor multicore benchmarks you will also need to have\n[current-bench](https://github.com/ocurrent/current-bench) configured to use a\nmulticore machine, which currently needs to be done by the\n[current-bench](https://github.com/ocurrent/current-bench) maintainers.\n\n## Example: Benchmarking `Atomic.incr` under contention\n\nLet's look at a simple example with detailed comments of how one might benchmark\n`Atomic.incr` under contention.\n\nNote that this example is written here as an\n[MDX](https://github.com/realworldocaml/mdx) document or test. 
Normally you\nwould write a benchmark as a command line executable and would likely compile it\nin release mode with a native compiler.\n\nWe first open the\n[`Multicore_bench`](https://ocaml-multicore.github.io/multicore-bench/doc/multicore-bench/Multicore_bench/index.html)\nmodule:\n\n```ocaml\n# open Multicore_bench\n```\n\nThis brings into scope multiple modules including\n[`Suite`](https://ocaml-multicore.github.io/multicore-bench/doc/multicore-bench/Multicore_bench/Suite/index.html),\n[`Util`](https://ocaml-multicore.github.io/multicore-bench/doc/multicore-bench/Multicore_bench/Util/index.html),\n[`Times`](https://ocaml-multicore.github.io/multicore-bench/doc/multicore-bench/Multicore_bench/Times/index.html),\nand\n[`Cmd`](https://ocaml-multicore.github.io/multicore-bench/doc/multicore-bench/Multicore_bench/Cmd/index.html)\nthat we use below.\n\nTypically one would divide a benchmark executable into benchmark suites for\ndifferent algorithms and data structures. To illustrate that pattern, let's\ncreate a module `Bench_atomic` for our benchmark suite on atomics:\n\n```ocaml\n# module Bench_atomic : sig\n    (* The entrypoint to a suite is basically a function.  There is a type\n       alias for the signature. *)\n    val run_suite : Suite.t\n  end = struct\n    (* [run_one] runs a single benchmark with the given budget and number of\n       domains. *)\n    let run_one ~budgetf ~n_domains () =\n      (* We scale the number of operations using [Util.iter_factor], which\n         depends on various factors such as whether we are running on a 32- or\n         64-bit machine, using a native or bytecode compiler, and whether we are\n         running on multicore OCaml.  The idea is to make it possible to use the\n         benchmark executable as a test that can be run even on slow CI\n         machines. *)\n      let n = 10 * Util.iter_factor in\n\n      (* In this example, [atomic] is the data structure we are benchmarking. 
*)\n      let atomic =\n        Atomic.make 0\n        |\u003e Multicore_magic.copy_as_padded\n        (* We explicitly pad the [atomic] to avoid false sharing.  With false\n           sharing measurements are likely to have a lot of noise that makes\n           it difficult to get useful results. *)\n      in\n\n      (* We store the number of operations to perform in a scalable countdown\n         counter.  The idea is that we want all the workers or domains to work\n         at the same time as much as possible, because we want to measure\n         performance under contention.  So, instead of e.g. simply having each\n         domain run a fixed count loop, which could lead to some domains\n         finishing well before others, we let the number of operations performed\n         by each domain vary. *)\n      let n_ops_to_do =\n        Countdown.create ~n_domains ()\n      in\n\n      (* [init] is called on each domain before [work].  The return value of\n         [init] is passed to [work]. *)\n      let init _domain_index =\n        (* It doesn't matter that we set the countdown counter multiple times.\n           We could also use a [before] callback to do setup before [work]. *)\n        Countdown.non_atomic_set n_ops_to_do n\n      in\n\n      (* [work] is called on each domain and the time it takes is recorded.\n         The second argument comes from [init]. *)\n      let work domain_index () =\n        (* Because we are benchmarking operations that take a very small amount\n           of time, we run our own loop to perform the operations.  This has\n           pros and cons.  One con is that the loop overhead will be part of the\n           measurement, which is something to keep in mind when interpreting the\n           results.  One pro is that this gives more flexibility in various\n           ways. *)\n        let rec work () =\n          (* We try to allocate some number of operations to perform. 
*)\n          let n = Countdown.alloc n_ops_to_do ~domain_index ~batch:100 in\n          (* If we got zero, then we should stop. *)\n          if n \u003c\u003e 0 then begin\n            (* Otherwise we perform the operations and try again. *)\n            for _=1 to n do\n              Atomic.incr atomic\n            done;\n            work ()\n          end\n        in\n        work ()\n      in\n\n      (* [config] is a name for the configuration of the benchmark.  In this\n         case we distinguish by the number of workers or domains. *)\n      let config =\n        Printf.sprintf \"%d worker%s\" n_domains\n          (if n_domains = 1 then \"\" else \"s\")\n      in\n\n      (* [Times.record] does the heavy lifting to spawn domains and measure\n         the time [work] takes on them. *)\n      let times = Times.record ~budgetf ~n_domains ~init ~work () in\n\n      (* [Times.to_thruput_metrics] takes the measurements and produces both a\n         metric for the time of a single operation and for the total thruput\n         over all the domains. *)\n      Times.to_thruput_metrics ~n ~singular:\"incr\" ~config times\n\n    (* [run_suite] runs the benchmarks in this suite with the given budget. *)\n    let run_suite ~budgetf =\n      (* In this case we run the benchmark with various number of domains. We\n         use [concat_map] to collect the results as a flat list of outputs. *)\n      [ 1; 2; 4; 8 ]\n      |\u003e List.concat_map @@ fun n_domains -\u003e\n         run_one ~budgetf ~n_domains ()\n  end\nmodule Bench_atomic : sig val run_suite : Suite.t end\n```\n\nWe then collect all the suites into an association list. 
The association list\nhas a name and entry point for each suite:\n\n```ocaml\n# let benchmarks = [\n    (\"Atomic\", Bench_atomic.run_suite)\n  ]\nval benchmarks : (string * Suite.t) list = [(\"Atomic\", \u003cfun\u003e)]\n```\n\nUsually the list of benchmarks is in the main module of the benchmark executable\nalong with an invocation of\n[`Cmd.run`](https://ocaml-multicore.github.io/multicore-bench/doc/multicore-bench/Multicore_bench/Cmd/index.html#val-run):\n\n```ocaml non-deterministic\n# Cmd.run ~benchmarks ~argv:[||] ()\n{\n  \"results\": [\n    {\n      \"name\": \"Atomic\",\n      \"metrics\": [\n        {\n          \"name\": \"time per incr/1 worker\",\n          \"value\": 11.791,\n          \"units\": \"ns\",\n          \"trend\": \"lower-is-better\",\n          \"description\": \"Time to process one incr\",\n          \"#best\": 9.250000000000002,\n          \"#mean\": 12.149960000000002,\n          \"#median\": 11.791,\n          \"#sd\": 1.851061543655424,\n          \"#runs\": 25\n        },\n        {\n          \"name\": \"incrs over time/1 worker\",\n          \"value\": 84.81044864727335,\n          \"units\": \"M/s\",\n          \"trend\": \"higher-is-better\",\n          \"description\": \"Total number of incrs processed\",\n          \"#best\": 108.1081081081081,\n          \"#mean\": 84.25129565093134,\n          \"#median\": 84.81044864727335,\n          \"#sd\": 12.911113376793846,\n          \"#runs\": 25\n        },\n        // ...\n      ]\n    }\n  ]\n}\n- : unit = ()\n```\n\nBy default\n[`Cmd.run`](https://ocaml-multicore.github.io/multicore-bench/doc/multicore-bench/Multicore_bench/Cmd/index.html#val-run)\ninterprets command line arguments from\n[`Sys.argv`](https://v2.ocaml.org/api/Sys.html#VALargv). Unlike what one would\ntypically do, we explicitly specify `~argv:[||]`, because this code is being run\nthrough the [MDX](https://github.com/realworldocaml/mdx) tool.\n\nNote that the output above is just a sample. 
The timings are non-deterministic\nand will vary slightly from one run of the benchmark to another even on a single\ncomputer.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Focaml-multicore%2Fmulticore-bench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Focaml-multicore%2Fmulticore-bench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Focaml-multicore%2Fmulticore-bench/lists"}