{"id":18627478,"url":"https://github.com/yugr/uinit","last_synced_at":"2025-10-07T15:44:03.635Z","repository":{"id":129419507,"uuid":"74732919","full_name":"yugr/uInit","owner":"yugr","description":"Instructions on obtaining stable benchmarks results on modern Linux distro","archived":false,"fork":false,"pushed_at":"2023-09-02T12:50:02.000Z","size":6,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-25T09:52:59.254Z","etag":null,"topics":["benchmarking","init","jitter","ubuntu"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yugr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-11-25T06:49:10.000Z","updated_at":"2023-04-03T17:10:23.000Z","dependencies_parsed_at":"2024-11-07T04:42:47.404Z","dependency_job_id":"4e7215b3-2871-4808-9650-71e21e98908b","html_url":"https://github.com/yugr/uInit","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yugr%2FuInit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yugr%2FuInit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yugr%2FuInit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yugr%2FuInit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yugr","download_url":"https://codeload.github.com/yugr/uInit/tar.gz/
refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248347467,"owners_count":21088658,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmarking","init","jitter","ubuntu"],"created_at":"2024-11-07T04:42:33.260Z","updated_at":"2025-10-07T15:44:03.616Z","avatar_url":"https://github.com/yugr.png","language":"Shell","readme":"# What is this?\n\nThis document contains instructions for setting up systems with Ubuntu distros\nfor stable benchmarking.\n\nModern distributions (e.g. Ubuntu or Debian) have a lot of\nrunning daemons or services which eat precious CPU time\n(each context switch costs 2-3K cycles), pollute caches (D$, I$, TLB, BTB, etc.)\nand steal DRAM bandwidth (this is collectively called \"OS jitter\").\nIf that's not enough, hyperthreading and dynamic frequency scaling (DVFS) also add to the jitter\nso in practice you can see up to 5% noise in benchmark runs (e.g. SPEC2000)\nwhich prevents reliable performance comparisons (a typical compiler\noptimization for general-purpose CPU may yield around 2-3% improvement).\n\nNote that recommendations in this guide are for obtaining stable,\nbut not necessarily the fastest or even \"realistic\", measurements.\nAlso they are most applicable to SPEC-like\n(userspace, CPU-bound) benchmarks.\n\nI also don't cover multithreading-specific jitter\n(e.g. 
caused by [false sharing](https://www.reddit.com/r/rust/comments/17z7eha/false_sharing_can_happen_to_you_too/)\nor priority inversion).\n\n# Ways to reduce noise\n\nObviously you should run benchmarks on real hardware\n(not cloud server, virtual machine, Docker or WSL).\n\nTo obtain more or less stable measurements (std. deviation less than 0.5%), you'll also need to\n* disable non-deterministic HW features in BIOS\n* reserve CPU cores for benchmarking (so that OS never touches them)\n* boot in non-GUI mode\n* disable ASLR and other funky OS features which hurt stability\n* execute your benchmark in special high-perf mode (increased priority, etc.)\n\n## Basics\n\nThis section contains some really basic (but sadly often overlooked) recommendations.\n\nIf benchmark prints a lot of output to stdout/stderr,\nit should be redirected to file (or `/dev/null`).\n\nBefore measuring the time prefer to do several warmup runs\nto populate OS file cache and L1i.\n\nFix any random numbers in benchmarks that may influence program flow.\n\nTry to measure all compared versions of benchmark on the same day\n(differences in environment temperatures may toggle slightly different\nhardware frequency scaling).\n\nFinally, avoid running other programs in parallel with benchmark\n(including other benchmarks on separate cores, `top`/`htop`, etc.).\nIn particular, disable cronjobs.\n\n## BIOS settings\n\nDisable frequency scaling (sometimes called \"power save mode\", \"turbo mode\", \"perf boost\", etc.), hyperthreading, HW prefetching, Turbo Boost, Intel Speed Shift, etc. in BIOS settings.\n e.g. 
hyperthreading and frequency scaling; note that disabling them in the kernel won't work for all kernel versions, so BIOS is preferred\n\n## Run in non-GUI mode\n\nTo boot to non-GUI mode on systemd systems see https://linuxconfig.org/how-to-disable-enable-gui-in-ubuntu-22-04-jammy-jellyfish-linux-desktop\n(`sudo init 1` otherwise).\n\n## Disable unnecessary services\n\nEven in non-GUI multi-user mode, modern distros will execute a lot of services\nwhich may distort your measurements.\n\nFor systemd distros a list of active services can be obtained via\n```\n$ systemctl list-units --all\n```\n(look for anything `active` or `waiting`).\n\nYou can identify services which actually cause problems on _your_ system by running\n```\n$ script -qefc 'top -cbd3' \u003e top.log\n# Wait for several hours\n$ grep -A2 CPU top.log | awk '{$1=$2=$3=$4=$5=$6=$7=$8=$10=$11=\"\"; print $0}' | grep -v 'CPU\\|^$' | sed 's/^ *//'\n```\nand then disable them via `sudo systemctl stop ...`.\n\nOn a typical Ubuntu desktop I suggest disabling at least\n```\n$ sudo systemctl stop apt-daily* unattended-upgrades* update-notifier* fwupd* snapd* irqbalance* {systemd-oomd,udisks2,polkit}.service\n```\n\nTODO: avahi-daemon ?\n\n## Reserve cores for benchmarking\n\nAdd to `/etc/default/grub`:\n```\n# rcu_nocbs requires kernel built with CONFIG_RCU_NOCB_CPU and\n# nohz_full requires CONFIG_NO_HZ_FULL\n# (default Ubuntu kernels seem to have both)\nGRUB_CMDLINE_LINUX_DEFAULT=\"nohz_full=8-15 kthread_cpus=0-7 rcu_nocbs=8-15 irqaffinity=0-7 isolcpus=nohz,managed_irq,8-15\"\n```\n(this assumes that cores 0-7 are left to the system and 8-15 are reserved for benchmarks).\n\nThen run\n```\n$ sudo update-grub\n```\nand reboot.\n\nYou can now use `taskset 0xff00 ...` to run benchmarks on reserved cores.\n\nWhen selecting cores to reserve, use `lstopo` to check that\nreserved cores do not share L2/L3 with unreserved ones.\n\nWARNING: if a benchmark runs more threads than isolated cores\n(which often happens to Rust programs because 
they rely on `std::thread::available_parallelism`\nwhich ignores affinity settings)\nNOISE INCREASES 10x (reproduced on Ubuntu 22 kernel 6.8.0-60-generic).\n\nAdditional isolation of cores can be achieved by adding `domain` to\n`isolcpus` setting above but I do not recommend this because\nkernel will then [schedule all benchmark threads on single core](https://serverfault.com/questions/573025/taskset-not-working-over-a-range-of-cores-in-isolcpus)\n([reproduced on Ubuntu 22 and 24](https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2116749)).\n\nThis can be worked around by using `SCHED_FIFO` or `SCHED_RR` policies.\n\n## Use benchmark-friendly scheduling policy\n\n`SCHED_FIFO` scheduling policy may result in fewer interrupts when running benchmarks.\n\nTo enable it, add permissions to change scheduling policy for ordinary users:\n```\n$ sudo setcap cap_sys_nice=ep /usr/bin/chrt\n```\n\nYou can then use `chrt -f 1 ...` to enable `SCHED_FIFO`.\n\n## Fix frequency\n\nSet scaling governor to `performance` via\n```\n# Give CPU startup routines time to settle\n$ sleep 120\n$ echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor\n```\n\nNote that in some situations frequency scaling may still be enforced\nat hardware level to avoid overheating the CPU\n(as e.g. 
in the [infamous AVX-512 case](https://news.ycombinator.com/item?id=23824053)).\n\n## Increase priority\n\nAdd to `/etc/security/limits.conf`:\n```\nUSERNAME soft nice -20\nUSERNAME hard nice -20\n```\nand run benchmarks under `nice -n -20 ...`.\n\n## Disable ASLR\n\nRun benchmarks under `setarch -R ...`.\n\n## Fix start address of stack\n\nOn POSIX systems the memory segment used for the stack is first filled\nwith environment variables, then padded to an OS-specific alignment, and\nfinally the aligned address is used as the start of the program stack.\n\nBecause of this, addresses of the program's local variables\nmay change depending on environment variables even if you've disabled ASLR.\nSome variables, like `$PWD` or `$_`, may vary across benchmark invocations\nand influence results (5% fluctuations are not uncommon for microbenchmarks).\n\nIt is thus strongly recommended to run benchmarks that do not rely on\nthe environment under `env -i`\n(`hyperfine` runs benchmarks with [randomized environment](https://github.com/sharkdp/hyperfine/issues/235)\nfor this reason).\n\n## Fix code layout\n\nDue to [intricacies of modern CPU frontends](https://www.bazhenov.me/posts/2024-02-performance-roulette/)\n(also [here](https://easyperf.net/blog/2018/01/18/Code_alignment_issues))\nsome benchmarks may be sensitive to the particular code layout\n(i.e. offsets of functions and basic blocks)\nproduced by the compiler and linker. This layout can vary due to unrelated changes\n(e.g. [changing order in which object files are passed to linker](https://dl.acm.org/doi/10.1145/1508244.1508275))\nand complicate performance comparisons.\n\nAlthough not a full solution, it's recommended to compile C/C++ code with\n`-falign-functions=64` (`-Z min-function-alignment=64` for Rust).\nLoops may also need to be aligned to fetchline size (via `-falign-loops=N`).\n\n# Running SPEC benchmarks\n\n`taskset`, `nice`, etc. 
can be set in `monitor_specrun_wrapper` in SPEC config:\n```\nmonitor_specrun_wrapper = chrt -f 1 taskset 0xff00 nice -n -20 setarch -R \\$command\n```\n\n# Further work\n\nThe instructions above make it possible to achieve \u003c0.5% noise, which is usually enough in practice.\n\nIf you want to lower this further, here are some suggestions\n(I haven't tried them myself though):\n* turn off network via\n\n    ```\n    # Ubuntu (note that change is permanent)\n    $ nmcli networking off\n\n    # Didn't work for me\n    $ sudo /etc/init.d/networking stop\n    $ systemctl stop networking.service\n    ```\n* [boot to single-user mode](https://askubuntu.com/questions/132965/how-do-i-boot-into-single-user-mode-from-grub)\n  - this would also disable networking\n* run from ramdisks\n* examine various platform settings in https://www.spec.org/cpu2006/flags/\n* experiment with performance-related BIOS settings\n* enable Huge Pages in kernel\n* use `numactl` to control NUMA affinity\n* disable thread migration\n* disable returning memory to system in Glibc\n  - `export MALLOC_MMAP_MAX_=0 MALLOC_ARENA_MAX=1 MALLOC_TRIM_THRESHOLD_=-1`\n* disable irqbalancer (`IRQBALANCE_BANNED_CPULIST`)\n* disable watchdogs\n  - `nmi_watchdog=0 nowatchdog nosoftlockup`\n* disable kernel hardening (`pti=off`, etc.)\n\n# Additional readings\n\n* Technical Itch: Reducing system jitter: [part 1](https://epickrram.blogspot.com/2015/09/reducing-system-jitter.html)\n  and [part 2](https://epickrram.blogspot.com/2015/11/reducing-system-jitter-part-2.html)\n* [Interference-free Operating System](https://arxiv.org/abs/2412.18104)\n* [LinuxCNC latency and jitter improvements with PREEMPT_RT kernel parameter 
tuning](https://dantalion.nl/2024/09/29/linuxcnc-latency-jitter-kernel-parameter-tuning.html)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyugr%2Fuinit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyugr%2Fuinit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyugr%2Fuinit/lists"}