{"id":13875948,"url":"https://github.com/gz/autoperf","last_synced_at":"2026-02-11T10:33:37.902Z","repository":{"id":34647282,"uuid":"66655484","full_name":"gz/autoperf","owner":"gz","description":"Simplify the use of performance counters.","archived":false,"fork":false,"pushed_at":"2022-04-25T03:01:15.000Z","size":4618,"stargazers_count":62,"open_issues_count":2,"forks_count":3,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-11-24T03:31:34.395Z","etag":null,"topics":["intel","perf","performance","performance-counters","performance-metrics","performance-monitoring","performance-visualization","x86","x86-64"],"latest_commit_sha":null,"homepage":"https://docs.rs/autoperf","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-08-26T14:47:48.000Z","updated_at":"2024-11-09T14:19:12.000Z","dependencies_parsed_at":"2022-08-08T01:16:04.560Z","dependency_job_id":null,"html_url":"https://github.com/gz/autoperf","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gz%2Fautoperf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gz%2Fautoperf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gz%2Fautoperf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gz%2Fautoperf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gz","download_url":"https://codeload.github.com/gz/autoperf/tar.gz/refs/heads/master","host":{"na
me":"GitHub","url":"https://github.com","kind":"github","repositories_count":227961891,"owners_count":17847836,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["intel","perf","performance","performance-counters","performance-metrics","performance-monitoring","performance-visualization","x86","x86-64"],"created_at":"2024-08-06T06:00:52.013Z","updated_at":"2026-02-11T10:33:37.895Z","avatar_url":"https://github.com/gz.png","language":"Rust","readme":"[![Build Status](https://travis-ci.org/gz/autoperf.svg)](https://travis-ci.org/gz/autoperf)\n[![Crates.io](https://img.shields.io/crates/v/autoperf.svg)](https://crates.io/crates/autoperf) \n[![docs.rs/autoperf](https://docs.rs/autoperf/badge.svg)](https://docs.rs/crate/autoperf/)\n\n\n# autoperf\n\nautoperf simplifies the instrumentation of programs with performance\ncounters on Intel machines. Rather than learning how to measure every\nevent and manually programming event values in counter registers or perf, you\ncan use autoperf, which will repeatedly run your program until it has measured\nevery single performance event on your machine. 
autoperf tries to compute a\nschedule that maximizes the number of events measured per run, and\nminimizes the total number of runs while avoiding multiplexing of events on\ncounters.\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"https://gz.github.io/autoperf/doc/intro.svg\" width=\"90%\"\u003e\n\u003c/p\u003e\n\n\u003cbr /\u003e\n\u003cimg align=\"right\" src=\"https://gz.github.io/autoperf/doc/counters_vs_events.png\" width=\"45%\"\u003e\n\n## Background\n\nPerformance monitoring units typically distinguish between performance events and counters. \nEvents refer to observations on the micro-architectural level \n(e.g., a TLB miss, a page walk, etc.), whereas counters are hardware registers that \ncount the occurrences of events. The figure on the right shows the number of different \nobservable events for different Intel micro-architectures. Note that current systems \nprovide a very large choice of possible events to monitor. The number of measurable \ncounters per PMU is limited (typically from two to eight). For example, if the same \nevents are measured on all PMUs on a SkylakeX (Xeon Gold 5120) machine, we can only \nobserve a maximum of 48 different events (without sampling). autoperf simplifies the process \nof fully measuring and recording every performance event for a given program.\nIn our screen session above, recorded on a SkylakeX machine with ~3500 distinct events, \nwe can see how autoperf automatically runs a program 1357 times while measuring and recording \na different set of events in every run.\n\u003cbr clear=\"right\"/\u003e\n\n# Installation\n\nautoperf is known to work with Ubuntu 18.04 on Skylake and\nIvyBridge/SandyBridge architectures. All Intel architectures should work;\nplease file a bug report if yours doesn't. 
autoperf builds on `perf` from the\nLinux project and a few other libraries that can be installed using:\n\n```\n$ sudo apt-get update\n$ sudo apt-get install likwid cpuid hwloc numactl util-linux\n```\n\nTo run the example analysis scripts, you'll need these python3 libraries:\n```\n$ pip3 install ascii_graph matplotlib pandas argparse numpy\n```\n\nYou'll also need the *nightly version* of the Rust compiler, which is \nbest installed using rustup:\n```\n$ curl https://sh.rustup.rs -sSf | sh -s -- -y --default-toolchain nightly\n$ source $HOME/.cargo/env\n```\n\nautoperf is published on crates.io, so once you have Rust and Cargo installed, \nyou can get it directly from there:\n```\n$ cargo +nightly install autoperf\n```\n\nOr alternatively, clone and build the repository yourself:\n```\n$ git clone https://github.com/gz/autoperf.git\n$ cd autoperf\n$ cargo build --release\n$ ./target/release/autoperf --help\n```\n\nautoperf uses perf internally to interface with Linux and the performance\ncounter hardware. perf recommends that the following settings be disabled.\nTherefore, autoperf checks the values of these configurations and refuses to\nstart if they are not set as below:\n```\nsudo sh -c 'echo 0 \u003e /proc/sys/kernel/kptr_restrict'\nsudo sh -c 'echo 0 \u003e /proc/sys/kernel/nmi_watchdog'\nsudo sh -c 'echo -1 \u003e /proc/sys/kernel/perf_event_paranoid'\n```\n\n# Usage\n\nautoperf has a few commands; use `--help` to get a better overview of all the\noptions.\n\n## Profiling\n\nThe **profile** command instruments a single program by running it multiple times\nuntil every performance event is measured. For example,\n```\n$ autoperf profile sleep 2\n```\nwill repeatedly run `sleep 2` while measuring different performance events \nwith performance counters every time. 
Once completed, you will find an `out`\nfolder with many CSV files that contain measurements from individual runs.\n\n## Aggregating results\n\nTo combine all those runs into a single CSV result file, you can use the\n**aggregate** command: \n```\n$ autoperf aggregate ./out\n``` \nThis will do some sanity checking and produce a `results.csv` \n([reduced example](../master/doc/results.csv)) file, which contains \nall the measured data.\n\n## Analyze results\n\nPerformance events are measured individually on every core (and other\nmonitoring units). The `timeseries.py` script can aggregate events by taking the\naverage, stddev, min, max, etc. and produce a time-series matrix ([see a\nreduced example](../master/doc/timeseries.csv)).\n\n```\npython3 analyze/profile/timeseries.py ./out\n```\n\nNow you have all the data, so you can start asking some questions. As an\nexample, the following script tells you how events were correlated\nwhen your program was running:\n\n```\n$ python3 analyze/profile/correlation.py ./out\n$ open out/correlation_heatmap.png\n```\n\nEvent correlation for the `autoperf profile sleep 2` command\nabove looks like this (every dot represents the correlation of the timeseries \nbetween two measured performance events; this is from a Skylake machine with\naround 1700 non-zero event measurements):\n![Correlation Heatmap](/doc/correlation_heatmap.png)\n\nYou can look at individual events too:\n```\npython3 analyze/profile/event_detail.py --resultdir ./out --features AVG.OFFCORE_RESPONSE.ALL_RFO.L3_MISS.REMOTE_HIT_FORWARD\n```\n![Plot events](/doc/perf_event_plot.png)\n\nThere are more scripts in the `analyze` folder to better work with the captured \ndatasets. Have a look.\n\n## What do I use this for?\n\nautoperf allows you to quickly gather lots of performance (or training) data and\nreason about it quantitatively. 
For example, we initially developed autoperf to\nbuild ML classifiers that the Barrelfish scheduler could use for detecting\napplication slowdown and making better scheduling decisions. autoperf can gather\nthat data to generate such classifiers without requiring domain knowledge about \nevents, aside from how to measure them.\n\nYou can read more about our experiments here:\n\n* https://dl.acm.org/citation.cfm?id=2967360.2967375 \n* https://www.research-collection.ethz.ch/handle/20.500.11850/155854\n\nLast but not least, autoperf can potentially be useful in many other scenarios:\n * Find out what performance events are relevant for your workload\n * Analyze and find performance issues in your code or across different versions of your code\n * Generate classifiers to detect hardware exploits (side channels/spectre/meltdown etc.)\n * ...\n","funding_links":[],"categories":["Rust","performance"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgz%2Fautoperf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgz%2Fautoperf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgz%2Fautoperf/lists"}