{"id":28391824,"url":"https://github.com/morganstanley/xpedite","last_synced_at":"2026-03-04T07:33:25.010Z","repository":{"id":45886382,"uuid":"139175809","full_name":"morganstanley/Xpedite","owner":"morganstanley","description":"A non-sampling profiler purpose built to measure and optimize performance of C++ low latency/real time systems","archived":false,"fork":false,"pushed_at":"2026-02-17T18:16:20.000Z","size":525209,"stargazers_count":182,"open_issues_count":1,"forks_count":49,"subscribers_count":21,"default_branch":"main","last_synced_at":"2026-02-17T23:33:02.529Z","etag":null,"topics":["benchmarking","cpu-profiler","intel","jupyter","jupyter-notebook","low-latency","perf-events","performance","performance-counters","profiler","real-time","ultra-low-latency"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/morganstanley.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE.md","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2018-06-29T17:07:46.000Z","updated_at":"2026-02-17T18:16:26.000Z","dependencies_parsed_at":"2024-06-10T16:01:59.275Z","dependency_job_id":"e1aedcdb-f476-4143-a401-e483edcfbfdb","html_url":"https://github.com/morganstanley/Xpedite","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/morganstanley/Xpedite","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/morganstanley%2FXpedite","tags_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/morganstanley%2FXpedite/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/morganstanley%2FXpedite/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/morganstanley%2FXpedite/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/morganstanley","download_url":"https://codeload.github.com/morganstanley/Xpedite/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/morganstanley%2FXpedite/sbom","scorecard":{"id":1243525,"data":{"date":"2026-02-17T18:16:49Z","repo":{"name":"github.com/morganstanley/Xpedite","commit":"cda490449d9058a7bf14128e57031f72b0e0b981"},"scorecard":{"version":"v5.1.1","commit":"cd152cb6742c5b8f2f3d2b5193b41d9c50905198"},"score":4.4,"checks":[{"name":"Maintained","score":3,"reason":"4 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 3","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#maintained"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#packaging"}},{"name":"Code-Review","score":10,"reason":"all changesets reviewed","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#code-review"}},{"name":"Token-Permissions","score":0,"reason":"detected 
GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: .github/workflows/main.yml:1","Info: topLevel permissions set to 'read-all': .github/workflows/scorecards.yml:18","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#token-permissions"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":9,"reason":"binaries present in source code","details":["Warn: binary detected: jni/jar/javassist.jar:1"],"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#binary-artifacts"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#cii-best-practices"}},{"name":"Vulnerabilities","score":0,"reason":"14 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: PYSEC-2021-856 / GHSA-5545-2q6w-2gh6","Warn: Project is vulnerable to: GHSA-6p56-wp2h-9hxr","Warn: Project is vulnerable to: PYSEC-2019-108 / GHSA-9fq2-x9r6-wfmf","Warn: Project is vulnerable to: PYSEC-2021-857 / GHSA-f7c7-j99h-c22f","Warn: Project is vulnerable to: GHSA-fpfv-jqm9-f5jm","Warn: Project is 
vulnerable to: PYSEC-2021-140 / GHSA-9w8r-397f-prfh","Warn: Project is vulnerable to: PYSEC-2016-32 / GHSA-fff8-4w9p-7v76","Warn: Project is vulnerable to: PYSEC-2023-117 / GHSA-mrwq-x4v8-fh7p","Warn: Project is vulnerable to: PYSEC-2021-141 / GHSA-pq64-v7f5-gqh8","Warn: Project is vulnerable to: PYSEC-2024-44 / GHSA-h5cg-53g7-gqjw","Warn: Project is vulnerable to: PYSEC-2013-22 / GHSA-27x4-j476-jp5f","Warn: Project is vulnerable to: PYSEC-2025-49 / GHSA-5rjg-fvgr-3xxf","Warn: Project is vulnerable to: GHSA-cx63-2mw6-8hw5","Warn: Project is vulnerable to: PYSEC-2022-43012 / GHSA-r9hx-vwmv-q579"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#vulnerabilities"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE.md:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE.md:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#license"}},{"name":"Pinned-Dependencies","score":3,"reason":"dependency not pinned by hash detected -- score normalized to 3","details":["Info: Possibly incomplete results: error parsing shell code: \"foo(\" must be followed by ): CMakeLists.txt:0","Info: Possibly incomplete results: error parsing shell code: \"foo(\" must be followed by ): jni/CMakeLists.txt:0","Info: Possibly incomplete results: error parsing shell code: \"foo(\" must be followed by ): vivify/CMakeLists.txt:0","Warn: GitHub-owned GitHubAction 
not pinned by hash: .github/workflows/main.yml:27: update your workflow using https://app.stepsecurity.io/secureworkflow/morganstanley/Xpedite/main.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/main.yml:30: update your workflow using https://app.stepsecurity.io/secureworkflow/morganstanley/Xpedite/main.yml/main?enable=pin","Warn: containerImage not pinned by hash: .github/actions/Dockerfile:1: pin your Docker image by updating quay.io/pypa/manylinux2010_x86_64 to quay.io/pypa/manylinux2010_x86_64@sha256:3b5eb5ab9bc73b93740ae4eda9962078951cd3bb7efe837b770360e78c7366fb","Warn: pipCommand not pinned by hash: .github/workflows/main.yml:70","Info:   3 out of   5 GitHub-owned GitHubAction dependencies pinned","Info:   1 out of   1 third-party GitHubAction dependencies pinned","Info:   0 out of   1 containerImage dependencies pinned","Info:   0 out of   1 pipCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#pinned-dependencies"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":6,"reason":"branch protection is not maximal on development and all release branches","details":["Info: 'allow deletion' disabled on branch 'main'","Info: 'force pushes' disabled on branch 'main'","Info: 'branch protection settings apply to administrators' is required to merge on branch 'main'","Info: 'stale review dismissal' is required to merge on branch 'main'","Warn: required approving review count is 1 on branch 'main'","Warn: codeowners review is required - but no 
codeowners file found in repo","Info: 'last push approval' is required to merge on branch 'main'","Warn: no status checks found to merge onto branch 'main'","Info: PRs are required in order to make changes on branch 'main'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#branch-protection"}},{"name":"Dependency-Update-Tool","score":0,"reason":"no update tool detected","details":["Warn: no dependency update tool configurations found"],"documentation":{"short":"Determines if the project uses a dependency update tool.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#dependency-update-tool"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 30 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#sast"}},{"name":"Security-Policy","score":10,"reason":"security policy file detected","details":["Info: security policy file detected: github.com/morganstanley/.github/SECURITY.md:1","Info: Found linked content: github.com/morganstanley/.github/SECURITY.md:1","Info: Found disclosure, vulnerability, and/or timelines in security policy: github.com/morganstanley/.github/SECURITY.md:1","Info: Found text in security policy: github.com/morganstanley/.github/SECURITY.md:1"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#security-policy"}},{"name":"Contributors","score":0,"reason":"project has 0 contributing companies or organizations -- score normalized to 
0","details":null,"documentation":{"short":"Determines if the project has a set of contributors from multiple organizations (e.g., companies).","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#contributors"}},{"name":"CI-Tests","score":2,"reason":"4 out of 14 merged PRs checked by a CI test -- score normalized to 2","details":null,"documentation":{"short":"Determines if the project runs tests before pull requests are merged.","url":"https://github.com/ossf/scorecard/blob/cd152cb6742c5b8f2f3d2b5193b41d9c50905198/docs/checks.md#ci-tests"}}]},"last_synced_at":"2026-02-17T23:33:59.011Z","repository_id":45886382,"created_at":"2026-02-17T23:33:59.011Z","updated_at":"2026-02-17T23:33:59.011Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30075436,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-04T05:31:57.858Z","status":"ssl_error","status_checked_at":"2026-03-04T05:31:38.462Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmarking","cpu-profiler","intel","jupyter","jupyter-notebook","low-latency","perf-events","performance","performance-counters","profiler","real-time","ultra-low-latency"],"created_at":"2025-05-31T10:38:32.010Z","updated_at":"2026-03-04T07:33:24.985Z","avatar_url":"https://github.com/morganstanley.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Apache License](https://img.shields.io/badge/license-Apache-yellow.svg)](https://raw.githubusercontent.com/Morgan-Stanley/Xpedite/master/LICENSE.md)\n[![Build Status](https://github.com/morganstanley/Xpedite/actions/workflows/main.yml/badge.svg?branch=main)](https://github.com/morganstanley/Xpedite/actions/workflows/main.yml)\n[![codecov](https://codecov.io/gh/morganstanley/Xpedite/branch/main/graph/badge.svg?token=25wU3VJLky)](https://codecov.io/gh/MorganStanley/Xpedite)\n\n# Xpedite\n\nA non-sampling profiler, purpose built to measure and optimise, performance of ultra-low-latency / real time systems.\n\nThe main features include\n  \n  1. **Targeted Profiling** - Quantify how efficiently a section of code, runs in any Intel CPU and how much head room is left for further optimizations.\n  2. **PMU counters** - Capture hundreds of processor specific performance counters like cache/TLB misses, CPU Stalls, NUMA remote access, context switches, etc.\n  3. 
**Cycles accounting** - Find the percentage of retiring vs stalled cpu cycles with [Topdown micro architecture analysis](https://ieeexplore.ieee.org/document/6844459).\n  4. **Optimization heuristics** - Narrow down cpu bottlenecks with potential for maximum application speed up.\n  5. **Analytics \u0026 visualization** - Provides a [Jupyter](http://jupyter.org) shell for interactive drill down and visualization of performance metrics and bottlenecks.\n  6. **Regression detection** - Benchmark multiple releases/builds side-by-side, to detect and prevent regression across releases.\n\n# Why yet another profiler ?\nXpedite grew in the world of automated low latency trading systems, where latency directly translates to profitability. Such trading systems typically never relinquish the cpu, but rather spin in a tight loop, always looking for external events. When an event is eventually detected, the engine needs to react within a few microseconds.\n    \nIn cases where events occur less frequently, the amount of time spent waiting far exceeds the time spent by the critical path reacting. Profiling such low-latency systems becomes a real challenge with off the shelf sampling profilers like linux perf or Intel vtune.\n    \nSampling profilers, as the name implies, will sample timestamps and performance counters from the cpu, based on some counter firing frequency. For trading systems, such a sample set will be dominated by samples from the wait loop, rather than from the critical path, which reacts to events. \n    \nDue to this, sampling profilers typically end up profiling the least interesting part of the code, while ignoring the critical path. What we really need is an intrusive profiler, which can target and collect samples only during execution of the desired critical path(s). 
Xpedite is primarily built to optimize real time systems of the nature described above.\n\n# Quick Start\n\n|section                                            |description                                                             |\n|---------------------------------------------------|:-----------------------------------------------------------------------|\n|[Building](#building)                              |Build and install xpedite                                               |\n|[Instrumentation](#instrumentation)                |Instrument C++ programs to identify profiling targets                   |\n|[Profiler Initialisation](#profilerInit)           |Enable profiling by initialising profiling framework                    |\n|[Profiling](#profiling)                            |Attach to a live process and collect performance statistics             |\n|[Xpedite Shell](#xpediteShell)                     |A shell to explore performance statistics                               |\n|[Benchmarking](#benchmarking)                      |View profiles of multiple builds side by side                           |\n|[Hardware performance counters](#pmc)              |Collect and visualise cpu hardware performance counters                 |\n|[Cycle Accounting](#cycleAccounting)               |Cycle accounting using topdown micro architecture analysis methodology  |\n|[Collaboration](#collaboration)                    |Collaborate in troubleshooting performance bottlenecks                  |\n|[Quality of life features](#qolFeatures)           |Lazy to write your own profile info ? why not auto generate it ?        |\n|[Support](#support)                                |Need help ?                                                             
|\n|[Acknowledgements](#acknowledgements)              |Thanks to our contributors                                              |\n\n## Building \u003ca name=\"building\"\u003e\u003c/a\u003e\n\nTo build xpedite, you will need a linux machine (kernel 2.5 or later) running on intel hardware with the following packages.\n  1. [cmake](http://cmake.org/) (3.4 or later)\n  2. [GNU gcc](https://gcc.gnu.org/) (5.2 or later)\n  3. [python 3](https://www.python.org/downloads/)\n  4. [pybind11](https://github.com/pybind/pybind11)\n  5. [python 3 venv](https://docs.python.org/3/library/venv.html)\n\nWith the above installed, to use xpedite, clone this repository and run the following from xpedite source dir:\n\n```\n$ ./build.sh      # builds xpedite c++ library\n\n$ ./install.sh    # creates a virtual environment and installs xpedite python dependencies\n\n$ alias xpedite=\"PATH=`pwd`/install/runtime/bin `pwd`/scripts/bin/xpedite\" # Adds an alias for xpedite to shell\n\n```\n\nThe build process will produce a static library, `libxpedite.a`, which can be linked into a C++ executable \n(if you want to use xpedite in a position independent executable, the build produces a different static library to use, `libxpedite-pie.a`).\n\nIn addition, the build process will produce a demo program, `xpediteDemo`. This program is a hello world example of - How to profile a c++ program with xpedite.\n  \nThe demo source also serves as working example of xpedite instrumentation and profiling. To see it in action, run ```demo/demo.sh```\n\n\n## Instrumentation \u003ca name=\"instrumentation\"\u003e\u003c/a\u003e\n\nXpedite is an intrusive probe based profiler. 
Profiling starts with careful instrumentation of application code with probes.\nLet's consider how to instrument a simple C++ program.\n\n```c++\n  #include \u003ciostream\u003e\n  #include \u003cxpedite/framework/Probes.H\u003e\n\n  void eat()   { std::cout \u003c\u003c \"eat...\"   \u003c\u003c std::endl; }\n  void sleep() { std::cout \u003c\u003c \"sleep...\" \u003c\u003c std::endl; }\n  void code()  { std::cout \u003c\u003c \"code...\"  \u003c\u003c std::endl; }\n\n  void life(int timeToLive_) {\n    for(unsigned i=0; i\u003ctimeToLive_; ++i) {\n      XPEDITE_TXN_SCOPE(Life);\n      eat();\n\n      XPEDITE_PROBE(SleepBegin);\n      sleep();\n\n      XPEDITE_PROBE(CodeBegin);\n      code();\n    }\n  }\n```\n\nFirst identify sections of code, that are of interest.  In the above program, we would like to profile the following\n\n1. Total time taken by each iteration of the for loop\n2. A break up of time spent in eat(), sleep() and code() for each iteration.\n\nWe instrument the code by inserting probes (```XPEDITE_TXN_SCOPE```,  ```XPEDITE_PROBE```), so that sections of code,  are encapsulated by a pair of probes.\n\n\n## Profiler Initialization \u003ca name=\"profilerInit\"\u003e\u003c/a\u003e\n\nNext to enable profiling, add a call to `xpedite::framework::initialize(...)`, with a file system path as argument.\nThe initialize method when invoked, will store AppInfo (data about instrumented process) in the supplied filesystem path.\n    \nThe AppInfo file forms the link between Xpedite profiler and a running instance of an application.\nGiven a valid AppInfo file, the profiler can, attach and profile the process that created the file.\n\n```c++\n    #include \u003cstdexcept\u003e\n    #include \u003cxpedite/framework/Framework.H\u003e\n\n    int main() {\n      if(!xpedite::framework::initialize(\"/tmp/xpedite-appinfo.txt\")) {\n        throw std::runtime_error {\"failed to init xpedite\"}; \n      }\n      life(100);\n    }\n```\n\nIn the above program, the 
AppInfo is placed at filesystem path \"/tmp/xpedite-appinfo.txt\".\nThe second argument is an optional boolean parameter, which makes the application wait until a profiler attaches to the process.\n\nFinally, to build this simple program, add the two snippets above to a file called ```Life.C``` and run\n```\n$ g++ -pthread -std=c++11 -I \u003cpath-to-xpedite-headers\u003e Life.C -o life -L \u003cpath-to-xpedite-libs\u003e -lxpedite -ldl -lrt\n```\nThe explicit path statements may not be necessary, depending on where/how xpedite has been installed on your system.\n\n## Profiling \u003ca name=\"profiling\"\u003e\u003c/a\u003e\n\nXpedite probes start as 5 byte NOP instructions and have near zero overhead during normal execution.\nThe profiler, when attached to a process, can activate all or a subset of probes, for collection of timing and hardware counters.\nAll profile parameters, including the path to AppInfo, the list of probes etc., are specified in a python module called \"profileInfo.py\".\n\nLet's see an example of a minimalistic profileInfo that can be used to profile the above program.\n\n```python\n  from xpedite import Probe, TxnBeginProbe, TxnEndProbe\n  appName = 'Life'                      # Name of the application\n  appHost = '127.0.0.1'                 # Host, where the application is running\n  appInfo = '/tmp/xpedite-appinfo.txt'  # Path of the appinfo file\n\n  probes = [\n    TxnBeginProbe('Life Begin', sysName = 'LifeBegin'),     # Txn begins with probe 'LifeBegin', marks the beginning of 'eat()'\n    Probe('Code Begin', sysName = 'CodeBegin'),             # Marks the beginning of 'code()'\n    Probe('Sleep Begin', sysName = 'SleepBegin'),           # Marks the beginning of 'sleep()'\n    TxnEndProbe('Life End', sysName = 'LifeEnd'),           # Txn ends with probe 'LifeEnd', marks the end of 'code()'\n  ]\n```\n\nWith the above code stored in the file ```profileInfo.py```, let's build a profile by running the application, followed by attaching xpedite 
to it.\n  \nIf you followed all the steps, listed in the previous sections, by now you would have built a program called \"life\". \nStarting the program from a command line will, print messages similar to ones shown below.\n\n```\n  1 info Tue Mar  6 14:27:13 2018 Listener xpedite [fd - 4 | ip - 0.0.0.0 | port - 0 | mode - NON-Blocking binding to port 0\n  2 info Tue Mar  6 14:27:13 2018 Listener xpedite [fd - 4 | ip - 0.0.0.0 | port - 57474 | mode - NON-Blocking listening for incoming connections 33504\n  3 info Tue Mar  6 14:27:18 2018 xpedite - accepted incoming connection from ip - 127.0.0.1 | port - 64538 | fd - 3 | mode - NON-Blocking\n  ...\n```\n\nTo attach the profiler to the running process, invoke ```xpedite record -p profileInfo.py -H 1``` from console.\nIf everything works, the profiler after collecting profile data from the target process, will start an instance of jupyter.\n  \nOpen the url in console, with a web browser, to see the results.\n\n\n```\nextracting counters for thread 61843 from file /dev/shm/xpedite-life-1520365531-61843.data -\u003e  completed in 0.03 sec.\n...\ngenerating notebook life-2018-03-06-14:45:31.ipynb -\u003e\n completed 5.4 KiB in 0.08 sec.\n [I 14:45:34.331 NotebookApp] Writing notebook server cookie secret to /tmp/xpediteShell0Uu2xn/.local/share/jupyter/runtime/notebook_cookie_secret\n ...\n [I 14:45:34.533 NotebookApp] The IPython Notebook is running at: http://10.198.37.19:8889/\n [I 14:45:34.533 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).\n```\n\n## Xpedite Shell \u003ca name=\"xpediteShell\"\u003e\u003c/a\u003e\n\nBrowsing the url, opens a page with a list of xpedite reports (files with .ipynb extension).\nYour page may contain, just one report as shown below.\n\n![alt text](docs/images/reportHomePage.png \"Xpedite report - Home Page\")\n\nClicking on the ipynb link, opens a interactive shell, with a summary of the profile.\n\n![alt 
text](docs/images/reportShellHeader.png \"Performance Analytics Shell Header\")\n\nBelow the summary, the report will include a histogram for latency distribution, plotting time on the x-axis and transaction count on the y-axis.\n  \nThis visualisation is a good starting point for getting an overview of the application's latency profile.\n\n![alt text](docs/images/reportShellHistogram.png \"Xpedite report - Histogram\")\n\n\nThe shell also supports commands to filter, query and dynamically generate other visualisations to analyze profile data.\n  \nTo run a new command\n\n  1. First add an empty cell by clicking the ```+``` button next to the save file icon or pressing hot key ```b```\n  2. Enter an xpedite shell command or any valid python snippet\n  3. Finally, to run the command, click the play icon or press hot key ```shift + enter```.\n\n  \nFor instance, running xpedite command ```txns()``` will print a table with 100 transactions (as shown below).\n    \n\nBefore you go any further, let's take a moment to get familiar with a core xpedite concept.\n\n#### Transaction\n\nA transaction is any functionality in the program that is a meaningful target for profiling and optimisation.\n  \nA transaction stores data (timestamps and h/w counters) from the collection of probes that got hit during program execution to achieve the functionality.\n  \nThe transactions can be used to generate statistics and visualisations for total end to end latency, and latency between each pair of probes.\n  \nLet's look at how profile info and instrumentation work together to build transactions in the above program.\n  \nThe method ```life()``` is instrumented at three places.\n\n  1. XPEDITE_TXN_SCOPE(Life)\n  2. XPEDITE_PROBE(SleepBegin)\n  3. XPEDITE_PROBE(CodeBegin)\n\nEach instantiation of ```XPEDITE_PROBE``` inserts one named probe (first argument) in the control flow.\n\nProbe scope macros like ```XPEDITE_TXN_SCOPE```, however, insert a pair of named probes. 
The first probe is inserted at the point of instrumentation, while the second one at the end of block scope(RAII).\n  \nWith the above instrumentation in place, each iteration of the loop in ```life()```, will hit the following 4 probes in order.\n\n  1. LifeBegin\n  2. SleepBegin\n  3. CodeBegin\n  4. LifeEnd\n  \nSince we are interested in measuring loop iteration latency, the route ```LifeBegin -\u003e SleepBegin -\u003e CodeBegin -\u003e LifeEnd``` is used to build transactions.\nWe expect to see 100 transactions, one per iteration of the loop.\n  \nThe txns() command obliges by returning the table below with 100 rows (one row per transaction).\nThe columns in the table, matches the expected route, with control flowing from left to right.\nEach cell in a row, shows the latency for execution of, code between the probe for the current column and the next one.\n  \nFrom the table, Let's see, how one can infer latency stats for a transaction. Looking at the first row, we can see the below stats for the first transaction\n  \n  1. Total time             - 618.572 us\n  2. Time spent in eat()    - 615.559 us\n  3. Time spent in sleep()  -   1.670 us\n  4. Time spent in code()   -   1.343 us\n\n![alt text](docs/images/reportShellTxns.png \"Xpedite report - Home Page\")\n\nNow let's do something interesting. Instead of listing all transactions, let's just list outliers (transactions, that took longer than expected)\n  \nThe command ```filter(lambda txn: txn.duration \u003e 10)``` filters and lists transactions that took more than 10 us.\nIn my profile, the shell returned just one transaction (as shown below). \n\nReal production systems generate millions of transactions. 
This command will be quite handy to filter transactions of interest, based on arbitrary criteria.\n\n![alt text](docs/images/reportShellFilter.png \"Xpedite report - Home Page\")\n\nEnough of tables, let's try some visualisation.\n  \n```plot()``` is useful to plot latencies of all transactions in chronological order, with transaction id on the x-axis and time (us) on the y-axis.\n  \nPlots like these are useful to spot patterns or trends in transaction latency over a period of time.\nThese patterns can provide interesting insights into application behaviour, bottlenecks, and the impact of system level events like periodic interrupts.\n  \nIn the chart below, it can be seen that the outlier occurs at the beginning of the profile session. Such a pattern is a symptom of warm up issues in the application.\n  \nIn addition to total time, plots are generated for each pair of probes in the transaction.\nThe drop down menu at the top can be used to pick plots for different sections of code.\n\n![alt text](docs/images/reportShellPlot.png \"Xpedite report - Home Page\")\n\nIn plots with outliers, it's typical for the worst transaction to dominate the scale of the y-axis, hiding details of normal transactions.\n  \n\nIn the above chart, the first transaction hides details of all other transactions.  
Let's filter and plot only transactions that took less than 10 microseconds.\n    \nChaining ```filter()``` with ```plot()``` like this ```filter(lambda txn : txn.duration \u003c 10).plot()``` generates a more useful chart that shows the latency trend across regular transactions.\n\n\n![alt text](docs/images/reportShellFilterPlot.png \"Xpedite report - Home Page\")\n\n```plot()``` is also useful to plot a single transaction, when supplied with a transaction id as argument.\n\nRunning ```plot(4)``` plots a hierarchical breakdown of latency statistics for the fourth transaction, as shown below.\n\n![alt text](docs/images/reportShellPlotTxn.png \"Xpedite report - Home Page\")\n\nThe commands described above are useful to query/visualize profile data at transaction level granularity. Let's now explore how to generate statistics for a group of transactions.\n  \nThe ```stats()``` command generates total and probe level statistics for all the transactions in the profile.\n  \nFor the above program, running ```stats()``` renders a table with 4 rows. 
The statistics for end to end transaction latency are shown in the last row.\nThe rest of the rows provide statistics for different sections of code, between each pair of probes in the transaction.\n\n![alt text](docs/images/reportShellStat.png \"Xpedite report - Home Page\")\n\nThe max, 95%, 99% and standard deviation in the above table are skewed by inclusion of the huge outlier.\nLet's see how to generate statistics for a subset of transactions.\n  \nCombining ```filter()``` and ```stats()``` like this ```filter(lambda txn : txn.txnId \u003e 50).stats()``` computes statistics for all but the first 50 transactions.\nSince the outlier is not included, we can see tighter latency statistics, compared to the previous table.\n\n![alt text](docs/images/reportShellFilterStat.png \"Xpedite report - Home Page\")\n\n## Benchmarking \u003ca name=\"benchmarking\"\u003e\u003c/a\u003e\n\nBenchmarking is useful to compare performance statistics of different runs/builds side by side, to answer questions like the ones stated below.\n\n  1. How effective is an optimisation compared to the unoptimised build ?\n  2. Is the application running faster on a next generation processor ? 
How much speed up or degradation ?\n\nBenchmarks are created by serializing and storing profile data in some file system path, for future comparisons.\nBenchmarks are also useful for keeping a chronicle of all optimisations implemented over a period of time.\n  \nLet's consider how to create a new benchmark for a run of the above described application.\n\nFirst pick a file system path (to store the benchmark) and append a meaningful benchmark name to it.\nThe chosen path can then be used at profile creation time (```xpedite record -b``` | ```xpedite record --createBenchmark```), to create a new benchmark.\n  \nFor instance, running ```xpedite record``` as described in the previous section, with an additional parameter ```-b /tmp/baseline```, creates a new benchmark at ```/tmp/baseline```\n  \nHaving created a baseline benchmark, let's now consider how to use ```/tmp/baseline``` for comparison with a new run.\n  \nA profile can be compared with one or more benchmarks, by providing a list of paths in the ```benchmarkPaths``` parameter of profileInfo.\nLet's add ```/tmp/baseline``` to the list of ```benchmarkPaths```.\n\n```python\n  from xpedite import Probe, TxnBeginProbe, TxnEndProbe\n  appName = 'Life'                      # Name of the application\n  appHost = '127.0.0.1'                 # Host, where the application is running\n  appInfo = '/tmp/xpedite-appinfo.txt'  # Path of the appinfo file\n\n  probes = [\n    TxnBeginProbe('Life Begin', sysName = 'LifeBegin'),     # Txn begins with probe 'LifeBegin', marks the beginning of 'eat()'\n    Probe('Code Begin', sysName = 'CodeBegin'),             # Marks the beginning of 'code()'\n    Probe('Sleep Begin', sysName = 'SleepBegin'),           # Marks the beginning of 'sleep()'\n    TxnEndProbe('Life End', sysName = 'LifeEnd'),           # Txn ends with probe 'LifeEnd', marks the end of 'sleep()'\n  ]\n\n  benchmarkPaths = [\n    '/tmp/baseline'\n  ]\n```\n\nRecording a new profile with the updated profileInfo, 
will benchmark statistics for the current run against the baseline.\nAs shown below, the histogram for the new run includes the latency distribution of the baseline, in addition to the current run.\n\n![alt text](docs/images/reportShellBenchmarkHistogram.png \"Xpedite report - Benchmark Histogram\")\n\nSide by side comparisons are also generated in ```plot()``` results. \n\n![alt text](docs/images/reportShellBenchmarkFilterPlot.png \"Xpedite report - Benchmark filter plot\")\n\nThe ```stat()``` command goes one step further and generates side by side comparisons for each section of code (each table cell).\nThe stats for the current run are highlighted with reference to a benchmark. Improvements are highlighted in green and degradations in red.\nThe difference between the current run and a benchmark is also given in parentheses.\n\nIn reports with multiple benchmarks, the reference benchmark can be changed by using the drop down at the top of the table.\n\n![alt text](docs/images/reportShellBenchmarkStat.png \"Xpedite report - Benchmark statistics\")\n\n\n## Hardware performance counters \u003ca name=\"pmc\"\u003e\u003c/a\u003e\n\nXpedite uses the Linux perf events api to program and collect hardware performance counters in the cpu.\nTo enable this feature, ensure cpu level event access is permitted for the current user running Xpedite.\nCpu level event access for all users can be enabled by setting /proc/sys/kernel/perf_event_paranoid to a value \u003c=0.\nMore details can be found in the perf_event_paranoid section of the Linux kernel documentation [here](https://www.kernel.org/doc/Documentation/sysctl/kernel.txt)\n\nXpedite can collect any of the core and offcore hardware performance counters, for bottleneck or topdown analysis.\nThe list of supported counters depends on the processor's micro architecture and can be displayed by running ```xpedite list```\n  \nFor instance, running ```xpedite list``` on my Ivy Bridge server box lists 355 pmc events as shown 
below.\n\n```\nFP_COMP_OPS_EXE.SSE_PACKED_SINGLE                            [0x00004010]      - Number of SSE* or AVX-128 FP Computational packed single-precision uops issued this cycle\nUOPS_DISPATCHED_PORT.PORT_0_CORE                             [0x002001A1]      - Cycles per core when uops are dispatched to port 0\nINST_RETIRED.ANY_P                                           [0x000000C0]      - Number of instructions retired. General Counter   - architectural event\nOFFCORE_RESPONSE.PF_LLC_DATA_RD.LLC_HIT.SNOOP_MISS           [0xB7,0xBB|0x01] - Counts prefetch (that bring data to LLC only) data reads that hit in the LLC and the snoops sent to sibling cores return clean response\n...\n```\n\n\nLet's consider, how to configure xpedite to program and collect data from hardware performance counters. \n\nUp to 8 pmc events can be programmed in modern Intel processors (Sandy Bridge and later), when hyper threading is disabled.\n  \nTo get consistent results, threads under profile, must be pinned to one of the cpu cores, to prevent cpu migrations.\n  \nThe list of pmc events and cpu cores (where the threads are pinned), can be configured in profileInfo as shown below.\n\n```python\nfrom xpedite import TopdownNode, Metric, Event\n\n# List of performance counters to be collected for this profile\npmc = [\n  TopdownNode('Root'),                                      # Top down analysis for Root node of the hierarchy\n  Metric('IPC'),                                            # Instructions retired per cycle\n  Event('kernel cycles',  'CPL_CYCLES.RING0'),              # Cycles spent in the kernel\n  Event('Data L1 Miss',   'L2_RQSTS.DEMAND_DATA_RD_HIT'),   # Demand Data Read requests that hit L2 cache\n  Event('Data L2 Miss',   'L2_RQSTS.DEMAND_DATA_RD_MISS'),  # Demand Data Read requests that miss L2 cache\n]\n\n# List of cpu, where the hardware performance counters will be enabled\ncpuSet = [5]\n```\n\nThe profileInfo (shown above), enables pmc collection at cpu core 5 
```cpuSet = [5]```. \n  \nFirst start the target app (described above), using ```taskset -c 5 ./life``` to pin the main thread to core 5.\n  \nOnce the app is running, run ```xpedite record``` to attach the profiler and generate a report with performance counter data.\n\nThe images below show how xpedite reports are enriched with data collected from hardware performance counters.\nThe overall structure of the report remains the same (as described in previous sections), with more details presented where appropriate.\n\nThe ```txns()``` command generates transactions in a table, similar to the one shown above.\nHowever, hovering the mouse over a table cell reveals a popup showing performance counter data for that section of code.\n\n![alt text](docs/images/reportShellTxnsPmc.png \"Xpedite command - transaction list with performance counter data\")\n\nThe ```stat()``` command generates a statistics table for each of the chosen counters, in addition to the default timing statistics.\nThe tabs at the top of the table can be used to select stats for different counters.\n\n![alt text](docs/images/reportShellStatPmc.png \"Xpedite command - pmc statistics\")\n\nPlotting a transaction generates a new visualisation showing the correlation between performance counters and sections of code.\nFor instance, plotting the 4th transaction in my profile generates the bipartite visualisation below.\nThe counters (each with a distinct color) are shown at the top, while the probes (sections of code) are at the bottom.\n\n![alt text](docs/images/reportShellPlotTxnPmc.png \"Xpedite report - plot txn pmc\")\n\nMoving the mouse over a probe name shows values of all the counters for that section of code.\n\n![alt text](docs/images/reportShellPlotTxnPmcOnProbe.png \"Xpedite report - plot txn pmc on probe\")\n\nAlternatively, moving the mouse over a counter name shows the value of that counter at different sections of code.\n\n![alt text](docs/images/reportShellPlotTxnPmcOnCounter.png \"Xpedite report - plot txn pmc on 
counter\")\n\n\n## Cycle Accounting \u003ca name=\"cycleAccounting\"\u003e\u003c/a\u003e\n\nXpedite supports cycle accounting and bottleneck analysis using \n[top-down micro architecture analysis methodology](https://ieeexplore.ieee.org/document/6844459).\n  \nThe topdown hierarchy for any micro architecture can be rendered to console with the ```xpedite topdown``` command.\nNodes that need more than 8 general purpose counters are not supported and are highlighted in red.\n  \nThe rest of the nodes can be used directly with the ```pmc``` parameter in profileInfo.\nFor instance, ```TopdownNode('Root')``` will provide a breakdown of the following immediate children.\n  \n  1. Retiring\n  2. BadSpeculation\n  3. FrontendBound\n  4. BackendBound\n\nGiven a topdown node, xpedite automatically resolves and enables the required performance counters, and generates statistics for topdown nodes in addition to performance counters.\n\n![alt text](docs/images/cmdTopdown.png \"Xpedite command - topdown\")\n\nXpedite also supports a few predefined metrics like IPC, CPI etc.  The list of supported metrics can be displayed with the ```xpedite metrics``` command.\n\n## Collaboration \u003ca name=\"colloboration\"\u003e\u003c/a\u003e\n\nXpedite facilitates collaboration, by making it easy for developers to share, profile and analyze results with other developers.\n  \nThe jupyter notebook (along with transaction data and visualisations) can be bundled into one tar file (```.tar.xp```) using the ```xpedite shell zip``` command.\n\nConversely, an xpedite tar file can be reopened using the ```xpedite shell unzip``` command, with the file name as parameter.\n\n\n## Quality of life features \u003ca name=\"qolFeatures\"\u003e\u003c/a\u003e\n\n#### Auto generating profileInfo module\n\nFor applications instrumented with hundreds of probes, it can become a chore to hand code all the probes in the profileInfo module. 
\n  \nXpedite provides a handy command, ```xpedite generate```, that can locate probes in an instrumented process to generate the profileInfo module.\n\nFor instance, a profileInfo for the program (described above) can be generated using the command ```xpedite generate -a /tmp/xpedite-appinfo.txt```.\n  \nThe ```profileInfo.py``` module is created in the current working directory.  The comments in the generated file are also useful as documentation for the various profileInfo parameters.\n\n#### Snippets for commonly used commands\n\nThe Xpedite shell generates a drop down at the top right corner (shown below), with commonly used shell commands.\nThis is quite handy if you can't recollect a command or the exact syntax for using one.\nClicking on an option adds a new cell, with the command in place ready for execution.\n\n![alt text](docs/images/reportShellSnippets.png \"Xpedite report - Snippets\")\n\n#### Profiling from a remote host\n\nFor production systems, it is good practice to run xpedite on a host different from the production host, to avoid any potential interference.\n  \nXpedite supports profiling applications running on any remote host, as long as TCP connectivity is permitted.\nRemote profiling can be enabled by setting the appropriate remote host name in the ```appHost``` parameter of the profileInfo module.\n\n## Support \u003ca name=\"support\"\u003e\u003c/a\u003e\nNeed Help ? 
Get in touch with Xpedite developers via email - msperf@morganstanley.com\n\n## Acknowledgements \u003ca name=\"acknowledgements\"\u003e\u003c/a\u003e\n\nXpedite was envisioned and developed by **[Manikandan Dhamodharan](http://www.linkedin.com/in/mani-d)**.\n  \n\nThanks to **[Brooke Elizabeth Cantwell](https://www.linkedin.com/in/brookecantwell)** and **[Dhruv Shekhawat](http://www.linkedin.com/in/dhruvshekhawat)** for jupyter integration and test cases.\n  \n\nSpecial thanks to **Dileep Perchani** and **Kevin Elliott** for leading the Xpedite open source initiative.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmorganstanley%2Fxpedite","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmorganstanley%2Fxpedite","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmorganstanley%2Fxpedite/lists"}