# Linux Scheduling Visualization

![Build and Test SchedViz](https://github.com/google/schedviz/workflows/Build%20and%20Test%20SchedViz/badge.svg)

This is not an officially supported Google product.

SchedViz is a tool for gathering and visualizing kernel scheduling traces on
Linux machines. It helps to:

*   quantify task starvation due to round-robin queueing,
*   identify primary antagonists stealing work from critical threads,
*   determine when core allocation choices yield unnecessary waiting,
*   evaluate different scheduling policies,
*   and much more.

To learn more about what SchedViz can do and what problems it can solve, read
our
[announcement post](https://opensource.googleblog.com/2019/10/understanding-scheduling-behavior-with.html)
on the Google open source blog.

## Running SchedViz

To get started, clone this repo:

```bash
git clone https://github.com/google/schedviz.git
```

SchedViz requires *yarn*. Closely follow the installation instructions on the
[yarn website](https://www.yarnpkg.com). Make sure you also have a recent
version of Node.js (>= 10.9.0) installed.
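As a quick sanity check of the Node.js requirement, here is a minimal sketch; the version string is hard-coded for illustration — on a real machine you would substitute `$(node --version)`:

```shell
# Compare an installed Node.js version against the README's minimum (10.9.0).
# NODE_VERSION is a stand-in; use "$(node --version)" in practice.
NODE_VERSION="v12.22.0"
MIN="10.9.0"
CURRENT="${NODE_VERSION#v}"   # strip the leading "v"

# sort -V orders version strings; if MIN sorts first, CURRENT >= MIN.
if [ "$(printf '%s\n%s\n' "$MIN" "$CURRENT" | sort -V | head -n 1)" = "$MIN" ]; then
  echo "Node.js $CURRENT satisfies >= $MIN"
else
  echo "Node.js $CURRENT is too old; need >= $MIN" >&2
fi
```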
Building SchedViz also requires the GNU build tools and the unzip utility. On
Debian, for example, these dependencies can be installed by executing:

```bash
sudo apt-get update && sudo apt-get install build-essential unzip
```

To set up SchedViz, run the following commands:

```bash
cd schedviz # The location where the repo was cloned
yarn install
```

Once yarn has finished, run the following command in the root of the repo to
start the server:

```bash
yarn bazel run server -- -- -storage_path="Path to a folder to store traces in"
```

The server binary takes several options:

| Name           | Type   | Description |
| -------------- | ------ | ----------- |
| storage_path   | String | Required.<br>The folder where trace data is/will be stored.<br>This should be an empty folder that is not used by anything else. |
| cache_size     | Int    | Optional.<br>The maximum number of collections to keep open in memory at once. |
| port           | Int    | Optional.<br>The port to run the server on.<br>Defaults to 7402. |
| resources_root | String | Optional.<br>The folder where the static files (e.g. HTML and JavaScript) are stored.<br>Defaults to "client".<br>If using bazel to run the server, you shouldn't need to change this. |

To load SchedViz, go to http://localhost:7402/collections

## Manually collecting a scheduling trace

1.  Run the [trace.sh](util/trace.sh) script as root on the machine that you
    want to collect a trace from:

    ```bash
    sudo ./trace.sh -out 'Path to directory to save trace in' \
                    -capture_seconds 'Number of seconds to record a trace' \
                    [-buffer_size 'Size of the trace buffer in KB'] \
                    [-copy_timeout 'Time to wait for copying to finish']
    ```

    The default time to wait for copying to finish is `5` seconds. The copying
    of the raw traces from Ftrace to the output file won't finish automatically,
    because the raw trace pipes aren't closed; this setting is a timeout for
    when the copying should stop. It should be at least as long as it takes to
    copy out all the trace files. If you see incomplete traces, try increasing
    the timeout.

    The default buffer size is `4096` KB.

    The shell script collects the `sched_switch`, `sched_wakeup`,
    `sched_wakeup_new`, and `sched_migrate_task` tracepoints.

    > NOTE: There is also a binary version of the trace collector script, which
    > can collect traces larger than the size of the buffer.
    >
    > To build it, run `bazel build util:trace` from the root of the repo.
    >
    > To run it, run `sudo bazel-bin/util/trace`. It takes the same arguments as
    > the shell script, except `-copy_timeout`.

2.  Copy the generated tar.gz file off of the trace machine to a machine that
    can access the SchedViz UI. The tar.gz file will be located at
    `$OUTPUT_PATH/trace.tar.gz`.

3.  Upload the tar.gz file to SchedViz by clicking the "Upload Trace" button on
    the SchedViz collections page.
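Putting the three steps together, here is a dry-run sketch of the workflow. The hostname and output path are placeholders, and the snippet only assembles and prints the commands rather than executing them — collection itself must run as root on the target machine:

```shell
# Dry-run sketch of the manual collection workflow (placeholders throughout).
TRACE_HOST="trace-machine.example.com"   # hypothetical trace machine
OUT="/tmp/schedviz-out"                  # hypothetical output directory

# Step 1: record 10 seconds with an 8192 KB buffer and a 15 s copy timeout.
COLLECT="sudo ./trace.sh -out $OUT -capture_seconds 10 -buffer_size 8192 -copy_timeout 15"

# Step 2: copy the archive to a machine that can reach the SchedViz UI.
COPY="scp $TRACE_HOST:$OUT/trace.tar.gz ."

printf '%s\n%s\n' "$COLLECT" "$COPY"
# Step 3 is manual: upload trace.tar.gz via the "Upload Trace" button.
```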
## Collecting a scheduling trace on a GCE machine

Using [gcloud](https://cloud.google.com/sdk/gcloud/) you can easily collect a
trace on a remote GCE machine. We've written a
[helper script](util/gcloud_trace.sh) to collect a trace on a GCE machine with a
single command. This script is a wrapper around the manual
[trace script](util/trace.sh).

Usage:

```bash
bash ./gcloud_trace.sh -instance 'GCE Instance Name' \
                       -trace_args 'Arguments to forward to trace script' \
                       [-project 'GCP Project Name'] \
                       [-zone 'GCP Project Zone'] \
                       [-script 'Path to trace script']
```

## Analyzing a trace

Take a look at our [features and usage walkthrough](doc/walkthrough.md).

## Keyboard Shortcuts

Key                 | Description
------------------- | ----------------------------------------------
`?` (`Shift` + `/`) | Show a shortcut cheatsheet dialog
`Shift` + `c`       | Copy the current tooltip text to the clipboard
`Shift` + `a`       | Reset zoom level in heatmap viewport
`Shift` + `x`       | Clear the CPU filter

## Common sources of errors

### Errors collecting traces

*   When the trace takes longer to copy than the timeout passed to the trace
    collection script (the default is five seconds), traces will not be fully
    copied into the output file. To fix this, increase the copy timeout to a
    sufficiently large value.

*   If a CPU's trace buffer fills up before the timeout is reached, recording
    stops for that CPU. Other CPUs may keep recording until their buffers fill
    up, but only the events occurring up to the last event in the first buffer
    to fill up will be used.
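For the buffer-overflow failure, a back-of-the-envelope sizing sketch can help pick a `-buffer_size` value; the event rate and per-event size below are illustrative assumptions, not measured values:

```shell
# Rough per-CPU buffer sizing for -buffer_size (all inputs are assumptions).
EVENTS_PER_SEC=20000    # assumed scheduling events per second per CPU
BYTES_PER_EVENT=64      # assumed average trace record size in bytes
CAPTURE_SECONDS=10

NEEDED_KB=$(( EVENTS_PER_SEC * BYTES_PER_EVENT * CAPTURE_SECONDS / 1024 ))
echo "suggested -buffer_size: ${NEEDED_KB} KB (default is 4096 KB)"
```

With these assumptions the suggestion comes out to 12500 KB, roughly three times the 4096 KB default.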
### Errors loading traces

SchedViz infers CPU and thread state information that isn't directly attested in
the trace, and fails aggressively when this inference does not succeed. In
particular, two factors may increase the likelihood of inference failures:

*   PID reuse. Machines with small PID address spaces, or long traces, may
    experience PID reuse, where two separate threads are assigned the same PID.
    This can lead to inference failures: for example, a thread last seen on one
    CPU could appear on another without an intervening migration.
*   Event ordering. Scheduling trace interpretation relies on an accurate total
    ordering of the scheduling events; event timestamps are generated by the
    emitting CPU's `rdtsc`. If the cores of a machine do not have their TSCs
    tightly aligned, events may appear to slip, which can lead to inference
    errors as above.

Events can also be written with incorrect timestamps if the kernel is
interrupted while it is recording an event, and the interrupt tries to write
another event. This results in many events appearing to have the same timestamp
when they shouldn't. This type of error occurs when recording high-frequency
events such as the workqueue family, and very rarely when recording only
scheduling events.
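As a quick sanity check for the ordering problems described above, you can test whether timestamps in a textual event dump are nondecreasing; the file and its one-timestamp-per-line format here are hypothetical stand-ins for whatever dump you extract from a trace:

```shell
# Hypothetical dump: one event timestamp per line, with one out-of-order entry.
printf '1000\n1001\n1001\n999\n' > /tmp/timestamps.txt

# sort -C (check quietly) exits nonzero if the input is not already sorted,
# which here would indicate out-of-order events.
if sort -n -C /tmp/timestamps.txt; then
  echo "timestamps are nondecreasing"
else
  echo "out-of-order timestamps detected"
fi
```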