{"id":18073753,"url":"https://github.com/slmt/google-load-parser","last_synced_at":"2025-10-12T03:17:50.418Z","repository":{"id":85640320,"uuid":"178815901","full_name":"SLMT/google-load-parser","owner":"SLMT","description":"A parser to extract the CPU usage timeline from Google's cluster-usage trace data.","archived":false,"fork":false,"pushed_at":"2022-09-24T09:28:56.000Z","size":15,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-08-25T11:59:42.092Z","etag":null,"topics":["cluster","google","monitoring","parser","resources"],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SLMT.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-04-01T08:11:13.000Z","updated_at":"2022-09-24T09:28:58.000Z","dependencies_parsed_at":null,"dependency_job_id":"ba0903a8-1f6d-46db-b0a6-f8de06165ba3","html_url":"https://github.com/SLMT/google-load-parser","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/SLMT/google-load-parser","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SLMT%2Fgoogle-load-parser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SLMT%2Fgoogle-load-parser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SLMT%2Fgoogle-load-parser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SLMT%2Fgoogle-load-parser/manifests
","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SLMT","download_url":"https://codeload.github.com/SLMT/google-load-parser/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SLMT%2Fgoogle-load-parser/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279010175,"owners_count":26084691,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-12T02:00:06.719Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cluster","google","monitoring","parser","resources"],"created_at":"2024-10-31T10:09:11.088Z","updated_at":"2025-10-12T03:17:50.402Z","avatar_url":"https://github.com/SLMT.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Google Load Parser\n\nThis program is designed to parse and transfer the resource usage files downloaded from [Google cluster-usage traces][1] into the timeline sequences of CPU usage for each machine. Note that we only need the files in `task_usage` directory of the trace.\n\n## Requirements\n\nIn order to build the source, you need `cargo`, which can be downloaded via [rustup][2], the Rust toolchain installer.\n\n## Building\n\nUse the following command to build the program. 
We recommend building the program in `RELEASE` mode, which makes it run much faster.\n\n```bash\n\u003e cargo build --release\n```\n\nThen, you will find the executable `google-load-parser.exe` in `target/release`.\n\n## Usage\n\n```txt\ngoogle-load-parser v1.1.0\nSLMT \u003csam123456777@gmail.com\u003e\nParse the load trace of Google's testing cluster\n\nUSAGE:\n    google-load-parser.exe [SUBCOMMAND]\n\nFLAGS:\n    -h, --help       Prints help information\n    -V, --version    Prints version information\n\nSUBCOMMANDS:\n    help        Prints this message or the help of the given subcommand(s)\n    transfer    Transfers the files in the given directory to daily timeline files\n    trim        Trims the files in the given directory to leave only necessary data\n```\n\nYou can add `RUST_LOG=DEBUG` in front of the command to show debugging information.\n\n```bash\n\u003e RUST_LOG=DEBUG google-load-parser.exe [SUBCOMMAND]\n```\n\nThis program currently supports two subcommands:\n\n- `trim`\n- `transfer`\n\n### Trimming\n\n```txt\nTrims the files in the given directory to leave only necessary data\n\nUSAGE:\n    google-load-parser.exe trim \u003cINPUT DIR\u003e \u003cOUTPUT DIR\u003e\n\nFLAGS:\n    -h, --help       Prints help information\n    -V, --version    Prints version information\n\nARGS:\n    \u003cINPUT DIR\u003e     the directory containing input files\n    \u003cOUTPUT DIR\u003e    the directory for placing output files\n```\n\nSince the downloaded files are compressed as `gz` files, we need to decompress the files before processing them. 
This command decompresses the `gz` files in the given `[INPUT DIR]` directory and removes unnecessary information, so that the output files in `[OUTPUT DIR]` contain only the data we need.\n\nAn example of the content of a decompressed file:\n\n```csv\n5612000000,5700000000,4665712499,369,4820204869,0.03143,0.05389,0.06946,0.005997,0.006645,0.05408,7.629e-05,0.0003834,0.2415,0.002571,2.911,,0,0,0.02457\n5612000000,5700000000,4665712499,369,4820204869,0.03143,0.05389,0.06946,0.005997,0.006645,0.05408,7.629e-05,0.0003834,0.2415,0.002571,2.911,,0,0,0.02457\n5612000000,5700000000,4665712499,798,3349189123,0.02698,0.06714,0.07715,0.004219,0.004868,0.06726,7.915e-05,0.0003681,0.27,0.00293,3.285,0.008261,0,0,0.01608\n...\n```\n\nAfter trimming:\n\n```csv\n5612000000,5700000000,3349189123,0.02698\n5612000000,5700000000,372630265,0.04114\n5612000000,5700000000,1437119225,0.07275\n...\n```\n\nThis greatly reduces the size of the files.\n\n### Transferring\n\n```txt\nTransfers the files in the given directory to daily timeline files\n\nUSAGE:\n    google-load-parser.exe transfer \u003cINPUT DIR\u003e \u003cOUTPUT DIR\u003e [SLOT LENGTH]\n\nFLAGS:\n    -h, --help       Prints help information\n    -V, --version    Prints version information\n\nARGS:\n    \u003cINPUT DIR\u003e      the directory containing input files\n    \u003cOUTPUT DIR\u003e     the directory for placing output files\n    \u003cSLOT LENGTH\u003e    the length of time slot (in seconds) [default: 60]\n```\n\nThis command converts the trimmed files in the given `[INPUT DIR]` directory into daily timeline files, where each row is the timeline of one machine's CPU usage over a day. 
You can specify the length of a time slot in seconds, which determines the sampling rate of the timeline.\n\nNote that the program maps machine IDs from their original domain into the range [1~15000] to reduce the size of the files.\n\n## License\n\nMIT\n\n[1]: https://github.com/google/cluster-data\n[2]: https://rustup.rs/","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fslmt%2Fgoogle-load-parser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fslmt%2Fgoogle-load-parser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fslmt%2Fgoogle-load-parser/lists"}