{"id":29105220,"url":"https://github.com/amd/node-scraper","last_synced_at":"2026-02-02T20:12:05.314Z","repository":{"id":301173721,"uuid":"996892244","full_name":"amd/node-scraper","owner":"amd","description":null,"archived":false,"fork":false,"pushed_at":"2025-06-25T13:43:19.000Z","size":399,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"development","last_synced_at":"2025-06-25T14:43:41.441Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amd.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-05T16:12:11.000Z","updated_at":"2025-06-25T13:43:23.000Z","dependencies_parsed_at":"2025-06-25T14:44:02.680Z","dependency_job_id":"69a48d6e-5472-457e-a24e-b1f332ae76fe","html_url":"https://github.com/amd/node-scraper","commit_stats":null,"previous_names":["amd/node-scraper"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/amd/node-scraper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amd%2Fnode-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amd%2Fnode-scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amd%2Fnode-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amd%2Fnode-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amd","download_url":"https://codeload.githu
b.com/amd/node-scraper/tar.gz/refs/heads/development","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amd%2Fnode-scraper/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262518514,"owners_count":23323341,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-29T01:08:05.146Z","updated_at":"2026-02-02T20:12:05.308Z","avatar_url":"https://github.com/amd.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Node Scraper\nNode Scraper is a tool which performs automated data collection and analysis for the purposes of\nsystem debug.\n\n## Table of Contents\n- [Installation](#installation)\n  - [Install From Source](#install-from-source)\n- [CLI Usage](#cli-usage)\n  - [Execution Methods](#execution-methods)\n    - [Example: Remote Execution](#example-remote-execution)\n    - [Example: connection_config.json](#example-connection_configjson)\n  - [Subcommands](#subcommands)\n    - ['describe' subcommand](#describe-subcommand)\n    - ['run-plugins' sub command](#run-plugins-sub-command)\n    - ['gen-plugin-config' sub command](#gen-plugin-config-sub-command)\n    - ['summary' sub command](#summary-sub-command)\n- [Configs](#configs)\n  - [Global args](#global-args)\n  - [Plugin config: `--plugin-configs` command](#plugin-config---plugin-configs-command)\n  - [Reference config: `gen-reference-config` command](#reference-config-gen-reference-config-command)\n- **Extending Node Scraper (integration \u0026 external plugins)** → See [EXTENDING.md](EXTENDING.md)\n- 
**Full view of the plugins with the associated collectors \u0026 analyzers as well as the commands\ninvoked by collectors** → See [docs/PLUGIN_DOC.md](docs/PLUGIN_DOC.md)\n\n## Installation\n### Install From Source\nNode Scraper requires Python 3.9+ for installation. After cloning this repository,\nrun the dev-setup.sh script with `source`. This script creates an editable install of Node Scraper in\na Python virtual environment and also configures the pre-commit hooks for the project.\n\n```sh\nsource dev-setup.sh\n```\n\nAlternatively, follow these manual steps:\n\n### 1. Virtual Environment (Optional)\n```sh\npython3 -m venv venv\nsource venv/bin/activate\n```\nOn Debian/Ubuntu, you may need: `sudo apt install python3-venv`\n\n### 2. Install from Source (Required)\n```sh\npython3 -m pip install --editable .[dev] --upgrade\n```\nThis installs Node Scraper in editable mode with development dependencies. To verify: `node-scraper --help`\n\n### 3. Git Hooks (Optional)\n```sh\npre-commit install\n```\nSets up pre-commit hooks for code quality checks. On Debian/Ubuntu, you may need: `sudo apt install pre-commit`\n\n## CLI Usage\nThe Node Scraper CLI can be used to run Node Scraper plugins on a target system. 
The following CLI\noptions are available:\n\n```sh\nusage: node-scraper [-h] [--sys-name STRING] [--sys-location {LOCAL,REMOTE}] [--sys-interaction-level {PASSIVE,INTERACTIVE,DISRUPTIVE}] [--sys-sku STRING]\n                    [--sys-platform STRING] [--plugin-configs [STRING ...]] [--system-config STRING] [--connection-config STRING] [--log-path STRING]\n                    [--log-level {CRITICAL,FATAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}] [--gen-reference-config] [--skip-sudo]\n                    {summary,run-plugins,describe,gen-plugin-config} ...\n\nnode scraper CLI\n\npositional arguments:\n  {summary,run-plugins,describe,gen-plugin-config}\n                        Subcommands\n    summary             Generates summary csv file\n    run-plugins         Run a series of plugins\n    describe            Display details on a built-in config or plugin\n    gen-plugin-config   Generate a config for a plugin or list of plugins\n\noptions:\n  -h, --help            show this help message and exit\n  --sys-name STRING     System name (default: \u003cmy_system_name\u003e)\n  --sys-location {LOCAL,REMOTE}\n                        Location of target system (default: LOCAL)\n  --sys-interaction-level {PASSIVE,INTERACTIVE,DISRUPTIVE}\n                        Specify system interaction level, used to determine the type of actions that plugins can perform (default: INTERACTIVE)\n  --sys-sku STRING      Manually specify SKU of system (default: None)\n  --sys-platform STRING\n                        Specify system platform (default: None)\n  --plugin-configs [STRING ...]\n                        built-in config names or paths to plugin config JSONs. 
Available built-in configs: NodeStatus (default: None)\n  --system-config STRING\n                        Path to system config json (default: None)\n  --connection-config STRING\n                        Path to connection config json (default: None)\n  --log-path STRING     Specifies local path for node scraper logs, use 'None' to disable logging (default: .)\n  --log-level {CRITICAL,FATAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}\n                        Change python log level (default: INFO)\n  --gen-reference-config\n                        Generate reference config from system. Writes to ./reference_config.json. (default: False)\n  --skip-sudo           Skip plugins that require sudo permissions (default: False)\n\n```\n\n### Execution Methods\n\nNode Scraper can operate in two modes: LOCAL and REMOTE, determined by the `--sys-location` argument.\n\n- **LOCAL** (default): Node Scraper is installed and run directly on the target system. All data collection and plugin execution occur locally.\n- **REMOTE**: Node Scraper runs on your local machine but targets a remote system over SSH. 
In this mode, Node Scraper does not need to be installed on the remote system; all commands are executed remotely via SSH.\n\nTo use remote execution, specify `--sys-location REMOTE` and provide a connection configuration file with `--connection-config`.\n\n#### Example: Remote Execution\n\n```sh\nnode-scraper --sys-name \u003cremote_host\u003e --sys-location REMOTE --connection-config ./connection_config.json run-plugins DmesgPlugin\n```\n\n##### Example: connection_config.json\n\n```json\n{\n    \"InBandConnectionManager\": {\n        \"hostname\": \"remote_host.example.com\",\n        \"port\": 22,\n        \"username\": \"myuser\",\n        \"password\": \"mypassword\",\n        \"key_filename\": \"/path/to/private/key\"\n    }\n}\n```\n\n**Notes:**\n- If using SSH keys, specify `key_filename` instead of `password`.\n- The remote user must have permissions to run the requested plugins and access required files. If needed, use the `--skip-sudo` argument to skip plugins requiring sudo.\n\n### Subcommands\n\nPlugins to run can be specified in two ways: using a plugin JSON config file or using the\n'run-plugins' sub command. These two options are not mutually exclusive and can be used together.\n\n#### **'describe' subcommand**\n\nYou can use the `describe` subcommand to display details about built-in configs or plugins.\nList all built-in configs:\n```sh\nnode-scraper describe config\n```\n\nShow details for a specific built-in config:\n```sh\nnode-scraper describe config \u003cconfig-name\u003e\n```\n\nList all available plugins:\n```sh\nnode-scraper describe plugin\n```\n\nShow details for a specific plugin:\n```sh\nnode-scraper describe plugin \u003cplugin-name\u003e\n```\n\n#### **'run-plugins' sub command**\nThe plugins to run and their associated arguments can also be specified directly on the CLI using\nthe 'run-plugins' sub-command. Using this sub-command you can specify a plugin name followed by\nthe arguments for that particular plugin. 
Multiple plugins can be specified at once.\n\nYou can view the available arguments for a particular plugin by running\n`node-scraper run-plugins \u003cplugin-name\u003e -h`:\n```sh\nusage: node-scraper run-plugins BiosPlugin [-h] [--collection {True,False}] [--analysis {True,False}] [--system-interaction-level STRING]\n                                            [--data STRING] [--exp-bios-version [STRING ...]] [--regex-match {True,False}]\n\noptions:\n  -h, --help            show this help message and exit\n  --collection {True,False}\n  --analysis {True,False}\n  --system-interaction-level STRING\n  --data STRING\n  --exp-bios-version [STRING ...]\n  --regex-match {True,False}\n\n```\n\nExamples\n\nRun a single plugin\n```sh\nnode-scraper run-plugins BiosPlugin --exp-bios-version TestBios123\n```\n\nRun multiple plugins\n```sh\nnode-scraper run-plugins BiosPlugin --exp-bios-version TestBios123 RocmPlugin --exp-rocm TestRocm123\n```\n\nRun plugins without specifying args (plugin defaults will be used)\n\n```sh\nnode-scraper run-plugins BiosPlugin RocmPlugin\n```\n\nUse plugin configs and 'run-plugins'\n\n```sh\nnode-scraper --plugin-configs plugin_config.json run-plugins BiosPlugin\n```\n\n#### **'gen-plugin-config' sub command**\nThe 'gen-plugin-config' sub command can be used to generate a plugin config JSON file for a plugin\nor list of plugins that can then be customized. 
Plugin arguments that have default values will be\nprepopulated in the JSON file; arguments without default values will have a value of 'null'.\n\nExamples\n\nGenerate a config for the DmesgPlugin:\n```sh\nnode-scraper gen-plugin-config --plugins DmesgPlugin\n```\n\nThis would produce the following config:\n\n```json\n{\n  \"global_args\": {},\n  \"plugins\": {\n    \"DmesgPlugin\": {\n      \"collection\": true,\n      \"analysis\": true,\n      \"system_interaction_level\": \"INTERACTIVE\",\n      \"data\": null,\n      \"analysis_args\": {\n        \"analysis_range_start\": null,\n        \"analysis_range_end\": null,\n        \"check_unknown_dmesg_errors\": true,\n        \"exclude_category\": null,\n        \"interval_to_collapse_event\": 60,\n        \"num_timestamps\": 3\n      }\n    }\n  },\n  \"result_collators\": {}\n}\n```\n\n**Running DmesgPlugin with a dmesg log file:**\n\nInstead of collecting dmesg from the system, you can analyze a pre-existing dmesg log file using the `--data` argument:\n\n```sh\nnode-scraper run-plugins DmesgPlugin --data /path/to/dmesg.log --collection False\n```\n\nThis will skip the collection phase and directly analyze the provided dmesg.log file.\n\n**Custom Error Regex Example:**\n\nYou can extend the built-in error detection with custom regex patterns. 
Create a config file with custom error patterns:\n\n```json\n{\n  \"global_args\": {},\n  \"plugins\": {\n    \"DmesgPlugin\": {\n      \"analysis_args\": {\n        \"check_unknown_dmesg_errors\": false,\n        \"interval_to_collapse_event\": 60,\n        \"num_timestamps\": 3,\n        \"error_regex\": [\n          {\n            \"regex\": \"MY_CUSTOM_ERROR.*\",\n            \"message\": \"My Custom Error Detected\",\n            \"event_category\": \"SW_DRIVER\",\n            \"event_priority\": 3\n          },\n          {\n            \"regex\": \"APPLICATION_CRASH: .*\",\n            \"message\": \"Application Crash\",\n            \"event_category\": \"SW_DRIVER\",\n            \"event_priority\": 4\n          }\n        ]\n      }\n    }\n  },\n  \"result_collators\": {}\n}\n```\n\nSave this to `dmesg_custom_config.json` and run:\n\n```sh\nnode-scraper --plugin-configs dmesg_custom_config.json run-plugins DmesgPlugin\n```\n\n#### **'summary' sub command**\nThe 'summary' subcommand can be used to combine results from multiple runs of node-scraper into a\nsingle summary.csv file. Sample run:\n```sh\nnode-scraper summary --search-path /\u003cpath_to_node-scraper_logs\u003e\n```\nThis will generate a new file, '/\u003cpath_to_node-scraper_logs\u003e/summary.csv'. This file will\ncontain the results from all 'nodescraper.csv' files under '/\u003cpath_to_node-scraper_logs\u003e'.\n\n### Configs\nA plugin JSON config should follow the structure of the plugin config model defined here.\nThe `global_args` field is a dictionary of global key-value pairs; values in `global_args` will be passed to\nany plugin that supports the corresponding key. The `plugins` field should be a dictionary mapping\nplugin names to sub-dictionaries of plugin arguments. Lastly, the `result_collators` field is\nused to define result collator classes that will be run on the plugin results. 
By default, the CLI\nadds the TableSummary result collator, which prints a summary of each plugin’s results in a\ntabular format to the console.\n\n```json\n{\n    \"global_args\": {},\n    \"plugins\": {\n        \"BiosPlugin\": {\n            \"analysis_args\": {\n                \"exp_bios_version\": \"TestBios123\"\n            }\n        },\n        \"RocmPlugin\": {\n            \"analysis_args\": {\n                \"exp_rocm\": \"TestRocm123\"\n            }\n        }\n    }\n}\n```\n\n#### Global args\nGlobal args can be used to skip plugins that require sudo or to enable/disable either collection or analysis.\nBelow is an example that skips plugins requiring sudo and disables analysis.\n\n```json\n  \"global_args\": {\n      \"collection_args\": {\n        \"skip_sudo\" : 1\n      },\n      \"collection\" : 1,\n      \"analysis\" : 0\n  },\n```\n\n#### Plugin config: **'--plugin-configs' command**\nA plugin config can be used to compare the system data against the config specifications:\n```sh\nnode-scraper --plugin-configs plugin_config.json\n```\nHere is an example of a comprehensive plugin config that specifies analyzer args for each plugin:\n```json\n{\n  \"global_args\": {},\n  \"plugins\": {\n    \"BiosPlugin\": {\n      \"analysis_args\": {\n        \"exp_bios_version\": \"3.5\"\n      }\n    },\n    \"CmdlinePlugin\": {\n      \"analysis_args\": {\n        \"cmdline\": \"imgurl=test NODE=nodename selinux=0 serial console=ttyS1,115200 console=tty0\",\n        \"required_cmdline\" : \"selinux=0\"\n      }\n    },\n    \"DkmsPlugin\": {\n      \"analysis_args\": {\n        \"dkms_status\": \"amdgpu/6.11\",\n        \"dkms_version\" : \"dkms-3.1\",\n        \"regex_match\" : true\n      }\n    },\n    \"KernelPlugin\": {\n      \"analysis_args\": {\n        \"exp_kernel\": \"5.11-generic\"\n      }\n    },\n    \"OsPlugin\": {\n      \"analysis_args\": {\n        \"exp_os\": \"Ubuntu 22.04.2 LTS\"\n      }\n    },\n    \"PackagePlugin\": {\n          
\"analysis_args\": {\n            \"exp_package_ver\": {\n              \"gcc\": \"11.4.0\"\n            },\n            \"regex_match\": false\n          }\n    },\n    \"RocmPlugin\": {\n      \"analysis_args\": {\n        \"exp_rocm\": \"6.5\"\n      }\n    }\n  },\n  \"result_collators\": {},\n  \"name\": \"plugin_config\",\n  \"desc\": \"My golden config\"\n}\n```\n\n#### Reference config: **'gen-reference-config' command**\nThe `--gen-reference-config` option can be used to generate a reference config that is populated with current system\nconfigurations. Plugins that use analyzer args (where applicable) will be populated with system\ndata.\nSample command:\n```sh\nnode-scraper --gen-reference-config run-plugins BiosPlugin OsPlugin\n```\nThis will generate the following config:\n```json\n{\n  \"global_args\": {},\n  \"plugins\": {\n    \"BiosPlugin\": {\n      \"analysis_args\": {\n        \"exp_bios_version\": [\n          \"M17\"\n        ],\n        \"regex_match\": false\n      }\n    },\n    \"OsPlugin\": {\n      \"analysis_args\": {\n        \"exp_os\": [\n          \"8.10\"\n        ],\n        \"exact_match\": true\n      }\n    }\n  },\n  \"result_collators\": {}\n}\n```\nThis config can later be used on a different platform for comparison by passing it to `--plugin-configs`, as described in the plugin config section above:\n```sh\nnode-scraper --plugin-configs reference_config.json\n```\n\nAn alternate way to generate a reference config is by using log files from a previous run. 
The\nexample below uses log files from 'scraper_logs_\u003cpath\u003e/':\n```sh\nnode-scraper gen-plugin-config --gen-reference-config-from-logs scraper_logs_\u003cpath\u003e/ --output-path custom_output_dir\n```\nThis will generate a reference config that includes plugins with logged results in\n'scraper_logs_\u003cpath\u003e' and save the new config to 'custom_output_dir/reference_config.json'.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famd%2Fnode-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famd%2Fnode-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famd%2Fnode-scraper/lists"}