{"id":20608555,"url":"https://github.com/logicalhacking/extensioncrawler","last_synced_at":"2025-07-20T12:06:09.810Z","repository":{"id":95337558,"uuid":"67743549","full_name":"logicalhacking/ExtensionCrawler","owner":"logicalhacking","description":"A collection of utilities for downloading and analyzing browser extension from the Chrome Web store.","archived":false,"fork":false,"pushed_at":"2023-10-10T19:46:51.000Z","size":853,"stargazers_count":19,"open_issues_count":1,"forks_count":5,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-15T04:17:48.845Z","etag":null,"topics":["chrome","chrome-extension"],"latest_commit_sha":null,"homepage":"https://git.logicalhacking.com/BrowserSecurity/ExtensionCrawler","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/logicalhacking.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2016-09-08T22:04:47.000Z","updated_at":"2024-11-17T04:06:16.000Z","dependencies_parsed_at":"2025-04-15T04:17:49.244Z","dependency_job_id":"1e06f1bb-0e3e-49fd-a929-e53673bf4c51","html_url":"https://github.com/logicalhacking/ExtensionCrawler","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/logicalhacking/ExtensionCrawler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/logicalhacking%2FExtensionCrawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/logicalhacking%2FExtensionCrawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/repositories/logicalhacking%2FExtensionCrawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/logicalhacking%2FExtensionCrawler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/logicalhacking","download_url":"https://codeload.github.com/logicalhacking/ExtensionCrawler/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/logicalhacking%2FExtensionCrawler/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266120409,"owners_count":23879315,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chrome","chrome-extension"],"created_at":"2024-11-16T10:11:00.473Z","updated_at":"2025-07-20T12:06:09.803Z","avatar_url":"https://github.com/logicalhacking.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ExtensionCrawler\n\nA collection of utilities for downloading and analyzing browser\nextensions from the Chrome Web Store.\n\n* `crawler`: A crawler for extensions from the Chrome Web Store.\n* `crx-tool`: A tool for analyzing and extracting `*.crx` files\n  (i.e., Chrome extensions). 
Calling `crx-tool.py \u003cextension\u003e.crx`\n  will check the integrity of the extension.\n* `crx-extract`: A simple tool for extracting `*.crx` files from the\n   tar-based archive hierarchy.\n* `crx-jsinventory`: Build a JavaScript inventory of a `*.crx` file using a\n                   JavaScript decomposition analysis.\n* `crx-jsstrings`: A tool for extracting code blocks, comment blocks, and\n                 string literals from JavaScript.\n* `create-db`: A tool for updating a remote MariaDB from already\n   existing extension archives.\n\nThe utilities store the extensions in the following directory\nhierarchy:\n\n```shell\n   archive\n   ├── conf\n   │   └── forums.conf\n   ├── data\n   │   └── ...\n   └── log\n       └── ...\n```\n\nThe crawler downloads the most recent extension (i.e., the `*.crx`\nfile) as well as the overview page. In addition, the `conf` directory\nmay contain one file, called `forums.conf`, that lists the ids of\nextensions for which the forums and support pages should be downloaded\nas well. The `data` directory will contain the downloaded extensions.\n\nThe `crawler` and `create-db` scripts will access and update a MariaDB.\nThey will use the host, database, and credentials found in `~/.my.cnf`.\nSince they make use of various JSON features, it is recommended to use at\nleast version 10.2.8 of MariaDB.\n\nAll utilities are written in Python 3.7. The required modules are listed\nin the file `requirements.txt`.\n\n## Installation\n\nClone the repository and use pip3 to install it as a package.\n\n```shell\ngit clone git@logicalhacking.com:BrowserSecurity/ExtensionCrawler.git\npip3 install --user -e ExtensionCrawler\n```\n\n## Team\n\n* [Achim D. 
Brucker](http://www.brucker.ch/)\n* [Michael Herzberg](http://www.dcs.shef.ac.uk/cgi-bin/makeperson?M.Herzberg)\n\n### Contributors\n\n* Mehmet Balande\n\n## License\n\nThis project is licensed under the GPL 3.0 (or any later version).\n\nSPDX-License-Identifier: GPL-3.0-or-later\n\n## Master Repository\n\nThe master git repository for this project is hosted by the [Software\nAssurance \u0026 Security Research Team](https://logicalhacking.com) at\n\u003chttps://git.logicalhacking.com/BrowserSecurity/ExtensionCrawler\u003e.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flogicalhacking%2Fextensioncrawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flogicalhacking%2Fextensioncrawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flogicalhacking%2Fextensioncrawler/lists"}