{"id":19745519,"url":"https://github.com/cossacklabs/repometascore","last_synced_at":"2026-03-12T16:01:50.037Z","repository":{"id":78995436,"uuid":"475586083","full_name":"cossacklabs/repometascore","owner":"cossacklabs","description":"repometascore (aka repository metadata scoring) analyzes metadata of the given repository, collects info about its contributors, and outputs the risk level.","archived":false,"fork":false,"pushed_at":"2025-03-31T09:19:11.000Z","size":761,"stargazers_count":35,"open_issues_count":0,"forks_count":0,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-04-30T07:43:43.714Z","etag":null,"topics":["dependency-analysis","dependency-manager","security-tools"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cossacklabs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-03-29T19:14:16.000Z","updated_at":"2025-03-12T15:12:01.000Z","dependencies_parsed_at":null,"dependency_job_id":"d441e3f5-4dc5-4c4e-a469-0b9c6c408b8a","html_url":"https://github.com/cossacklabs/repometascore","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/cossacklabs/repometascore","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cossacklabs%2Frepometascore","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cossacklabs%2Frepometascore/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cossacklabs%2Frepom
etascore/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cossacklabs%2Frepometascore/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cossacklabs","download_url":"https://codeload.github.com/cossacklabs/repometascore/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cossacklabs%2Frepometascore/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261199991,"owners_count":23123917,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dependency-analysis","dependency-manager","security-tools"],"created_at":"2024-11-12T02:09:20.595Z","updated_at":"2026-03-12T16:01:45.010Z","avatar_url":"https://github.com/cossacklabs.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RepoMetaScore\n\nUse RepoMetaScore (repository metadata scoring) to find risky projects in your dependency chain.\n\n![rch-github-logo](/pics/repometascore-logo-github.png)\n\n---\n\n## The main idea\n\nThis package helps to prevent supply chain risks by analyzing _metadata_ about the repository and its contributors. \n\nOpen-source maintainers weaponize their projects by introducing backdoors and vulnerabilities in the source code. Aside from being led by criminal and activist motivations, maintainers who live in regions with oppressive governments might be forced to introduce backdoors involuntarily. 
\n\nRepoMetaScore analyses the given repository, collects information about its maintainers and contributors, and outputs the \"risk rating\". All info about contributors is collected through the official GitHub API and other public sources, and is solely based on the information users provide in their accounts.\n\n## How it works\n\nYou install the package, provide a link to the repository-in-question, and check the output. The output contains risk ratings and info about each contributor. You decide whether to use the repository in your product.\n\nThe default configuration uses a growing list of criteria to identify potentially problematic repositories: maintainers’ GitHub and Twitter profiles, location, commit history, email domain, etc. Use RepoMetaScore as a manual tool for a one-time check, or make it a part of your CI/CD pipeline.\n\n⚠️ _The configurations are rather raw and still a work in progress. Feel free to contribute!_\n\n\n## Installation\n\nRequirements: Debian, Ubuntu, or Mac. Python 3.8+ installed.\n\nInstall RepoMetaScore via pip:\n\n```\npip3 install git+https://github.com/cossacklabs/repometascore.git@release \n```\n\nor alternatively as zip:\n```\npip3 install https://github.com/cossacklabs/repometascore/archive/release.zip \n```\n\nℹ️ _To get the latest stable version, install from the `release` branch; to get the latest working version, use the `main` branch._\n\n\n## Usage\n\n1. [Follow GitHub guide](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token) to create a new personal access token. We recommend creating a new, clean token without any permissions.\n\n2. Copy the created token into a separate file and call it `token_file.txt`. To raise the GitHub API request limit, you can create an additional token from **ANOTHER** GitHub account and add it on the next line of `token_file.txt`. \n\n3. 
Run RepoMetaScore with the default config, point it to the repository-in-question, and provide the path to your `token_file.txt`:\n\n```\npython3 -m repometascore --url https://github.com/yandex/yandex-tank --tokenfile token_file.txt\n```\n\n4. The output is controlled by the verbosity level. By default, the level is 0, which gives the shortest output. To increase verbosity, use the `-v` parameter:\n   - nothing \t- verbosity level 0. Outputs only the risk level and percentage.\n   - `-v`   \t- verbosity level 1. In addition to level 0, outputs info about the program and commits, code delta, and the contributors’ risk ratio.\n   - `-vv`  \t- verbosity level 2. In addition to the previous level, outputs info about every risky contributor.\n   - `-vvv` \t- verbosity level 3. Additionally outputs info about every contributor. Currently, it only works with `JSON`-type output.\n   - `-vvvv` \t- verbosity level 4. Additionally shows the program's log output.\n\nReview the output and decide whether to use this repository in your project.\n\n## Customisation\n\nYou can create your own configuration with specific rules in it and specify your GitHub token in that configuration file.\n\n1. Copy the default configuration file [config.json](https://github.com/cossacklabs/repometascore/blob/main/examples/config.json).\n\n2. Update the `git_token` value with your GitHub token: `\"git_token\": \"ghp_KvDv...\"`\n\n3. 
Run RepoMetaScore with your config:\n\n```\npython3 -m repometascore --url https://github.com/yandex/yandex-tank --config config.json\n```\n\n---\n\n## Configuration file\n\nConfiguration file should be a valid JSON file that contains a JSON dictionary.\n\nVariables that are used in the config file:\n\n### Root\n| Variable              | Type         | Description                                                                                                                                                                                                                             | \n|-----------------------|--------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `risk_boundary_value` | `float`      | Optional. Default `0.95`. Used in Repo. Sets boundary value that helps us define whether we should consider contributors as risky or not. It compares the `Contributor.riskRating` value with the boundary value.                       |\n| `git_tokens`          | `List[str]`  | Your GitHub tokens as a list of strings. If you want to extend the limitation of requests to GitHub API - you can create additional tokens from the **OTHER** GitHub account. And add it as next `str` variable into `git_tokens` list. |\n| `request_max_retries` | `int`        | Optional. Default `5`. It shows how often we should try to reconnect to some kinds of requests.                                                                                                                                         |\n| `request_min_await`   | `float`      | Optional. Default `5.0`. Minimum wait time (in seconds) when a remote server responds with timeouts.                                                                                                                                    
|\n| `request_max_await`   | `float`      | Optional. Default `15.0`. Maximum wait time (in seconds) when a remote server responds with timeouts.                                                                                                                                   |\n| `fields`              | `List[Dict]` | List of fields with rules. More details about this variable are in the next section.                                                                                                                                                    |\n\n### Fields\n| Variable | Type         | Description                                                                                                                                                                                          | \n|----------|--------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `name`   | `str`        | Must be the same as the property name in the `Contributor` class. Otherwise, nothing would happen. In case of success, it pulls data from variables in the `Contributor` class and operates with it. |\n| `rules`  | `List[Dict]` | List of rules that would append onto data gathered from the `name` variable from the `Contributor` class.                                                                                            
|\n### Rules\n| Variable     | Type        | Description                                                                                                                                                                                           | \n|--------------|-------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `triggers`   | `List[str]` | Currently this is a list of strings. The program takes data (strings) from the contributor class. Modifies it to a lowercase string. And then checks if data from contributors matches every trigger. |\n| `type`       | `str`       | Verbose string name that can help the user understand what type of rule has been detected (e.g. `Strong`, `Considerable`, `Weak`, etc.).                                                              |\n| `risk_value` | `float`     | This value accumulates to `Contributor.riskRating` variable. Also can be a negative one for some extra cases.                                                                                         |\n\n## Environmental Variables\n| Variable               | Type  | Description                                                                                                                                                                                                |\n|------------------------|-------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `HTTP_REQUEST_TIMEOUT` | `int` | Optional. Default `20`. Time (in seconds) of an HTTP connection timeouts.                                                                                                                                  |\n| `ALLOWED_TIME_TO_WAIT` | `int` | Optional. Default `12000`. 
How long (in seconds) to wait for a GitHub API token reset. If the token reset time is longer than `ALLOWED_TIME_TO_WAIT`, a `NoTokensLeft` exception is thrown. |\n\n---\n\n# Next steps\n\n1. Improving location parsing \u0026 scores.\n2. Adding more checks, improving dictionaries.\n3. Adding a risk factor based on the language of comments.\n4. Adding checks inspired by the [What are Weak Links in the npm Supply Chain?](https://arxiv.org/abs/2112.10165) paper.\n\n---\n\n# License\n\n\"RepoMetaScore\" is distributed under the terms of the Apache License (Version 2.0).\n\nThis software is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n\n---\n\n# Contributing\n\nFeel free to extend the configuration, rules, and scoring, and come back with PRs. We also welcome contributions aimed at automation: adding to CI/CD, GitHub plugins, etc. \n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcossacklabs%2Frepometascore","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcossacklabs%2Frepometascore","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcossacklabs%2Frepometascore/lists"}