{"id":15023927,"url":"https://github.com/sap/credential-digger","last_synced_at":"2025-05-15T04:06:54.276Z","repository":{"id":37401845,"uuid":"247969824","full_name":"SAP/credential-digger","owner":"SAP","description":"A Github scanning tool that identifies hardcoded credentials while filtering the false positive data through machine learning models :lock:","archived":false,"fork":false,"pushed_at":"2025-03-31T22:23:56.000Z","size":5377,"stargazers_count":343,"open_issues_count":29,"forks_count":53,"subscribers_count":13,"default_branch":"main","last_synced_at":"2025-05-11T05:28:56.929Z","etag":null,"topics":["credentials","machine-learning","python","regex","scanner","secret","security","security-tools"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SAP.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-03-17T12:57:24.000Z","updated_at":"2025-05-09T18:17:23.000Z","dependencies_parsed_at":"2023-11-14T17:24:03.383Z","dependency_job_id":"cf6a015f-116b-4463-b02c-4424764242e5","html_url":"https://github.com/SAP/credential-digger","commit_stats":{"total_commits":861,"total_committers":16,"mean_commits":53.8125,"dds":0.610917537746806,"last_synced_commit":"6f88f8b9cb480acb2cbbb99f4a04c080e301086b"},"previous_names":[],"tags_count":24,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SAP%2Fcredential-digger","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposito
ries/SAP%2Fcredential-digger/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SAP%2Fcredential-digger/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SAP%2Fcredential-digger/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SAP","download_url":"https://codeload.github.com/SAP/credential-digger/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254270646,"owners_count":22042859,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["credentials","machine-learning","python","regex","scanner","secret","security","security-tools"],"created_at":"2024-09-24T19:59:37.059Z","updated_at":"2025-05-15T04:06:49.261Z","avatar_url":"https://github.com/SAP.png","language":"Python","readme":"[![REUSE status](https://api.reuse.software/badge/github.com/SAP/credential-digger)](https://api.reuse.software/info/github.com/SAP/credential-digger)\n![GitHub release (latest by date)](https://img.shields.io/github/v/release/SAP/credential-digger?logo=github)\n![PyPI](https://img.shields.io/pypi/v/credentialdigger?logo=pypi)\n![PyPI - Python Version](https://img.shields.io/pypi/pyversions/credentialdigger?logo=python)\n[![Docker](https://badgen.net/badge/icon/docker?icon=docker\u0026label\u0026color=0db7ed)](https://hub.docker.com/r/saposs/credentialdigger)\n[![Visual Studio 
Plugin](https://badgen.net/badge/icon/visualstudio?icon=visualstudio\u0026label)](https://marketplace.visualstudio.com/items?itemName=SAPOSS.vs-code-extension-for-project-credential-digger)\n\n\n![Logo](https://raw.githubusercontent.com/SAP/credential-digger/main/github_assets/Logo-CD-Mint_48.png)\n\n# Credential Digger\n\nCredential Digger is a GitHub scanning tool that identifies hardcoded credentials (passwords, API keys, secret keys, tokens, personal information, etc.), filtering the false positive data through machine learning models.\n\nTL;DR: watch the video ⬇️\n\n[![Watch the video](https://img.youtube.com/vi/1qz8lYPrtMo/0.jpg)](https://www.youtube.com/watch?v=1qz8lYPrtMo)\n\n\n\n- [Credential Digger](#credential-digger)\n  - [Why](#why)\n  - [Requirements](#requirements)\n  - [Download and Installation](#download-and-installation)\n  - [How to run](#how-to-run)\n    - [Add rules](#add-rules)\n    - [Scan a repository](#scan-a-repository)\n  - [Docker container](#docker-container)\n  - [Advanced Installation](#advanced-installation)\n    - [Build from source](#build-from-source)\n    - [External postgres database](#external-postgres-database)\n  - [How to update the project](#how-to-update-the-project)\n  - [Python library usage](#python-library-usage)\n    - [Add rules](#add-rules-1)\n    - [Scan a repository](#scan-a-repository-1)\n  - [CLI - Command Line Interface](#cli---command-line-interface)\n  - [Microsoft Visual Studio Plug-in](#Micosoft-Visual-Studio-Plugin)\n  - [pre-commit hook](#pre-commit-hook)\n  - [CI/CD Pipeline Integration on Piper](#cicd-pipeline-intergation-on-piper)\n  - [Wiki](#wiki)\n  - [Contributing](#contributing)\n  - [How to obtain support](#how-to-obtain-support)\n  - [News](#news)\n\n## Why\nIn data protection, one of the most critical threats is hardcoded (or plaintext) credentials in open-source projects. 
Several tools are already available to detect leaks in open-source platforms, but the diversity of credentials (depending on multiple factors such as the programming language, code development conventions, or developers' personal habits) is a bottleneck for the effectiveness of these tools. Their lack of precision leads to a very high number of pieces of code incorrectly detected as leaked secrets. Data wrongly detected as a leak is called _false positive_ data, and it makes up the vast majority of the data detected by currently available tools.\n\nThe goal of Credential Digger is to reduce the amount of false positive data in the output of the scanning phase by leveraging machine learning models.\n\n![Architecture](https://raw.githubusercontent.com/SAP/credential-digger/main/github_assets/credential-digger-architecture.png)\n\n\nThe tool supports several scan flavors: public and private repositories on\nGitHub and GitLab, pull requests, wiki pages, GitHub organizations, local git repositories, local files and folders.\nPlease refer to the [Wiki](https://github.com/SAP/credential-digger/wiki) for the complete documentation.\n\nFor the complete description of the approach of Credential Digger (versions \u003c4.4), [you can read this publication](https://www.scitepress.org/Papers/2021/102381/102381.pdf).\n\n```\n@InProceedings {lrnto-icissp21,\n    author = {S. Lounici and M. Rosa and C. M. Negri and S. Trabelsi and M. Önen},\n    booktitle = {Proc. 
of the 8th The International Conference on Information Systems Security and Privacy  (ICISSP)},\n    title = {Optimizing Leak Detection in Open-Source Platforms with Machine Learning Techniques},\n    month = {February},\n    day = {11-13},\n    year = {2021}\n}\n```\n\n## Requirements\n\nCredential Digger supports Python \u003e= 3.8 and \u003c 3.13, and works only on Linux and macOS systems.\nIf you don't meet these requirements, you may consider running a [Docker container](#docker-container) (that also includes a user interface).\n\n## Download and Installation\n\nFirst, you need to install some dependencies (namely, `build-essential` and `python3-dev`). There is no need to explicitly install hyperscan anymore.\n\n```bash\nsudo apt install -y build-essential python3-dev\n```\n\nThen, you can install the Credential Digger module using `pip`.\n\n```bash\npip install credentialdigger\n```\n\n\u003e For ARM machines (e.g., new MacBooks), installation is possible [following this guide](https://github.com/SAP/credential-digger/wiki/MacOS-ARM-Installation)\n\n\n## How to run\n\n### Add rules\n\nOne of the core components of Credential Digger is the regular expression scanner. You can choose the regular expression rules you want (just follow the template [here](https://github.com/SAP/credential-digger/blob/main/ui/backend/rules.yml)). We provide a list of patterns in the `rules.yml` file, which are included in the UI. 
The scanner supports rules of 4 different categories: `password`, `token`, `crypto_key`, and `other`.\n\n**Before the very first scan, you need to add the rules that will be used by the scanner.** This step is only needed once.\n\n```bash\ncredentialdigger add_rules --sqlite /path/to/data.db /path/to/rules.yaml\n```\n\n### Scan a repository\n\nAfter adding the rules, you can scan a repository:\n\n```bash\ncredentialdigger scan https://github.com/user/repo --sqlite /path/to/data.db\n```\n\nMachine learning models are not mandatory, but highly recommended in order to reduce the manual effort of reviewing the result of a scan:\n\n```bash\ncredentialdigger scan https://github.com/user/repo --sqlite /path/to/data.db --models PathModel PasswordModel\n```\n\nLike the models, the similarity feature is not mandatory, but highly recommended in order to reduce the manual effort of assessing the discoveries after a scan:\n\n```bash\ncredentialdigger scan https://github.com/user/repo --sqlite /path/to/data.db --similarity --models PathModel PasswordModel\n```\n\n\n## Docker container\n\nTo have a ready-to-use instance of Credential Digger, with a user interface, you can use a docker container. \nThis option requires the installation of [Docker](https://docs.docker.com/engine/install/) and [Docker Compose](https://docs.docker.com/compose/install/).\n\nCredential Digger is published on [dockerhub](https://hub.docker.com/r/saposs/credentialdigger). 
You can pull the latest release:\n\n```bash\nsudo docker pull saposs/credentialdigger\n```\n\nOr build and run containers with docker compose:\n\n```bash\ngit clone https://github.com/SAP/credential-digger.git\ncd credential-digger\ncp .env.sample .env\ndocker compose up --build\n```\n\nThe UI is available at [http://localhost:5000/](http://localhost:5000/)\n\n\u003e It is preferable to have at least 8 GB of RAM free when using docker containers\n\n\n## Advanced Installation\n\nCredential Digger is modular, and offers a wide choice of components and adaptations.\n\n### Build from source\n\nAfter installing the [dependencies](#download-and-installation) listed above, you can install Credential Digger as follows.\n\nConfigure a virtual environment for Python 3 (optional) and clone the main branch of the project:\n\n```bash\nvirtualenv -p python3 ./venv\nsource ./venv/bin/activate\n\ngit clone https://github.com/SAP/credential-digger.git\ncd credential-digger\n```\n\nInstall the tool from source:\n\n```bash\npip install .\n```\n\nThen, you can add the rules and scan a repository as described above.\n\n### External postgres database\n\nAnother ready-to-use instance of Credential Digger with the UI, but using a dockerized postgres database instead of a local sqlite one:\n\n```bash\ngit clone https://github.com/SAP/credential-digger.git\ncd credential-digger\ncp .env.sample .env\nvim .env  # set credentials for postgres\ndocker compose -f docker-compose.postgres.yml up --build\n```\n\n\u003e **WARNING**: Unlike the sqlite version, this setup requires configuring the `.env` file with the credentials for postgres (by modifying `POSTGRES_USER`, `POSTGRES_PASSWORD` and `POSTGRES_DB`).\n\nAdvanced users may also wish to use an external postgres database instead of the dockerized one we provide in our `docker-compose.postgres.yml`.\n\n\n\n## How to update the project\nIf you are already running Credential Digger and you want to update it to a\nnewer version, you can 
\n[refer to the wiki for the needed steps](https://github.com/SAP/credential-digger/wiki/How-to-update-Credential-Digger).\n\n\n\n## Python library usage\n\nWhen installing _credentialdigger_ from pip (or from source), you can instantiate the client and scan a repository.\n\nInstantiate the client appropriate for the chosen database:\n\n```python\n# Using a Sqlite database\nfrom credentialdigger import SqliteClient\nc = SqliteClient(path='/path/to/data.db')\n\n# Using a postgres database\nfrom credentialdigger import PgClient\nc = PgClient(dbname='my_db_name',\n             dbuser='my_user',\n             dbpassword='my_password',\n             dbhost='localhost_or_ip',\n             dbport=5432)\n```\n\n### Add rules\n\nAdd rules before launching your first scan.\n\n```python\nc.add_rules_from_file('/path/to/rules.yml')\n```\n\n### Scan a repository\n\n```python\nnew_discoveries = c.scan(repo_url='https://github.com/user/repo',\n                         models=['PathModel', 'PasswordModel'],\n                         debug=True)\n```\n\n\u003e  **WARNING**: Make sure you add the rules before your first scan.\n\nPlease refer to the [Wiki](https://github.com/SAP/credential-digger/wiki) for further information on the arguments.\n\n\n\n## CLI - Command Line Interface\n\nCredential Digger also offers a simple CLI to scan a repository. The CLI supports both sqlite and postgres databases. For postgres, you need either to export the credentials needed to connect to the database as environment variables or to set up a `.env` file. For sqlite, the path of the db must be passed as an argument.\n\nRefer to the [Wiki](https://github.com/SAP/credential-digger/wiki) for all the supported commands and their usage.\n\n\n## Micosoft Visual Studio Plugin\n\nVS Code extension for project \"Credential Digger\" is a free IDE extension that lets you detect secrets and credentials in your code before they get leaked! 
Like a spell checker, the extension scans your files using Credential Digger and highlights the secrets as you write code, so you can fix them before the code is even committed.\n\nThe VS Code extension can be downloaded from the [Microsoft VS Code Marketplace](https://marketplace.visualstudio.com/items?itemName=SAPOSS.vs-code-extension-for-project-credential-digger)   \n\n![VSCODE](https://github.com/SAP/credential-digger/blob/main/github_assets/credential-digger-how-it-works.gif)\n\n\n## pre-commit hook\n\nCredential Digger can be used with the [pre-commit](https://pre-commit.com/) framework to scan staged files before each commit.\n\nPlease refer to the [Wiki page of the pre-commit hook](https://github.com/SAP/credential-digger/wiki/pre-commit-hook) for further information on its installation and execution.\n\n\n## CI/CD Pipeline Integration on Piper (SAP Jenkins Library)\n\n![Piper](https://github.com/SAP/credential-digger/blob/main/github_assets/piper.png)\n\nCredential Digger is integrated with the CI/CD pipeline [Piper](https://www.project-piper.io/) in order to automate secrets scans for your GitHub projects and repositories.\nTo activate the Credential Digger step, please refer to the [Credential Digger step documentation for Piper](https://www.project-piper.io/steps/credentialdiggerScan/)\n\n\n## Wiki\n\nFor further information, please refer to the [Wiki](https://github.com/SAP/credential-digger/wiki)\n\n\n## Contributing\n\nWe invite your participation in the project through issues and pull requests. 
Please refer to the [Contributing guidelines](https://github.com/SAP/credential-digger/blob/main/CONTRIBUTING.md) for how to contribute.\n\n\n\n## How to obtain support\n\nAs a first step, we suggest [reading the wiki](https://github.com/SAP/credential-digger/wiki).\nIf you don't find the answers you need, you can open an [issue](https://github.com/SAP/credential-digger/issues) or contact the [maintainers](https://github.com/SAP/credential-digger/blob/main/setup.py#L19).\n\n\n\n## News\n\n-  [Credential Digger announcement](https://blogs.sap.com/2020/06/23/credential-digger-using-machine-learning-to-identify-hardcoded-credentials-in-github)\n-  [Credential Digger is now supporting Keras machine learning models](https://github.com/SAP/credential-digger/tree/keras_models)\n-  [Credential Digger approach has been published at ICISSP 2021 conference](https://www.scitepress.org/Papers/2021/102381/102381.pdf)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsap%2Fcredential-digger","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsap%2Fcredential-digger","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsap%2Fcredential-digger/lists"}