{"id":13454447,"url":"https://github.com/securing/DumpsterDiver","last_synced_at":"2025-03-24T05:33:49.222Z","repository":{"id":41371387,"uuid":"135089098","full_name":"securing/DumpsterDiver","owner":"securing","description":"Tool to search secrets in various filetypes.","archived":false,"fork":false,"pushed_at":"2023-04-25T18:57:56.000Z","size":1889,"stargazers_count":979,"open_issues_count":11,"forks_count":152,"subscribers_count":31,"default_branch":"master","last_synced_at":"2024-10-28T21:39:02.017Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/securing.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2018-05-27T23:33:27.000Z","updated_at":"2024-10-24T20:43:38.000Z","dependencies_parsed_at":"2024-01-12T03:05:46.743Z","dependency_job_id":"76d9ee94-d123-4533-839b-b6620c5d5f0e","html_url":"https://github.com/securing/DumpsterDiver","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/securing%2FDumpsterDiver","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/securing%2FDumpsterDiver/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/securing%2FDumpsterDiver/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/securing%2FDumpsterDiver/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/securing","download_url":"https://codeload.github.com/securing/DumpsterDiver/t
ar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245217434,"owners_count":20579291,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T08:00:54.153Z","updated_at":"2025-03-24T05:33:48.724Z","avatar_url":"https://github.com/securing.png","language":"Python","readme":"\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./static/images/DumpsterDiver_logo.png\" width=\"400\" alt=\"DumpsterDiver_logo\" /\u003e\n\u003c/p\u003e\n\nDumpsterDiver (by @Rzepsky)\n========================================\n\nDumpsterDiver is a tool that can analyze large volumes of data in search of hardcoded secrets like keys (e.g. AWS Access Key, Azure Shared Key or SSH keys) or passwords. Additionally, it lets you create simple search rules with basic conditions (e.g. report only CSV files including at least 10 email addresses).\nThe main idea of this tool is to detect any potential secret leaks. You can watch it in action in the [demo video](https://vimeo.com/398343810) or read about all its features in [this article](https://medium.com/@rzepsky/hunting-for-secrets-with-the-dumpsterdiver-93d38a9cd4c1).\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/xep624/DumpsterDiver/blob/master/static/images/dumpster_diver.png?raw=true\" alt=\"DumpsterDiver\" /\u003e\n\u003c/p\u003e\n\n### Key features:\n* it uses Shannon Entropy to find private keys,\n* it searches through git logs,\n* it unpacks compressed archives (e.g. 
zip, tar.gz etc.),\n* it supports advanced search using simple rules (details below),\n* it searches for hardcoded passwords,\n* it is fully customizable.\n\n### Usage\n\n```\nusage: DumpsterDiver.py [-h] -p LOCAL_PATH [-r] [-a] [-s] [-o OUTFILE]\n                        [--min-key MIN_KEY] [--max-key MAX_KEY]\n                        [--entropy ENTROPY] [--min-pass MIN_PASS]\n                        [--max-pass MAX_PASS]\n                        [--pass-complex {1,2,3,4,5,6,7,8,9}]\n                        [--exclude-files EXCLUDE_FILES [EXCLUDE_FILES ...]]\n                        [--bad-expressions BAD_EXPRESSIONS [BAD_EXPRESSIONS ...]]\n```\n\n\n### Basic command line options\n\n\n* `-p LOCAL_PATH` - path to the folder containing files to be analyzed.\n* `-a, --advance` - when this flag is set, all files are additionally analyzed using the rules specified in the `rules.yaml` file.\n* `-r, --remove` - when this flag is set, files which don't contain any secret (or anything interesting if the `-a` flag is set) are removed.\n* `-s, --secret` - when this flag is set, all files are additionally analyzed in search of hardcoded passwords.\n* `-o OUTFILE` - output file in JSON format.\n\n### Pre-requisites\nTo run DumpsterDiver you have to install the required Python libraries. You can do this by running the following command:\n\n```\n$\u003e pip install -r requirements.txt\n```\nIf you have Python 2 and 3 installed separately, you should use `pip3` or `pip3.6`.  \n\n### Customizing your search\nNo single tool fits everyone's needs, and DumpsterDiver is no exception. 
There are 2 ways to customize your search:\n\n* using command line parameters\n* using `config.yaml` file\n\n#### Customization via command line parameters\n\n* `--min-key MIN_KEY` - specifies the minimum key length to be analyzed (default is 20).\n* `--max-key MAX_KEY` - specifies the maximum key length to be analyzed (default is 80).\n* `--entropy ENTROPY` - specifies the high-entropy threshold (default is 4.3).\n\nA separate script, `entropy.py`, is also included. It calculates the per-character entropy of a single word, which helps you tune DumpsterDiver to your needs. You can run it using the following command:\n\n```\n$\u003e python3 entropy.py f2441e3810794d37a34dd7f8f6995df4\n```\n\nCommand line customization is especially helpful when you know what you're looking for. Here are a few examples:\n\n* When you're looking for AWS Secret Access Key:\n\n`$\u003e python3 DumpsterDiver.py -p [PATH_TO_FOLDER] --min-key 40 --max-key 40 --entropy 4.3` \n\n* When you're looking for Azure Shared Key:\n\n`$\u003e python3 DumpsterDiver.py -p [PATH_TO_FOLDER] --min-key 66 --max-key 66 --entropy 5.1`\n\n* When you're looking for SSH private key (by default an RSA private key is written in 76-byte-long strings):\n\n`$\u003e python3 DumpsterDiver.py -p [PATH_TO_FOLDER] --min-key 76 --max-key 76 --entropy 5.1`\n\n* When you're looking for any occurrence of `aws_access_key_id` or `aws_secret_access_key`:\n\n`$\u003e python3 DumpsterDiver.py -p ./test/ --grep-words *aws_access_key_id* *aws_secret_access_key* -a`  \n\n\u003e Please note that the wildcards before and after a grep word are used on purpose. This way expressions like `\"aws_access_key_id\"` or `aws_access_key_id=` will also be reported. \n\n##### Finding hardcoded passwords\nUsing entropy for finding passwords isn't very effective as it generates a lot of false positives. 
This is why DumpsterDiver takes a different approach to finding hardcoded passwords - it verifies the password complexity using [passwordmeter](https://pypi.org/project/passwordmeter/). To customize this search you can use the following parameters:\n\n* `--min-pass MIN_PASS` - specifies the minimum password length to be analyzed (default is 8). Requires adding `-s` flag to the syntax.\n* `--max-pass MAX_PASS` - specifies the maximum password length to be analyzed (default is 12). Requires adding `-s` flag to the syntax.\n* `--pass-complex {1,2,3,4,5,6,7,8,9}` - specifies the password complexity threshold, from 1 (trivial passwords) to 9 (very complex passwords) (default is 8). Requires adding `-s` flag to the syntax.\n\nFor example, if you want to find complex passwords (containing an uppercase letter, a lowercase letter, a special character and a digit, and 10 to 15 characters long), you can do it using the following command:\n\n`$\u003e python3 DumpsterDiver.py -p [PATH_TO_FOLDER] -s --min-pass 10 --max-pass 15 --pass-complex 8`\n\n\n##### Limiting scan\n\nYou may want to skip scanning certain files. For that purpose you can use the following parameters:\n\n* `--exclude-files` - specifies file names or extensions which shouldn't be analyzed. A file extension should contain the `.` character (e.g. `.pdf`). Multiple file names and extensions should be separated by spaces.\n\n* `--bad-expressions` - specifies bad expressions. If DumpsterDiver finds such an expression in a file, then this file won't be analyzed. 
Multiple bad expressions should be separated by spaces.\n\n\u003e If you want to specify multiple file names, bad expressions or grep words using a separate file, you can do it via the following bash trick:\n\u003e ```\n\u003e $\u003e python3 DumpsterDiver.py -p ./test/ --exclude-files `while read -r line; do echo $line; done \u003c blacklisted_files.txt`\n\u003e ```\n\n#### Customization via config.yaml file\nInstead of using multiple command line parameters you can specify values for all the above-mentioned parameters at once in the `config.yaml` file.\n\n### Advanced search:\nDumpsterDiver also supports an advanced search. Beyond simple grepping with wildcards, this tool allows you to create conditions. Let's assume you're searching for a leak of corporate emails. Additionally, you're interested only in big leaks, which contain at least 100 email addresses. For this purpose you should edit the `rules.yaml` file in the following way:\n\n```\nfiletype: [\".*\"]\nfiletype_weight: 0\ngrep_words: [\"*@example.com\"]\ngrep_words_weight: 10\ngrep_word_occurrence: 100\n```\n\nLet's assume a different scenario: you're looking for the terms \"pass\", \"password\", \"haslo\", \"hasło\" (if you're analyzing a Polish company's repository) in a `.db` or `.sql` file. You can achieve this by modifying the `rules.yaml` file in the following way:\n\n```\nfiletype: [\".db\", \".sql\"]\nfiletype_weight: 5\ngrep_words: [\"*pass*\", \"*haslo*\", \"*hasło*\"]\ngrep_words_weight: 5\ngrep_word_occurrence: 1\n```\n\nNote that the rule will be triggered only when the total weight (`filetype_weight + grep_words_weight`) is `\u003e=10`.\n\n### Using Docker\nA docker image is available for DumpsterDiver. 
Run it using:\n```\n$\u003e docker run -v /path/to/my/files:/files --rm rzepsky/dumpsterdiver -p /files\n```\nIf you want to override one of the configuration files (`config.yaml` or `rules.yaml`):\n```\n$\u003e docker run -v /path/to/my/config/config.yaml:/config.yaml -v /path/to/my/config/rules.yaml:/rules.yaml -v /path/to/my/files:/files --rm rzepsky/dumpsterdiver -p /files\n```\n\n### Contribution\n\nDo you have better ideas? Want to help with this project? Please contact me via Twitter [@Rzepsky](https://twitter.com/Rzepsky). I would be more than happy to see any contributors here!\n\n### Special thanks\nHere I'd like to thank all those who helped develop this project:\n\n* [Stephen Sorriaux](https://github.com/StephenSorriaux)\n* [Andres Riancho](https://twitter.com/w3af)\n* [Damian Stygar](https://github.com/DahDev)\n* [Disconnect3d](https://twitter.com/disconnect3d_pl)\n\n### License\n\nSee the [LICENSE](./LICENSE) file.\n","funding_links":[],"categories":["Content Discovery","Python","Uncategorized","Other Awesome Lists","Python (1887)","Application Security","[↑](#-content) 🛠️ Tools","[](#table-of-contents) Table of contents"],"sub_categories":["AWS S3 Bucket","Uncategorized","Secret Scanning","Secrets detection","[](#websites-files-metadata-analyze-and-files-downloads)Website's files metadata analyze and files downloads"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsecuring%2FDumpsterDiver","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsecuring%2FDumpsterDiver","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsecuring%2FDumpsterDiver/lists"}