{"id":13843502,"url":"https://github.com/SomeKirill/wordlist_generator","last_synced_at":"2025-07-11T19:31:48.130Z","repository":{"id":183521838,"uuid":"288267829","full_name":"SomeKirill/wordlist_generator","owner":"SomeKirill","description":"Unique wordlist generator of unique wordlists.","archived":false,"fork":false,"pushed_at":"2023-07-20T15:10:06.000Z","size":348,"stargazers_count":43,"open_issues_count":1,"forks_count":11,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-08-05T17:37:42.901Z","etag":null,"topics":["bugbounty","bugbounty-tool","information-gathering","pentesting","reconnaissance","security","wordlist"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SomeKirill.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2020-08-17T19:23:45.000Z","updated_at":"2024-04-16T22:45:34.000Z","dependencies_parsed_at":"2023-07-24T19:58:24.538Z","dependency_job_id":null,"html_url":"https://github.com/SomeKirill/wordlist_generator","commit_stats":null,"previous_names":["somekirill/wordlist_generator"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SomeKirill%2Fwordlist_generator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SomeKirill%2Fwordlist_generator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SomeKirill%2Fwordlist_generator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SomeKirill%2Fwordlist_generator/manifests","owner_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SomeKirill","download_url":"https://codeload.github.com/SomeKirill/wordlist_generator/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225755010,"owners_count":17519186,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bugbounty","bugbounty-tool","information-gathering","pentesting","reconnaissance","security","wordlist"],"created_at":"2024-08-04T17:02:11.040Z","updated_at":"2024-11-21T15:31:02.942Z","avatar_url":"https://github.com/SomeKirill.png","language":"Python","funding_links":[],"categories":["Python (1887)","Python"],"sub_categories":[],"readme":"# wordlist_generator\n\nwordlist_generator builds a wordlist unique to your target, using techniques from tomnomnom's video [\"Who, What, Where, When\"](https://www.youtube.com/watch?v=W4_QCSIujQ4).\nIt takes URLs from [gau](https://github.com/lc/gau) and extracts directories, file names, or words from pages. As an additional feature, it can extract HTML comments. By default, the tool requests only 2000 URLs and extracts all words and directories.\n\nTo clean the wordlist, wordlist_generator removes every entry found in the files of the \"denylists\" directory, so that only unique words remain. It also cleans the result using regexes from BonJarber's [clean_wordlist](https://github.com/BonJarber/SecUtils/tree/master/clean_wordlist) tool. 
You can adjust which extensions are ignored when parsing files and fetching pages in `parsing_allow_extensions.txt` and `scraping_deny_extensions.txt`.\n\n## Usage:\nExamples:\n```\n$ ./wordlist_generator.py -d hackerone.com -a 20 -files\n$ ./wordlist_generator.py -d bugcrowd.com -a 7500 -dir\n$ ./wordlist_generator.py -d intigriti.com \u003e intigriti_wordlist.txt\n```\nTo display the help for the tool, use the -h flag:\n\n```\n./wordlist_generator.py -h\n```\n\n| Flag | Description | Example |\n|------|-------------|---------|\n| `-domain` | Target domain | `./wordlist_generator.py -d openbugbounty.org` |\n| `-amount` | Number of URLs to fetch from gau | `./wordlist_generator.py -d twitter.com -a 10000` |\n| `-dir` | Extract only directories | `./wordlist_generator.py -d hackerone.com -dir` |\n| `-f` | Extract only filenames | `./wordlist_generator.py -d hackerone.com -f` |\n| `-c` | Extract only comments with no filtering | `./wordlist_generator.py -d hackerone.com -c` |\n\n\n## Installation:\n```\n$ GO111MODULE=on go get -u -v github.com/lc/gau\n$ git clone https://github.com/SomeKirill/wordlist_generator/\n$ pip3 install -r requirements.txt\n```\n## denylists wordlists used:\n- https://github.com/danielmiessler/SecLists/blob/master/Discovery/Web-Content/raft-large-directories-lowercase.txt\n- https://github.com/oprogramador/most-common-words-by-language/blob/master/src/resources/dutch.txt\n- https://github.com/first20hours/google-10000-english/blob/master/google-10000-english.txt\n- https://tools.ietf.org/html/rfc1866\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSomeKirill%2Fwordlist_generator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FSomeKirill%2Fwordlist_generator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSomeKirill%2Fwordlist_generator/lists"}