Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/SomeKirill/wordlist_generator
Unique wordlist generator of unique wordlists.
https://github.com/SomeKirill/wordlist_generator
bugbounty bugbounty-tool information-gathering pentesting reconnaissance security wordlist
Last synced: about 2 months ago
JSON representation
Unique wordlist generator of unique wordlists.
- Host: GitHub
- URL: https://github.com/SomeKirill/wordlist_generator
- Owner: SomeKirill
- License: apache-2.0
- Created: 2020-08-17T19:23:45.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2023-07-20T15:10:06.000Z (over 1 year ago)
- Last Synced: 2024-08-05T17:37:42.901Z (5 months ago)
- Topics: bugbounty, bugbounty-tool, information-gathering, pentesting, reconnaissance, security, wordlist
- Language: Python
- Homepage:
- Size: 340 KB
- Stars: 43
- Watchers: 2
- Forks: 11
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project
- awesome-hacking-lists - SomeKirill/wordlist_generator - Unique wordlist generator of unique wordlists. (Python)
README
# wordlist_generator
Tool wordlist_generator generates unique to your target wordlist with techniques mentioned in tomnomnom's video ["Who, What, Where, When"](https://www.youtube.com/watch?v=W4_QCSIujQ4).
It takes URLs from [gau](https://github.com/lc/gau) to extract directories, file names or words on pages. As additional feature it can extract HTML comments. By default tool will only request 2000 URLs, extract all words and directories.To clean wordlist, wordlist_generator removes from result everything from "denylists" directory files to keep only unique words. Also it cleans result using regexes from BonJarber's [clean_wordlist](https://github.com/BonJarber/SecUtils/tree/master/clean_wordlist) tool. You can adjust which extenctions will be ignored during parsing files and fetching pages in `parsing_allow_extensions.txt` and `scraping_deny_extensions.txt`.
## Usage:
Examples:
```
$ ./wordlist_generator.py -d hackerone.com -a 20 -files
$ ./wordlist_generator.py -d bugcrowd.com -a 7500 -dir
$ ./wordlist_generator.py -d intigriti.com > intigriti_wordlist.txt
```
To display the help for the tool use the -h flag:```
./wordlist_generator.py -h
```| Flag | Description | Example |
|------|-------------|---------|
| `-domain` | target domain | `./wordlist_generator.py -d openbugbounty.org` |
| `-amount` | amount of URLs to fetch from gau | `./wordlist_generator.py -d twitter.com -a 10000` |
| `-dir` | Extract only directories | `./wordlist_generator.py -d hackerone.com -dir` |
| `-f` | Extract only filenames | `./wordlist_generator.py -d hackerone.com -f` |
| `-c` | Extract only comments with no filtering | `./wordlist_generator.py -d hackerone.com -c` |## Installation:
```
$ GO111MODULE=on go get -u -v github.com/lc/gau
$ git clone https://github.com/SomeKirill/wordlist_generator/
$ pip3 install -r requirements.txt
```
## denylists wordlists used:
- https://github.com/danielmiessler/SecLists/blob/master/Discovery/Web-Content/raft-large-directories-lowercase.txt
- https://github.com/oprogramador/most-common-words-by-language/blob/master/src/resources/dutch.txt
- https://github.com/first20hours/google-10000-english/blob/master/google-10000-english.txt
- https://tools.ietf.org/html/rfc1866