{"id":13540407,"url":"https://github.com/initstring/passphrase-wordlist","last_synced_at":"2025-04-08T17:17:31.223Z","repository":{"id":46609245,"uuid":"113231811","full_name":"initstring/passphrase-wordlist","owner":"initstring","description":"Passphrase wordlist and hashcat rules for offline cracking of long, complex passwords","archived":false,"fork":false,"pushed_at":"2023-11-14T11:46:14.000Z","size":368,"stargazers_count":1281,"open_issues_count":2,"forks_count":171,"subscribers_count":38,"default_branch":"master","last_synced_at":"2025-04-01T15:14:40.356Z","etag":null,"topics":["hacking","infosec","password-cracking","penetration-testing","pentesting","wordlist"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/initstring.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-12-05T20:53:13.000Z","updated_at":"2025-03-31T10:59:37.000Z","dependencies_parsed_at":"2024-01-16T22:21:53.953Z","dependency_job_id":"7bdd23c0-c8d9-4c30-a01c-256ce8966245","html_url":"https://github.com/initstring/passphrase-wordlist","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/initstring%2Fpassphrase-wordlist","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/initstring%2Fpassphrase-wordlist/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/initstring%2Fpassphrase-wordlist/releases","manifests_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/initstring%2Fpassphrase-wordlist/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/initstring","download_url":"https://codeload.github.com/initstring/passphrase-wordlist/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247888559,"owners_count":21013001,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hacking","infosec","password-cracking","penetration-testing","pentesting","wordlist"],"created_at":"2024-08-01T09:01:49.412Z","updated_at":"2025-04-08T17:17:31.204Z","avatar_url":"https://github.com/initstring.png","language":"Python","funding_links":[],"categories":["\u003ca id=\"609214b7c4d2f9bb574e2099313533a2\"\u003e\u003c/a\u003ewordlist","Python (1887)","Python"],"sub_categories":["\u003ca id=\"af1d71122d601229dc4aa9d08f4e3e15\"\u003e\u003c/a\u003e未分类-wordlist"],"readme":"# Overview\n\nPeople think they are getting smarter by using passphrases. Let's prove them wrong!\n\nThis project includes a massive wordlist of phrases (over 20 million) and two hashcat rule files for GPU-based cracking. The rules will create over 1,000 permutations of each phrase.\n\nTo use this project, you need:\n\n- The wordlist `passphrases.txt`, which you can find under [releases](https://github.com/initstring/passphrase-wordlist/releases).\n- Both hashcat rules [here](/hashcat-rules/).\n\n**WORDLIST LAST UPDATED**: November 2022\n\n# Usage\n\nGenerally, you will use this with hashcat's `-a 0` mode, which takes a wordlist and allows rule files. 
It is important to use the rule files in the correct order, as rule #1 mostly handles capital letters and spaces, and rule #2 deals with permutations.\n\nHere is an example for NTLMv2 hashes. If you use the `-O` option, watch out for the maximum password length it sets, as it may be too short.\n\n```\nhashcat -a 0 -m 5600 hashes.txt passphrases.txt -r passphrase-rule1.rule -r passphrase-rule2.rule -O -w 3\n```\n\n# Sources Used\n\nSome sources are pulled from a static dataset, like a Kaggle upload. Others I generate myself using various scripts and APIs. I might one day automate that via CI, but for now you can see how I update the dynamic sources [here](/utilities/updating-sources.md).\n\n| \u003cins\u003e**source file name**\u003c/ins\u003e | \u003cins\u003e**source type**\u003c/ins\u003e | \u003cins\u003e**description**\u003c/ins\u003e |\n| --- | --- | --- |\n| wiktionary-2022-11-19.txt | dynamic | Article titles scraped from Wiktionary's index dump [here.](https://dumps.wikimedia.org/enwiktionary) |\n| wikipedia-2022-11-19.txt | dynamic | Article titles scraped from the Wikipedia `pages-articles-multistream-index` dump generated 29-Sept-2021 [here.](https://dumps.wikimedia.org/enwiki) |\n| urban-dictionary-2022-11-19.txt | dynamic | Urban Dictionary dataset pulled using [this script](https://github.com/mattbierner/urban-dictionary-word-list). |\n| know-your-meme-2022-11-19.txt | dynamic | Meme titles from Know Your Meme scraped using my tool [here.](/utilities/kym_scrape.py) |\n| imdb-titles-2022-11-19.txt | dynamic | IMDB dataset using the \"primaryTitle\" column from the `title.basics.tsv.gz` file available [here.](https://datasets.imdbws.com/) |\n| global-poi-2022-11-19.txt | dynamic | [Global POI dataset](https://download.geonames.org/export/dump/) using the 'allCountries' file from 29-Sept-2021. 
|\n| billboard-titles-2022-11-19.txt | dynamic | Album and track names using [Ultimate Music Database](https://www.umdmusic.com/), scraped with [a fork of mwkling's tool](https://github.com/initstring/umdmusic-downloader), modified to grab Billboard Singles (1940-2021) and Billboard Albums (1970-2021) charts. |\n| billboard-artists-2022-11-19.txt | dynamic | Artist names using [Ultimate Music Database](https://www.umdmusic.com/), scraped with [a fork of mwkling's tool](https://github.com/initstring/umdmusic-downloader), modified to grab Billboard Singles (1940-2021) and Billboard Albums (1970-2021) charts. |\n| book.txt | static | Kaggle dataset with titles from over 300,000 books. |\n| rstone-top-100.txt | static\u003cbr\u003e(could be dynamic in future) | Song lyrics for Rolling Stone's \"top 100\" artists using my [lyric scraping tool](https://github.com/initstring/lyricpass). |\n| cornell-movie-titles-raw.txt | static | Movie titles from this [Cornell project](https://www.cs.cornell.edu/~cristian//Cornell_Movie-Dialogs_Corpus.html). |\n| cornell-movie-lines.txt | static | Movie lines from this [Cornell project](https://www.cs.cornell.edu/~cristian//Cornell_Movie-Dialogs_Corpus.html). |\n| author-quotes-raw.txt | static | [Quotables](https://www.kaggle.com/alvations/quotables) dataset on Kaggle. |\n| 1800-phrases-raw.txt | static | [1,800 English Phrases.](https://www.phrases.org.uk/meanings/phrases-and-sayings-list.html) |\n| 15k-phrases-raw.txt | static | [15,000 Useful Phrases.](https://www.gutenberg.org/ebooks/18362) |\n\n# Hashcat Rules\n\nThe rule files are designed to both \"shape\" the password and to mutate it. Shaping is based on the idea that human beings follow fairly predictable patterns when choosing a password, such as capitalising the first letter of each word and following the phrase with a number or special character. 
Mutations are also fairly predictable, such as replacing letters with visually similar special characters.\n\nGiven the phrase `take the red pill`, the first hashcat rule will output the following:\n\n```\ntake the red pill\ntake-the-red-pill\ntake.the.red.pill\ntake_the_red_pill\ntaketheredpill\nTake the red pill\nTAKE THE RED PILL\ntAKE THE RED PILL\nTaketheredpill\ntAKETHEREDPILL\nTAKETHEREDPILL\nTake The Red Pill\nTakeTheRedPill\nTake-The-Red-Pill\nTake.The.Red.Pill\nTake_The_Red_Pill\n```\n\nAdding the second hashcat rule makes things a bit more interesting: it returns a huge list per candidate. Here are a couple of examples:\n\n```\nT@k3Th3R3dPill!\nT@ke-The-Red-Pill\ntaketheredpill2020!\nT0KE THE RED PILL\n```\n\n# Additional Info\n\nSome researchers might also be interested in the script I use to clean the raw sources into the wordlist, available [here](/utilities/cleanup.py).\n\nThe cleanup script works like this:\n\n```\n$ python3.6 cleanup.py infile.txt outfile.txt\nReading from ./infile.txt: 505 MB\nWrote to ./outfile.txt: 250 MB\nElapsed time: 0:02:53.062531\n\n```\n\nEnjoy!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finitstring%2Fpassphrase-wordlist","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finitstring%2Fpassphrase-wordlist","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finitstring%2Fpassphrase-wordlist/lists"}