https://github.com/mevdschee/spelwijze-generator
Generator for the spelling game that is published on Dutch newspaper websites
command-line-tool game-generator spelling spelling-game word-game
- Host: GitHub
- URL: https://github.com/mevdschee/spelwijze-generator
- Owner: mevdschee
- License: mit
- Created: 2024-09-09T21:54:36.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-16T01:12:16.000Z (over 1 year ago)
- Last Synced: 2025-04-26T15:02:39.826Z (about 1 year ago)
- Topics: command-line-tool, game-generator, spelling, spelling-game, word-game
- Language: Go
- Homepage:
- Size: 4.52 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Spelwijze generator
"Spelwijze" is a spelling game that is published on Dutch newspaper websites. You get 1 mandatory letter and 6 optional letters. The goal is to make as many words of 4 or more letters as possible from these 7 letters, where each word contains the mandatory letter at least once. This repository contains a tool to generate these puzzles.
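The rules above can be expressed as a small check. The following is a minimal sketch, not the repository's actual code: the function name `isValidWord` and the convention that the first letter of `letters` is the mandatory one are assumptions made for illustration, and words are assumed to be lowercase a-z.

```go
package main

import (
	"fmt"
	"strings"
)

// isValidWord reports whether word is a valid answer for a puzzle.
// Assumption: letters holds the 7 puzzle letters, mandatory letter first.
// The name and signature are hypothetical, not taken from the repository.
func isValidWord(word, letters string) bool {
	if len(word) < 4 || len(letters) == 0 {
		return false // words need at least 4 letters
	}
	if !strings.Contains(word, letters[:1]) {
		return false // the mandatory letter must occur at least once
	}
	// Every letter of the word must be one of the puzzle letters.
	for _, r := range word {
		if !strings.ContainsRune(letters, r) {
			return false
		}
	}
	return true
}

func main() {
	for _, w := range []string{"bloem", "bol", "beton", "molen", "bloesem"} {
		fmt.Printf("%-8s %v\n", w, isValidWord(w, "mbelnot"))
	}
}
```

Letters may be reused any number of times, which is why the check only looks at set membership and not at letter counts.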
### Sources
Download Dutch words from:
https://www.opentaal.org/bestanden/file/2-woordenlijst-v-2-10g-bronbestanden
To drop all words that contain characters other than a-z (going from 164313 to 156280 words) execute:
cat 'OpenTaal-210G-basis-gekeurd.txt' | grep -vP '[^a-z]' | sort | uniq | gzip > words.txt.gz
Download Dutch word frequencies from:
https://wortschatz.uni-leipzig.de/en/download/Dutch
To filter the word frequency list (go from 1000000 to 515630 words) execute:
cat 'nld_mixed_2012_1M-words.txt' | cut -f 2,3 | tr A-Z a-z | grep -P '^[a-z]+\t' | gzip > wordfreq.txt.gz
The text files are gzipped to save disk space.
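Reading such a gzipped list back in Go takes only the standard library. This is a sketch under assumptions: `readWords` and `gzipLines` are hypothetical helper names, the input is assumed to hold one word per line, and the demo compresses a few words in memory instead of opening `words.txt.gz` so that it runs without any files.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// readWords decompresses a gzip stream and returns its non-empty lines.
// Hypothetical helper, not taken from the repository.
func readWords(r io.Reader) ([]string, error) {
	zr, err := gzip.NewReader(r)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	var words []string
	sc := bufio.NewScanner(zr)
	for sc.Scan() {
		if w := sc.Text(); w != "" {
			words = append(words, w)
		}
	}
	return words, sc.Err()
}

// gzipLines compresses the given lines in memory (demo helper only).
func gzipLines(lines ...string) *bytes.Buffer {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	for _, l := range lines {
		fmt.Fprintln(zw, l)
	}
	zw.Close()
	return &buf
}

func main() {
	words, err := readWords(gzipLines("bloem", "bloembol", "molen"))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(words), words)
}
```

To read the real list, pass the result of `os.Open("words.txt.gz")` to `readWords` instead of the in-memory buffer.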
Optional: Download another great list from:
https://kaikki.org/dictionary/Dutch/words/index.html
To filter the verbs:
cat kaikki.org-dictionary-Dutch.jsonl | grep '"pos": "verb"' | grep -o '"head_templates": \[{"name": "nl-verb", "args": {}, "expansion": "[a-z]\{4,\}"' | sort | uniq | cut -d\" -f 12 | gzip > verbs.txt.gz
To filter the nouns:
cat kaikki.org-dictionary-Dutch.jsonl | grep '"pos": "noun"' | grep -v '"plural"' | grep -o '"word": "[a-z]\+"' | cut -d: -f2 | cut -d\" -f 2 | sort | uniq | gzip > nouns.txt.gz
To add and combine these:
mv words.txt.gz words1.txt.gz
zcat words1.txt.gz verbs.txt.gz nouns.txt.gz | sort | uniq | gzip > words.txt.gz
rm words1.txt.gz verbs.txt.gz nouns.txt.gz
Now the extra words are added.
### Running
Now pick a length for your seeding word (a word with exactly 7 different letters) and run:
go run . 16
Showing all 16 letter words consisting of exactly 7 different letters:
begijnenbeweging
binnenduingebied
bloembollenteelt
concernonderdeel
engineeringgroep
espressoapparaat
exercitieterrein
geestesgestoorde
herinterpreteren
intentionaliteit
...(23 more)...
Now pick a seeding word and run:
go run . bloembollenteelt
This results in 7 different 7-letter combinations, each starting with its mandatory letter (points are based on word frequency):
tbelmno: 728
eblmnot: 725
nbelmot: 711
obelmnt: 534
belmnot: 194
mbelnot: 164
lbemnot: 119
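The combinations in this listing follow a visible pattern: the mandatory letter comes first, followed by the other six letters in alphabetical order. The sketch below reproduces that pattern. The scoring part is an assumption: the README only says points are based on word frequency, so `score` simply sums the frequencies of all valid words, and the frequency values in `main` are made up for illustration. All function names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// combinations returns, for each distinct letter of seed, that letter
// followed by the remaining distinct letters in alphabetical order.
func combinations(seed string) []string {
	seen := map[rune]bool{}
	var letters []rune
	for _, r := range seed {
		if !seen[r] {
			seen[r] = true
			letters = append(letters, r)
		}
	}
	sort.Slice(letters, func(i, j int) bool { return letters[i] < letters[j] })
	var out []string
	for i, m := range letters {
		out = append(out, string(m)+string(letters[:i])+string(letters[i+1:]))
	}
	return out
}

// score sums the frequency of every word that is valid for combo.
// Assumption: the real tool's scoring formula is not documented here.
func score(combo string, freq map[string]int) int {
	total := 0
	for w, f := range freq {
		// strings.Trim leaves "" only if w consists solely of combo letters.
		if len(w) >= 4 && strings.Contains(w, combo[:1]) && strings.Trim(w, combo) == "" {
			total += f
		}
	}
	return total
}

func main() {
	// Illustrative frequencies, not taken from the Leipzig corpus.
	freq := map[string]int{"bloem": 10, "beton": 5, "molen": 3, "bol": 100}
	for _, c := range combinations("bloembollenteelt") {
		fmt.Printf("%s: %d\n", c, score(c, freq))
	}
}
```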
Now if we choose "mbelnot" (where "m" is the mandatory letter), we can run:
go run . mbelnot
This finds all 110 words of at least 4 letters that contain the letter "m" and otherwise use only the other 6 letters:
beetnemen
bemeten
benemen
benoemen
betomen
betonelement
betonmolen
bloem
bloembol
bloembollenteelt
...(100 more)...
Enjoy!