
# German Categorized Wordlist

This project contains categorized German words in separate text files.

The experimental version [v1](v1) is now available. It excludes the output of
[gennum](tools/gennum) because of its large size. If you need those files, you can
generate the numbers from 0 to 999999 as text with our [gennum tool](tools/gennum).
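
To illustrate what "numbers in text" means for this range, here is a minimal Python sketch that spells out German cardinal numbers from 0 to 999999. It is not the gennum tool itself: the function names `german_number` and `_below_1000` are hypothetical, and the actual tool's implementation and output format may differ.

```python
# Illustrative sketch only; not the implementation used by tools/gennum.
# Index 1 holds "ein" because that is the form used inside compounds
# (einundzwanzig, einhundert, eintausend); a trailing 1 becomes "eins".
UNITS = [
    "null", "ein", "zwei", "drei", "vier", "fünf", "sechs", "sieben",
    "acht", "neun", "zehn", "elf", "zwölf", "dreizehn", "vierzehn",
    "fünfzehn", "sechzehn", "siebzehn", "achtzehn", "neunzehn",
]
TENS = {
    2: "zwanzig", 3: "dreißig", 4: "vierzig", 5: "fünfzig",
    6: "sechzig", 7: "siebzig", 8: "achtzig", 9: "neunzig",
}


def _below_1000(n: int, final: bool) -> str:
    """Spell 1..999. `final` is True when this group ends the whole number."""
    parts = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        parts.append(UNITS[hundreds] + "hundert")
    if rest == 1:
        parts.append("eins" if final else "ein")
    elif 0 < rest < 20:
        parts.append(UNITS[rest])
    elif rest >= 20:
        tens, unit = divmod(rest, 10)
        # German puts the unit before the ten: 21 -> "ein" + "und" + "zwanzig".
        parts.append((UNITS[unit] + "und" if unit else "") + TENS[tens])
    return "".join(parts)


def german_number(n: int) -> str:
    """Spell a cardinal number between 0 and 999999 as one German word."""
    if not 0 <= n <= 999_999:
        raise ValueError("only 0..999999 is supported")
    if n == 0:
        return "null"
    thousands, rest = divmod(n, 1000)
    word = ""
    if thousands:
        word += _below_1000(thousands, final=False) + "tausend"
    if rest:
        word += _below_1000(rest, final=True)
    return word


print(german_number(21))    # einundzwanzig
print(german_number(1000))  # eintausend
```

The sketch always writes a leading "ein" (`einhundert`, `eintausend`); the shorter forms `hundert` and `tausend` are also common, so a real generator may emit either or both.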

You can download the wordlists from the assets of the
[latest release](https://github.com/ynsrc/german-categorized-wordlist/releases/latest).

# Notes
* Lines are sorted and unique in final output files.
* Files are categorized by word type, with the categories named in English.
* Word lists may contain words that are incorrect, miscategorized, or meaningless.
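
Because each file holds one unique entry per line, a wordlist can be loaded directly into a set for fast membership checks. The following Python sketch shows one way to do that. It assumes the files are UTF-8 encoded, and the file name `nouns.txt` in the comment is a hypothetical example; check the release assets for the actual names.

```python
from pathlib import Path
from typing import FrozenSet, Iterable


def parse_wordlist(lines: Iterable[str]) -> FrozenSet[str]:
    """Collect non-empty lines into a set, stripping surrounding whitespace."""
    return frozenset(line.strip() for line in lines if line.strip())


def load_wordlist(path: str) -> FrozenSet[str]:
    """Read one wordlist file (assumed to be UTF-8, one entry per line)."""
    with Path(path).open(encoding="utf-8") as handle:
        return parse_wordlist(handle)


# In-memory example; with a downloaded file you would instead call
# load_wordlist("nouns.txt"), where "nouns.txt" is a hypothetical name.
sample = ["Apfel\n", "Baum\n", "\n"]
words = parse_wordlist(sample)
print("Apfel" in words)  # True
print("Haus" in words)   # False
```

Since the lists may contain incorrect or miscategorized entries, treat a successful lookup as a hint rather than as proof that a word is valid.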

# Using Tools
Tools are located in the [tools](tools) folder of this repository.

# Sources
Sources are located in the [sources](sources) folder of this repository.

# Attributions
* https://danielnaber.de/morphologie/
* https://de.wiktionary.org
* https://dumps.wikimedia.org/mirrors.html
* https://en.wiktionary.org
* https://en.wiktionary.org/wiki/Category:German_lemmas
* https://extensions.libreoffice.org/en/extensions/show/german-de-de-frami-dictionaries
* https://gist.github.com/MarvinJWendt/2f4f4154b8ae218600eb091a5706b5f4
* https://github.com/adbar/German-NLP
* https://github.com/languagetool-org/german-pos-dict
* https://github.com/michmech/lemmatization-lists
* https://www-user.tu-chemnitz.de/~fri/ding/
* https://www.dwds.de/lemma/list
* https://www.koeblergerhard.de/publikat.html
* https://www.openthesaurus.de/about/download

# License
[German Categorized Wordlist](https://github.com/ynsrc/german-categorized-wordlist)
by [YNSRC](https://github.com/ynsrc) is licensed under
[Creative Commons Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0)

Feel free to use our wordlists in your personal, open-source, or even commercial projects with
attribution. However, if you want to directly use the third-party data sources that we process
to generate the wordlists, observe the license conditions of their respective owners.