https://github.com/cltk/gmh_models_cltk
Stored data for tagging Middle High German
cltk lemmatizer middle-high-german pos-tagger
- Host: GitHub
- URL: https://github.com/cltk/gmh_models_cltk
- Owner: cltk
- Created: 2019-10-09T18:56:35.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2020-05-06T12:53:41.000Z (about 6 years ago)
- Last Synced: 2025-02-17T08:41:42.834Z (over 1 year ago)
- Topics: cltk, lemmatizer, middle-high-german, pos-tagger
- Language: Python
- Homepage:
- Size: 19.9 MB
- Stars: 1
- Watchers: 4
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# gmh_models_cltk
Pre-compiled taggers and tokenizers made with NLTK + CLTK data.
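
As a rough illustration of what "pre-compiled" can mean here, the sketch below trains a tiny NLTK `UnigramTagger`, serializes it with `pickle`, and loads it back for tagging. It makes several assumptions: the two training sentences, the tag labels, and the file name `toy_unigram_tagger.pickle` are invented for the example and do not reflect the files or tagset actually shipped in this repository.

```python
import pickle
import tempfile
from pathlib import Path

from nltk.tag import UnigramTagger

# Toy training data: hand-tagged Middle High German tokens.
# The tags are illustrative only and are NOT the tagset used by the real models.
train_sents = [
    [("uns", "PRON"), ("ist", "VERB"), ("in", "ADP"), ("alten", "ADJ"), ("mæren", "NOUN")],
    [("wunders", "NOUN"), ("vil", "ADV"), ("geseit", "VERB")],
]

# Train a unigram tagger: each word gets the tag it was most often seen with.
tagger = UnigramTagger(train=train_sents)

# "Pre-compile" the tagger by serializing the trained object to disk.
# The file name is hypothetical.
model_path = Path(tempfile.mkdtemp()) / "toy_unigram_tagger.pickle"
with model_path.open("wb") as f:
    pickle.dump(tagger, f)

# Later (or on another machine): load the stored tagger and use it directly,
# without retraining.
with model_path.open("rb") as f:
    loaded = pickle.load(f)

tagged = loaded.tag(["uns", "ist", "vil", "geseit"])
unknown = loaded.tag(["ritter"])  # a word not seen in training gets the tag None
print(tagged)
print(unknown)
```

Because unpickling can execute arbitrary code, stored models should only be loaded from sources you trust.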
## Origin of data
Data come from https://www.linguistics.rub.de/rem/access/index.html and are available under the [CC BY-SA 4.0 license](https://creativecommons.org/licenses/by-sa/4.0/).
> Klein, Thomas; Wegera, Klaus-Peter; Dipper, Stefanie; Wich-Reif, Claudia (2016). Referenzkorpus Mittelhochdeutsch (1050–1350), Version 1.0, https://www.linguistics.ruhr-uni-bochum.de/rem/. [ISLRN 332-536-136-099-5](http://islrn.org/resources/332-536-136-099-5/).
## Lemmata
## Named entity recognition
## Taggers
### Parts of speech taggers
## Tokenizers