https://github.com/cltk/gmh_models_cltk

Stored data for tagging Middle High German

cltk lemmatizer middle-high-german pos-tagger

# gmh_models_cltk

Pre-compiled taggers and tokenizers made with NLTK + CLTK data.
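As a rough illustration of what "pre-compiled" can mean in practice, the sketch below serializes a tiny lookup model with Python's `pickle` module and loads it back before tagging. It rests on assumptions throughout: this README does not state the file names or the serialization format of the stored models, so the file name `pos_lookup.pickle`, the dictionary-based model, the helper functions, and the tag labels are all hypothetical stand-ins rather than the repository's actual contents.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialize a trained model to disk so it need not be retrained."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    """Load a serialized model. Only unpickle files from a trusted source,
    because unpickling can execute arbitrary code."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def tag(tokens, table, default="UNK"):
    """Tag each token by lowercase lookup, falling back to a default tag."""
    return [(tok, table.get(tok.lower(), default)) for tok in tokens]


# Hypothetical toy model: the tags are illustrative, not the corpus tagset.
toy_table = {"uns": "PRON", "ist": "VERB", "wunders": "NOUN"}

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "pos_lookup.pickle"  # hypothetical file name
    save_model(toy_table, model_path)
    loaded_table = load_model(model_path)

tagged = tag(["Uns", "ist", "wunders", "vil"], loaded_table)
print(tagged)
```

A real tagger built with NLTK would be a richer object than a dictionary, but the save and load round trip would follow the same pattern if the models are distributed as pickles.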

## Origin of data

The data come from the Referenzkorpus Mittelhochdeutsch (https://www.linguistics.rub.de/rem/access/index.html) and are available under the [CC BY-SA 4.0 license](https://creativecommons.org/licenses/by-sa/4.0/).

> Klein, Thomas; Wegera, Klaus-Peter; Dipper, Stefanie; Wich-Reif, Claudia (2016). Referenzkorpus Mittelhochdeutsch (1050–1350), Version 1.0, https://www.linguistics.ruhr-uni-bochum.de/rem/. [ISLRN 332-536-136-099-5](http://islrn.org/resources/332-536-136-099-5/).

## Lemmata

## Named entity recognition

## Taggers

### Parts of speech taggers

## Tokenizers