https://github.com/cltk/gml_models_cltk
- Host: GitHub
- URL: https://github.com/cltk/gml_models_cltk
- Owner: cltk
- License: mit
- Created: 2018-06-09T20:54:07.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2018-06-12T09:29:56.000Z (about 8 years ago)
- Last Synced: 2025-10-28T02:44:29.725Z (8 months ago)
- Size: 637 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Middle Low German Models
Trained part-of-speech (POS) tagger for Middle Low German (MLG).
Training Set and Citations
==========================
>ReN-Team. 2018. “Referenzkorpus Mittelniederdeutsch/Niederrheinisch (1200-1650).” Archived in Hamburger Zentrum für Sprachkorpora. Version 0.6. Publication date 2018-03-07. http://hdl.handle.net/11022/0000-0007-C64C-5.
>Ingrid Schröder. 2014. “Das Referenzkorpus: Neue Perspektiven für die mittelniederdeutsche Grammatikographie.” Jahrbuch für Germanistische Sprachgeschichte, 5(1), pp. 150-164. http://www.degruyter.com/view/j/jbgsg.2014.5.issue-1/jbgsg-2014-0011/jbgsg-2014-0011.xml .
>Robert Peters, Norbert Nagel. 2014. “Das digitale ‚Referenzkorpus Mittelniederdeutsch / Niederrheinisch (ReN)‘.” Jahrbuch für Germanistische Sprachgeschichte, 5(1), pp. 165-175. https://www.degruyter.com/view/j/jbgsg.2014.5.issue-1/jbgsg-2014-0012/jbgsg-2014-0012.xml .
Training the Tagger
===================
``` python
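>>> from nltk import DefaultTagger, UnigramTagger, BigramTagger
>>> # `tags` is assumed to be one tagged sentence: a list of (token, tag) pairs
>>> t0 = DefaultTagger('NA')  # fallback: unknown tokens get the tag 'NA'
>>> t1 = UnigramTagger([tags], backoff=t0)  # per-token lookup, backs off to t0
>>> t2 = BigramTagger([tags], backoff=t1)  # adds the previous tag as context, backs off to t1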
```
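The snippet above leaves `tags` undefined. As a minimal sketch, assuming the training data is a list of sentences where each sentence is a list of `(token, tag)` pairs, the backoff chain could be trained and tried out as follows. The tokens, the tag names, and the helper `train_backoff_tagger` are illustrative placeholders, not taken from the ReN corpus or this repository.

```python
from nltk.tag import DefaultTagger, UnigramTagger, BigramTagger

# Hypothetical toy data: each sentence is a list of (token, tag) pairs.
# Neither the tokens nor the tag names come from the ReN corpus.
train_sents = [
    [("dat", "DET"), ("bok", "NOUN"), ("is", "VERB"), ("gut", "ADJ")],
    [("de", "DET"), ("man", "NOUN"), ("is", "VERB"), ("olt", "ADJ")],
]


def train_backoff_tagger(sentences):
    """Build a default -> unigram -> bigram backoff chain (hypothetical helper)."""
    t0 = DefaultTagger("NA")                    # last resort for unseen tokens
    t1 = UnigramTagger(sentences, backoff=t0)   # most frequent tag per token
    t2 = BigramTagger(sentences, backoff=t1)    # token plus previous tag as context
    return t2


tagger = train_backoff_tagger(train_sents)

# "nye" never occurs in the toy data, so it falls through to the default tag.
result = tagger.tag(["dat", "bok", "is", "nye"])
print(result)
```

Tokens seen in training are resolved by the unigram or bigram tagger, while unseen tokens end up with the default tag `'NA'`.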
Saving the Tagger
=================
``` python
>>> import pickle
>>> ffile = open("backoff_tagger.pickle", 'wb')
>>> pickle.dump(t2, ffile)
>>> ffile.close()
```
Loading the Tagger
==================
``` python
>>> import pickle
>>> ffile = open('backoff_tagger.pickle', 'rb')
>>> tagger = pickle.load(ffile)
>>> ffile.close()
```
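Once loaded, the tagger is used through the standard NLTK `tag` method, which takes a list of tokens. The following is a rough, self-contained sketch of a save, load, and tag round trip. It trains a tiny stand-in tagger on made-up data and writes it to a temporary directory, so the toy sentence, the tag names, and the file location are assumptions for illustration only. In practice the pickle shipped with this repository would be loaded instead.

```python
import os
import pickle
import tempfile

from nltk.tag import DefaultTagger, UnigramTagger

# Stand-in tagger trained on one made-up sentence (not corpus data).
toy_sents = [[("de", "DET"), ("man", "NOUN")]]
toy_tagger = UnigramTagger(toy_sents, backoff=DefaultTagger("NA"))

# Write to a temporary directory so the example does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "backoff_tagger.pickle")

# `with` closes the file even if pickling raises an exception.
with open(path, "wb") as fh:
    pickle.dump(toy_tagger, fh)

with open(path, "rb") as fh:
    loaded = pickle.load(fh)

# tag() expects an already tokenized sentence (a list of strings).
tagged = loaded.tag(["de", "man", "unbekant"])
print(tagged)
```

Using `with` blocks avoids the explicit `close()` calls. Because `pickle.load` can execute arbitrary code, only pickle files from a trusted source should be loaded.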