https://github.com/cltk/middle_high_german_texts
- Host: GitHub
- URL: https://github.com/cltk/middle_high_german_texts
- Owner: cltk
- Created: 2019-09-18T21:31:52.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2019-09-23T22:13:53.000Z (almost 7 years ago)
- Last Synced: 2025-02-17T08:41:42.284Z (over 1 year ago)
- Topics: middle-high-german, mittelhochdeutsch
- Language: Python
- Size: 1.05 MB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Referenzkorpus Mittelhochdeutsch
## Source and license
[Main page of the project](https://www.linguistics.rub.de/rem/access/index.html)
License:
> The Referenzkorpus Mittelhochdeutsch is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/).
The corpus itself is not modified. The code in this repository is only intended to parse it.
## Corpus retrieval
1. Go to https://www.linguistics.rub.de/rem/access/index.html.
2. Click on "CORA-XML ALS .TAR.XZ" or "CORA-XML ALS .ZIP".
3. Click on "Herunterladen" (download).
4. Uncompress the downloaded file.
5. You now have a folder named **rem-corraled-20161222** (as of 2019-09-18) containing a list of XML files, each of which is an annotated text.
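
The unpacking step above can also be scripted. The following is a minimal sketch using only the Python standard library; it is not part of this repository, and the function name `extract_corpus` and the paths in the usage note are hypothetical.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_corpus(archive_path, target_dir):
    """Unpack a .zip or .tar.xz archive and return the XML files it contains.

    Hypothetical helper for illustration; not provided by this repository.
    """
    archive_path = Path(archive_path)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(target_dir)
    elif tarfile.is_tarfile(archive_path):
        # The default mode "r" detects the compression (including xz) automatically.
        with tarfile.open(archive_path) as archive:
            archive.extractall(target_dir)
    else:
        raise ValueError(f"Unsupported archive format: {archive_path}")

    # Search recursively, because the archive unpacks into its own subfolder.
    return sorted(target_dir.rglob("*.xml"))
```

For example, `extract_corpus("downloads/rem.tar.xz", "corpus")` (both paths are placeholders) would return the sorted list of extracted XML file paths. Only extract archives obtained from the official project page.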
## Code
The available code will parse individual XML files.
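
As a first look at a single file before using the repository's parser, the sketch below loads one XML file with the standard library and counts its element names. It does not assume any particular CORA-XML schema and is not this repository's API; the function name `summarize_xml` and the file name in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_xml(path):
    """Return the root element name and a count of every element name in the file.

    Hypothetical helper for illustration; not provided by this repository.
    """
    tree = ET.parse(path)
    root = tree.getroot()
    # root.iter() walks the root element and all of its descendants.
    tag_counts = Counter(element.tag for element in root.iter())
    return root.tag, tag_counts
```

Calling `summarize_xml("rem-corraled-20161222/some_text.xml")` (placeholder file name) and printing `tag_counts.most_common(10)` gives a quick overview of which annotation elements a text contains.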