https://github.com/cltk/old-norse-lemmatizer

inflection lemmatizer nlp old-norse

Host: GitHub
URL: https://github.com/cltk/old-norse-lemmatizer
Owner: cltk
Created: 2019-02-18T10:07:33.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2019-03-10T23:11:49.000Z (over 6 years ago)
Last Synced: 2025-02-17T08:41:42.532Z (8 months ago)
Topics: inflection, lemmatizer, nlp, old-norse
Language: Jupyter Notebook
Size: 49.8 KB
Stars: 2
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Old Norse lemmatizer

The main aim of this repository is to generate all the forms that Old Norse words can have, given Zoëga's
dictionary and the Old Norse inflection rules.

## TODOs

* [x] Retrieve the lemmata of [Zoëga's dictionary](https://github.com/cltk/old_norse_dictionary_zoega).
* [ ] For each lemma, extract the word category (noun, verb, pronoun, article, numeral, etc.). One lemma can belong to several word categories.
* [ ] For a given lemma in a given word category, extract the precise subcategory: a verb may be strong, weak or preterite-present; a noun may be strong or weak, and masculine, feminine or neuter; etc.
* [ ] For a given lemma with its precise word category, generate all its possible forms using the [Old Norse inflection module of CLTK](https://github.com/cltk/cltk/tree/master/cltk/inflection/old_norse).
* [ ] Compute the frequency of each word form in Old Norse corpora.
* [ ] Make lemmatizers for Old Norse.
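
The steps above could fit together roughly as in the following sketch. It is not the code of this repository and does not call CLTK: `generate_forms`, `build_form_index`, `lemmatize` and the ending table are hypothetical names introduced for illustration, and the paradigm is a heavily simplified one for strong masculine a-stem nouns (no umlaut, no assimilation). The idea is to generate every form of every lemma, invert that mapping into a form-to-lemma index, and optionally rank ambiguous candidates by frequency.

```python
from collections import defaultdict

# Hypothetical, simplified endings for a strong masculine a-stem noun:
# nom/acc/dat/gen singular, then nom/acc/dat/gen plural.
# Real Old Norse inflection (umlaut, assimilation, ...) is NOT modelled here.
STRONG_MASC_A_ENDINGS = ["r", "", "i", "s", "ar", "a", "um", "a"]


def generate_forms(lemma):
    """Return the set of toy inflected forms for a lemma ending in -r."""
    stem = lemma[:-1] if lemma.endswith("r") else lemma
    return {stem + ending for ending in STRONG_MASC_A_ENDINGS}


def build_form_index(lemmata):
    """Invert lemma -> forms into form -> set of candidate lemmata."""
    index = defaultdict(set)
    for lemma in lemmata:
        for form in generate_forms(lemma):
            index[form].add(lemma)
    return index


def lemmatize(token, index, lemma_freq=None):
    """Return candidate lemmata, most frequent first (ties sorted alphabetically)."""
    lemma_freq = lemma_freq or {}
    candidates = index.get(token.lower(), set())
    return sorted(candidates, key=lambda lemma: (-lemma_freq.get(lemma, 0), lemma))


index = build_form_index(["hestr", "fiskr"])
print(lemmatize("hestum", index))  # ['hestr']
print(lemmatize("fiska", index))   # ['fiskr']
```

In a real pipeline the ending table would be replaced by the inflection rules for each subcategory, and `lemma_freq` would come from counts over an Old Norse corpus.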
