https://github.com/doxakis/lemmatisationdemo
Lemmatisation demo (normalize words for machine learning)
- Host: GitHub
- URL: https://github.com/doxakis/lemmatisationdemo
- Owner: doxakis
- License: mit
- Created: 2017-07-27T04:16:56.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2017-07-27T04:31:41.000Z (almost 9 years ago)
- Last Synced: 2025-07-02T03:37:40.135Z (12 months ago)
- Language: C#
- Size: 37.2 MB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Lemmatisation demo
This demo uses the lemmatisation library from http://lemmatise.ijs.si/
## Purpose
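- Help normalize words for machine learning: inflected forms such as "running" and "ran" are reduced to a single lemma ("run"), so a model treats them as the same feature.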
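
To illustrate why this helps, here is a minimal sketch of counting word frequencies after normalization. It does not call the lemmatisation library; the `ToyLemmas` table, `Normalize`, and `CountTokens` are hypothetical names made up for this example, and the table only knows a handful of words.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BagOfWordsDemo
{
    // Hypothetical hand-written lemma table, for illustration only.
    // A real lemmatiser covers a full vocabulary instead of four entries.
    private static readonly Dictionary<string, string> ToyLemmas =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            ["cats"] = "cat",
            ["running"] = "run",
            ["ran"] = "run",
            ["runs"] = "run",
        };

    // Returns the lemma if the toy table knows the word, otherwise the lowercased word.
    public static string Normalize(string word) =>
        ToyLemmas.TryGetValue(word, out var lemma) ? lemma : word.ToLowerInvariant();

    // Counts how often each normalized token occurs.
    public static Dictionary<string, int> CountTokens(IEnumerable<string> words) =>
        words.Select(Normalize)
             .GroupBy(w => w)
             .ToDictionary(g => g.Key, g => g.Count());

    public static void Main()
    {
        var words = "The cat ran while other cats were running".Split(' ');

        // Print the counts in a stable, alphabetical order.
        foreach (var pair in CountTokens(words).OrderBy(p => p.Key, StringComparer.Ordinal))
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```

Without normalization the sentence yields eight distinct tokens; with it, "cat" and "run" are each counted twice, which is the kind of merging a bag-of-words model benefits from.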
## Wikipedia:
> Lemmatisation (or lemmatization) in linguistics is the process of grouping together the inflected forms of a word so they can be analysed as a single item, identified by the word's lemma, or dictionary form.
> In computational linguistics, lemmatisation is the algorithmic process of determining the lemma of a word based on its intended meaning. Unlike stemming, lemmatisation depends on correctly identifying the intended part of speech and meaning of a word in a sentence, as well as within the larger context surrounding that sentence, such as neighboring sentences or even an entire document. As a result, developing efficient lemmatisation algorithms is an open area of research.
Source: https://en.wikipedia.org/wiki/Lemmatisation
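
The difference between stemming and lemmatisation described in the quote can be sketched roughly as follows. This is a toy under stated assumptions, not the algorithm of any real library: `NaiveStem` is a crude suffix stripper standing in for a stemmer, and `ToyLemmas` is a hypothetical lookup table keyed by word and part of speech.

```csharp
using System;
using System.Collections.Generic;

public static class StemVersusLemmaDemo
{
    // Crude suffix stripping: a stand-in for stemming, not a real stemmer.
    // It looks only at the characters of the word, never at its context.
    public static string NaiveStem(string word)
    {
        foreach (var suffix in new[] { "ing", "s" })
        {
            if (word.EndsWith(suffix, StringComparison.Ordinal) && word.Length > suffix.Length + 2)
                return word.Substring(0, word.Length - suffix.Length);
        }
        return word;
    }

    // Hypothetical lemma table keyed by (word, part of speech), for illustration only.
    private static readonly Dictionary<(string Word, string Pos), string> ToyLemmas =
        new Dictionary<(string Word, string Pos), string>
        {
            [("meeting", "noun")] = "meeting",
            [("meeting", "verb")] = "meet",
            [("better", "adjective")] = "good",
        };

    // Returns the lemma for a word in a given part of speech, or the word itself if unknown.
    public static string Lemmatize(string word, string pos) =>
        ToyLemmas.TryGetValue((word, pos), out var lemma) ? lemma : word;

    public static void Main()
    {
        Console.WriteLine(NaiveStem("meeting"));              // meet
        Console.WriteLine(Lemmatize("meeting", "noun"));      // meeting
        Console.WriteLine(Lemmatize("meeting", "verb"));      // meet
        Console.WriteLine(NaiveStem("better"));               // better
        Console.WriteLine(Lemmatize("better", "adjective"));  // good
    }
}
```

The stemmer always turns "meeting" into "meet" and cannot relate "better" to "good", while the lookup keyed by part of speech can give a different answer for the noun and the verb. In practice the part of speech has to come from a tagger or from context, which is the hard part the quote refers to.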
## Preview
## Copyright and license
Code released under the MIT license.