Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/stefanhaustein/nlp
Processing Whitaker's Words with Java
Last synced: 22 days ago
- Host: GitHub
- URL: https://github.com/stefanhaustein/nlp
- Owner: stefanhaustein
- License: apache-2.0
- Created: 2019-02-17T23:39:23.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-02-06T10:13:06.000Z (over 2 years ago)
- Last Synced: 2023-07-31T13:09:41.970Z (over 1 year ago)
- Language: Java
- Size: 2.92 MB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Processing Whitaker's Words with Java
## Deviations from the original
- The dictionary used is based on the generated human-readable word list found on
http://archives.nd.edu/whitaker/dictpage.htm. It has been processed further to reduce
redundancies and to improve human-readability while keeping the file easy to parse.
- The processed file is named [whitaker_converted.txt](https://raw.githubusercontent.com/stefanhaustein/nlp/master/src/org/kobjects/nlp/latin/whitaker_converted.txt) and is contained in the package
[org.kobjects.nlp.latin](https://github.com/stefanhaustein/nlp/tree/master/src/org/kobjects/nlp/latin).
- The corresponding processing code and original input can be found in the package
org.kobjects.nlp.latin.whitaker.
- Inflections are part of the code; see [Declinator.java](https://github.com/stefanhaustein/nlp/blob/master/src/org/kobjects/nlp/latin/Declinator.java) and Conjugator.java in org.kobjects.nlp.latin.
- The program currently just generates all word forms in memory when loading the dictionary.
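
The last point can be illustrated with a minimal sketch of building an in-memory index of word forms at load time. This is not the repository's code: the class `FormIndexSketch`, the method `buildIndex`, and the stem-to-lemma input map are hypothetical names chosen for illustration. It covers only first-declension nouns, omits the vocative, and ignores vowel length.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; these are not the repository's actual classes. */
class FormIndexSketch {
    // First-declension endings, singular then plural: nom, gen, dat, acc, abl.
    static final String[] FIRST_DECLENSION_ENDINGS = {
        "a", "ae", "ae", "am", "a",
        "ae", "arum", "is", "as", "is"
    };

    // Grammatical label for the ending at the same position above.
    static final String[] LABELS = {
        "nom sg", "gen sg", "dat sg", "acc sg", "abl sg",
        "nom pl", "gen pl", "dat pl", "acc pl", "abl pl"
    };

    /**
     * Expands every stem into all of its forms up front and maps each
     * surface form to the list of analyses that produce it.
     * The input maps a stem (e.g. "puell") to its lemma (e.g. "puella").
     */
    static Map<String, List<String>> buildIndex(Map<String, String> stemToLemma) {
        Map<String, List<String>> index = new HashMap<>();
        for (Map.Entry<String, String> entry : stemToLemma.entrySet()) {
            for (int i = 0; i < FIRST_DECLENSION_ENDINGS.length; i++) {
                String form = entry.getKey() + FIRST_DECLENSION_ENDINGS[i];
                // Several case/number combinations can share one surface form.
                index.computeIfAbsent(form, k -> new ArrayList<>())
                     .add(entry.getValue() + " (" + LABELS[i] + ")");
            }
        }
        return index;
    }
}
```

Calling `FormIndexSketch.buildIndex(Map.of("puell", "puella"))` and then looking up `"puellam"` in the result would return the single analysis `puella (acc sg)`. Generating everything up front trades memory for lookups that need no stemming logic at query time.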
## Words Preservation
For a maintained version of William Whitaker's original WORDS program,
please refer to https://github.com/mk270/whitakers-words