https://github.com/inanyan/eng-ua-translator1
Rule based English to Ukrainian translator
english lisp nlp racket translator ukrainian
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/inanyan/eng-ua-translator1
- Owner: InAnYan
- Created: 2023-10-04T09:04:49.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-18T08:42:34.000Z (about 1 year ago)
- Last Synced: 2023-11-18T09:28:40.764Z (about 1 year ago)
- Topics: english, lisp, nlp, racket, translator, ukrainian
- Language: Racket
- Homepage:
- Size: 41 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Rule based English to Ukrainian translator
A very simple translator that barely handles the Present Simple tense.
It currently doesn't work because the dictionary interface is unimplemented, and the declension part is also unfinished.
I'm thinking about rewriting it in Python and making it much simpler.
## Dependencies
- `amb-parser` (my package, actually).
## Algorithm:
1. Parse English sentence into parse tree (`parsing.rkt`, `english-grammar.rkt`, `dictionary.rkt`):
   1. Downcase string.
   2. Remove punctuation characters.
   3. Split string into words by whitespace.
   4. Tag POS for words.
   5. Parse the sentence. (`amb-parser` handles ambiguity, so it returns all possible parse trees).
2. Transform the English parse tree into a Ukrainian parse tree. This is the heart of the translator: it uses rules and Racket's `match` form to transform English grammar patterns into Ukrainian grammar patterns.
   The only things it handles are:
   - Removing `am`, `is`, `are` from simple sentences (like: `My name is Anton.`).
   - Removing determiners `a`, `an`, `the`.
   - Transforming sentences with `have`/`has` (like: `I have a pen.` -> `В я є ручка.`).
3. Decline the Ukrainian parse tree and turn it into a string. This should be an interesting part of the program. I tried to write declension functions some time ago, but failed.
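
The preprocessing part of step 1 (downcase, strip punctuation, split, tag) could look roughly like the sketch below. It is not the code from `parsing.rkt` or `dictionary.rkt`: `toy-lexicon`, `tokenize`, `tag-words`, and the tag names are hypothetical, and the lexicon is a stand-in for the dictionary interface that is not implemented yet.

```racket
#lang racket

;; Hypothetical toy lexicon: word -> list of possible POS tags.
;; The real dictionary interface is unimplemented; this is only a stand-in.
(define toy-lexicon
  (hash "my"   '(determiner)
        "name" '(noun verb)
        "is"   '(verb)
        "i"    '(pronoun)
        "have" '(verb)
        "a"    '(determiner)
        "pen"  '(noun)))

;; Downcase the sentence, drop punctuation characters,
;; then split the result on whitespace.
(define (tokenize sentence)
  (define cleaned
    (list->string
     (filter (lambda (c) (not (char-punctuation? c)))
             (string->list (string-downcase sentence)))))
  (string-split cleaned))

;; Pair each word with every tag the lexicon allows.
;; Words missing from the lexicon get the placeholder tag 'unknown.
(define (tag-words words)
  (for/list ([w (in-list words)])
    (cons w (hash-ref toy-lexicon w '(unknown)))))

(tokenize "My name is Anton.") ; => '("my" "name" "is" "anton")
(tag-words (tokenize "I have a pen."))
```

Keeping every candidate tag for a word (as with `"name"` here) leaves the ambiguity for the parser to resolve, which fits the note that `amb-parser` returns all possible parse trees.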
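
The three rewrite rules from step 2 can be illustrated with Racket's `match` on s-expression trees. This is a minimal sketch under assumptions: the tree shape `(category child ...)`, the category names, `toy-translations`, `transform`, and `tree->string` are all hypothetical, and the real output format of `amb-parser` may differ. No declension is applied, so the result is the same undeclined word sequence as in the `have` example above.

```racket
#lang racket

;; Hypothetical word-for-word lexicon (base forms only, no declension).
(define toy-translations
  (hash "i" "я" "pen" "ручка"))

;; Rewrite an English tree into a Ukrainian-ordered, undeclined tree.
;; Assumed tree shape: (category child ...), with words as strings.
(define (transform tree)
  (match tree
    ;; have/has: preposition + possessor + "є" + possessed thing.
    [`(sentence ,subject (verb-phrase (verb ,(or "have" "has")) ,object))
     `(sentence (preposition "в") ,(transform subject)
                (verb "є") ,(transform object))]
    ;; Copula: drop am/is/are and keep subject and complement.
    [`(sentence ,subject (verb-phrase (verb ,(or "am" "is" "are")) ,complement))
     `(sentence ,(transform subject) ,(transform complement))]
    ;; Drop the determiners a/an/the from noun phrases.
    [`(noun-phrase (determiner ,(or "a" "an" "the")) ,rest ...)
     `(noun-phrase ,@(map transform rest))]
    ;; Words: look up a translation, falling back to the original word.
    [(? string? word) (hash-ref toy-translations word word)]
    ;; Any other node: keep the category and recurse into the children.
    [(list category children ...)
     (cons category (map transform children))]
    [other other]))

;; Collect the words of a tree, left to right, into one string.
(define (tree->string tree)
  (string-join (filter string? (flatten tree)) " "))

(define example-tree
  '(sentence (noun-phrase (pronoun "i"))
             (verb-phrase (verb "have")
                          (noun-phrase (determiner "a") (noun "pen")))))

(tree->string (transform example-tree)) ; => "в я є ручка"
```

The more specific sentence-level rules come first, so the generic recursive clause only sees nodes that no rule claims. A declension pass would then operate on the transformed tree before `tree->string` flattens it.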