https://github.com/satabin/lingua
Lexical and grammatical tools for natural languages
Last synced: 11 days ago
- Host: GitHub
- URL: https://github.com/satabin/lingua
- Owner: satabin
- License: apache-2.0
- Created: 2015-05-10T21:10:02.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2018-03-14T14:36:28.000Z (almost 7 years ago)
- Last Synced: 2023-07-06T21:42:02.180Z (over 1 year ago)
- Language: Scala
- Homepage:
- Size: 649 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.markdown
- License: LICENSE
README
Lingua [![Codacy Badge](https://api.codacy.com/project/badge/Grade/cab189517d35400a848cc44348b1757b)](https://www.codacy.com/app/satabin/lingua?utm_source=github.com&utm_medium=referral&utm_content=satabin/lingua&utm_campaign=Badge_Grade)
======

Lingua is a set of linguistic tools written in Scala. It is divided into several modules.

Module `lexikon`
----------------

The `lexikon` module provides tools to generate morphological lexica from a dedicated description language. For example dictionaries, see the `.dico` files in the [resources directory](https://github.com/satabin/lingua/tree/master/lexikon/src/main/resources).

To run the lexicon generator:

```sh
$ sbt
> project lexikon
> runMain lingua.lexikon.DikoMain compile lexikon/src/main/resources/français.dico -N /tmp/nfst.dot -F /tmp/fst.dot
```

Then you can render the generated (N)Fst by using graphviz tools.

This command also produces a compiled version of the dictionary in a file named `dikoput.diko`. The compiled dictionary can be queried as follows (from the same sbt session):

```sh
> runMain lingua.lexikon.DikoMain query dikoput.diko -q mange
```

Which will return:

```scala
Set(DikoEntry(manger,Set(+Sg, @V, +3ème, +G1, +Prés, +Ind)), DikoEntry(manger,Set(+Sg, +1ère, @V, +G1, +Prés, +Ind)))
```

This means that, according to this dictionary, `mange` stems to `manger`, a verb (`@V` category) conjugated in either the first or the third person singular of the present indicative.

For more details on the available options, run this main class with the `-h` option.
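
To make the reading of the query result above more concrete, here is a minimal sketch that models the two analyses of `mange` and separates the features they share from those that differ. The `DikoEntry` case class below is a hypothetical stand-in written for this example, not the class shipped with lingua, and treating `+`-prefixed tags as features is an assumption based only on the printed output.

```scala
// Hypothetical stand-in for the entry type printed above; lingua's real
// DikoEntry may have a different shape (tags are plain strings here).
final case class DikoEntry(lemma: String, tags: Set[String])

object QueryResultSketch {
  // The two analyses returned for the surface form "mange".
  val result: Set[DikoEntry] = Set(
    DikoEntry("manger", Set("+Sg", "@V", "+3ème", "+G1", "+Prés", "+Ind")),
    DikoEntry("manger", Set("+Sg", "+1ère", "@V", "+G1", "+Prés", "+Ind"))
  )

  // The '@' prefix marks the category (as in `@V`); reading '+' tags as
  // features is an assumption made for this sketch.
  def category(e: DikoEntry): Option[String] = e.tags.find(_.startsWith("@"))
  def features(e: DikoEntry): Set[String] = e.tags.filter(_.startsWith("+"))

  // Features common to every analysis, and those that make the form ambiguous.
  val shared: Set[String] = result.map(features).reduce(_ intersect _)
  val differing: Set[String] = result.flatMap(features) -- shared

  def main(args: Array[String]): Unit = {
    println(s"lemmas:    ${result.map(_.lemma)}")
    println(s"shared:    $shared")
    println(s"differing: $differing")
  }
}
```

Both analyses share the lemma and the category, so only the person tags end up in `differing`, which is exactly the ambiguity described in the text.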
Module `fst`
------------

This module is a generic [Fst](https://en.wikipedia.org/wiki/Finite_state_transducer) module that can be used independently.
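
For readers unfamiliar with finite-state transducers, the following toy example illustrates the general idea of mapping an input string to an output string one symbol at a time. It is only a sketch of the concept: `ToyFst` and `ToyFstDemo` are names invented for this example and do not reflect the API of the `fst` module.

```scala
// Toy deterministic finite-state transducer; this is NOT the API of the
// lingua `fst` module, only an illustration of the concept.
final case class ToyFst(
    initial: Int,
    finals: Set[Int],
    // (state, input symbol) -> (next state, emitted output)
    transitions: Map[(Int, Char), (Int, String)]
) {
  // Runs the transducer on `input`. Returns the concatenated output if the
  // whole input is consumed and the run ends in a final state, None otherwise.
  def run(input: String): Option[String] = {
    val start: Option[(Int, String)] = Some((initial, ""))
    val end = input.foldLeft(start) { (acc, c) =>
      acc.flatMap { case (state, out) =>
        transitions.get((state, c)).map { case (next, emitted) =>
          (next, out + emitted)
        }
      }
    }
    end.collect { case (state, out) if finals(state) => out }
  }
}

object ToyFstDemo {
  // Reads the surface form "mange" and emits the lemma "manger".
  val mange: ToyFst = ToyFst(
    initial = 0,
    finals = Set(5),
    transitions = Map(
      (0, 'm') -> (1, "m"),
      (1, 'a') -> (2, "a"),
      (2, 'n') -> (3, "n"),
      (3, 'g') -> (4, "g"),
      (4, 'e') -> (5, "er")
    )
  )

  def main(args: Array[String]): Unit = {
    println(mange.run("mange")) // Some(manger)
    println(mange.run("mang"))  // None: the run stops in a non-final state
  }
}
```

A lexicon transducer would typically also be non-deterministic and emit tags alongside the lemma; the toy version leaves that out to keep the core mechanism visible.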