https://github.com/akb89/ortografix
Seq2seq model with attention for automatic orthographic simplification
- Host: GitHub
- URL: https://github.com/akb89/ortografix
- Owner: akb89
- License: mit
- Created: 2020-04-04T10:09:58.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2024-07-25T11:01:40.000Z (almost 2 years ago)
- Last Synced: 2025-09-28T23:14:52.670Z (8 months ago)
- Language: Python
- Size: 113 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# ortografix
[![GitHub release][release-image]][release-url]
[![PyPI release][pypi-image]][pypi-url]
[![Build][build-image]][build-url]
[![MIT License][license-image]][license-url]
[release-image]:https://img.shields.io/github/release/akb89/ortografix.svg?style=flat-square
[release-url]:https://github.com/akb89/ortografix/releases/latest
[pypi-image]:https://img.shields.io/pypi/v/ortografix.svg?style=flat-square
[pypi-url]:https://pypi.org/project/ortografix/
[build-image]:https://img.shields.io/github/workflow/status/akb89/ortografix/CI?style=flat-square
[build-url]:https://github.com/akb89/ortografix/actions?query=workflow%3ACI
[license-image]:http://img.shields.io/badge/license-MIT-000000.svg?style=flat-square
[license-url]:LICENSE.txt
Welcome to ortografix, a seq2seq model for automatic orthographic simplification, implemented with PyTorch 1.4.
## Install
Install via pip:
```shell
pip3 install ortografix
```
or, after a git clone:
```shell
python3 setup.py install
```
## Train
To train a model, run:
```shell
ortografix train \
--data /abs/path/to/training/data \
--model-type gru \
--shuffle \
--hidden-size 256 \
--num-layers 1 \
--bias \
--dropout 0 \
--learning-rate 0.01 \
--epochs 10 \
--print-every 100 \
--use-teacher-forcing \
--teacher-forcing-ratio 0.5 \
--output-dirpath /abs/path/to/output/directory/whereto/save/model \
--with-attention \
--character-based
```
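As a rough illustration of what `--use-teacher-forcing` and `--teacher-forcing-ratio` control, here is a minimal pure-Python sketch of teacher forcing. It is not taken from the ortografix source: the function names are hypothetical, and it assumes the decision is drawn once per training sequence, which may differ from what the tool actually does.

```python
import random


def use_teacher_forcing(ratio, rng=random):
    """Decide whether a training sequence is decoded with teacher forcing.

    Hypothetical helper: assumes one random draw per sequence, so that
    roughly `ratio` of all sequences end up teacher-forced.
    """
    return rng.random() < ratio


def next_decoder_input(gold_token, predicted_token, forced):
    """Pick the token fed to the decoder at the next time step.

    With teacher forcing the gold (reference) token is fed back;
    without it the decoder consumes its own previous prediction.
    """
    return gold_token if forced else predicted_token


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    decisions = [use_teacher_forcing(0.5, rng) for _ in range(1000)]
    print("share of teacher-forced sequences:", sum(decisions) / len(decisions))
```

With a ratio of 0.5, as in the command above, about half of the sequences would be decoded from the gold tokens and the other half from the model's own predictions.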
## Test
### Qualitative evaluation
To qualitatively evaluate the output of the model on a set of 10 randomly selected sentences from a given dev/test set, run:
```shell
ortografix evaluate \
--data /abs/path/to/test/data.txt \
--model /abs/path/to/model/directory/ \
--random 10
```
### Quantitative evaluation
To quantitatively evaluate the output of the model on a given dev/test set, run:
```shell
ortografix evaluate \
--data /abs/path/to/test/data.txt \
--model /abs/path/to/model/directory
```
Quantitative evaluation will return:
1. The sum of all edit (Levenshtein) distances computed across all test pairs
2. The average edit distance computed across all test pairs
3. The average normalized edit distance
4. The average normalized edit similarity
All measures are computed via [textdistance](https://github.com/life4/textdistance).
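To make the four measures concrete, here is a minimal sketch that computes them in plain Python. It is a stand-in, not the ortografix implementation: the tool relies on textdistance, whereas this example uses a hand-written Levenshtein function to stay dependency-free. The function names, the dictionary keys, and the example pairs are hypothetical, and the sketch assumes that each distance is normalized by the length of the longer string in the pair and that similarity is one minus that value.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def evaluate_pairs(pairs):
    """Compute the four measures over a non-empty list of (predicted, gold) pairs."""
    distances = [levenshtein(pred, gold) for pred, gold in pairs]
    # Assumption: normalize by the longer string; max(..., 1) avoids 0/0 on empty pairs.
    normalized = [
        dist / max(len(pred), len(gold), 1)
        for dist, (pred, gold) in zip(distances, pairs)
    ]
    count = len(pairs)
    return {
        "total_distance": sum(distances),
        "avg_distance": sum(distances) / count,
        "avg_norm_distance": sum(normalized) / count,
        "avg_norm_similarity": sum(1 - value for value in normalized) / count,
    }


if __name__ == "__main__":
    # Hypothetical (model output, reference) pairs.
    example_pairs = [("kitten", "sitting"), ("fonetik", "phonetic")]
    for name, value in evaluate_pairs(example_pairs).items():
        print(name, value)
```

Under these assumptions the two normalized averages always sum to 1, so the similarity score carries the same information as the normalized distance on a "higher is better" scale.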