https://github.com/akb89/ortografix

Seq2seq model with attention for automatic orthographic simplification

Host: GitHub
URL: https://github.com/akb89/ortografix
Owner: akb89
License: mit
Created: 2020-04-04T10:09:58.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2024-07-25T11:01:40.000Z (about 2 years ago)
Last Synced: 2025-09-28T23:14:52.670Z (10 months ago)
Language: Python
Size: 113 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE


# ortografix

[![GitHub release][release-image]][release-url]

[![PyPI release][pypi-image]][pypi-url]

[![Build][build-image]][build-url]

[![MIT License][license-image]][license-url]

[release-image]:https://img.shields.io/github/release/akb89/ortografix.svg?style=flat-square

[release-url]:https://github.com/akb89/ortografix/releases/latest

[pypi-image]:https://img.shields.io/pypi/v/ortografix.svg?style=flat-square

[pypi-url]:https://pypi.org/project/ortografix/

[build-image]:https://img.shields.io/github/workflow/status/akb89/ortografix/CI?style=flat-square

[build-url]:https://github.com/akb89/ortografix/actions?query=workflow%3ACI

[license-image]:http://img.shields.io/badge/license-MIT-000000.svg?style=flat-square

[license-url]:LICENSE.txt

Welcome to ortografix, a seq2seq model with attention for automatic orthographic simplification, implemented in PyTorch 1.4.

## Install

Via pip:

```shell
pip3 install ortografix
```

or, after a git clone:

```shell
python3 setup.py install
```

## Train

To train a model, run:

```shell
ortografix train \
  --data /abs/path/to/training/data \
  --model-type gru \
  --shuffle \
  --hidden-size 256 \
  --num-layers 1 \
  --bias \
  --dropout 0 \
  --learning-rate 0.01 \
  --epochs 10 \
  --print-every 100 \
  --use-teacher-forcing \
  --teacher-forcing-ratio 0.5 \
  --output-dirpath /abs/path/to/output/directory/whereto/save/model \
  --with-attention \
  --character-based
```
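For readers unfamiliar with the `--use-teacher-forcing` and `--teacher-forcing-ratio` flags, here is a minimal sketch of how a teacher-forcing ratio is commonly applied in seq2seq training. It is an illustration of the general technique, not ortografix's actual code: the function `decoder_inputs`, the `predict_next` callback, and the `<sos>` start symbol are hypothetical names, and the per-sequence coin flip is an assumption about how the ratio is used.

```python
import random


def decoder_inputs(target, predict_next, teacher_forcing_ratio=0.5, rng=random):
    """Return the symbols fed to a decoder while processing one target sequence.

    `predict_next` is a hypothetical stand-in for a single decoder step: it
    maps the current input symbol to the model's predicted next symbol.
    """
    # Assumption: one coin flip per sequence. With probability
    # `teacher_forcing_ratio` the decoder is fed the gold symbols;
    # otherwise it is fed its own previous predictions.
    use_teacher_forcing = rng.random() < teacher_forcing_ratio
    decoder_input = "<sos>"  # hypothetical start-of-sequence symbol
    fed = []
    for gold in target:
        fed.append(decoder_input)
        predicted = predict_next(decoder_input)
        decoder_input = gold if use_teacher_forcing else predicted
    return fed
```

With a ratio of 1 the decoder always sees the gold symbols, and with a ratio of 0 it always sees its own outputs; 0.5 mixes the two regimes across training sequences.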

## Test

### Qualitative evaluation

To qualitatively evaluate the output of the model on a set of 10 randomly selected sentences from a given dev/test set, run:

```shell
ortografix evaluate \
  --data /abs/path/to/test/data.txt \
  --model /abs/path/to/model/directory/ \
  --random 10
```

### Quantitative evaluation

To quantitatively evaluate the output of the model on a given dev/test set, run:

```shell
ortografix evaluate \
  --data /abs/path/to/test/data.txt \
  --model /abs/path/to/model/directory
```

Quantitative evaluation will return:

1. The sum of all edit (Levenshtein) distances computed across all test pairs

2. The average edit distance computed across all test pairs

3. The average normalized edit distance

4. The average normalized edit similarity

All measures are computed via [textdistance](https://github.com/life4/textdistance).
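To make the four measures concrete, here is a minimal, dependency-free sketch of how they could be computed over (prediction, reference) pairs. It does not reproduce ortografix's code or call textdistance: `levenshtein` and `evaluate_pairs` are hypothetical helpers, and two details are assumptions, namely that distances are normalized by the length of the longer string and that similarity is one minus the normalized distance.

```python
def levenshtein(a, b):
    """Edit distance between sequences a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def evaluate_pairs(pairs):
    """Compute the four measures over a non-empty list of (prediction, reference)."""
    if not pairs:
        raise ValueError("at least one test pair is required")
    distances = [levenshtein(pred, ref) for pred, ref in pairs]
    # Assumption: normalize by the longer string; two empty strings score 0.
    normalized = [dist / max(len(pred), len(ref), 1)
                  for dist, (pred, ref) in zip(distances, pairs)]
    count = len(pairs)
    return {
        "total_distance": sum(distances),
        "avg_distance": sum(distances) / count,
        "avg_norm_distance": sum(normalized) / count,
        "avg_norm_similarity": sum(1 - value for value in normalized) / count,
    }
```

For example, `evaluate_pairs([("kitten", "sitting"), ("abc", "abc")])` gives a total distance of 3 and an average distance of 1.5, since the first pair is three edits apart and the second is identical.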
