https://github.com/nusnlp/ugec
The official code for the "Unsupervised Grammatical Error Correction Rivaling Supervised Methods" paper, published in EMNLP 2023.
# Unsupervised Grammatical Error Correction Rivaling Supervised Methods
> Hannan Cao, Liping Yuan, Yuchen Zhang, Hwee Tou Ng. [Unsupervised Grammatical Error Correction Rivaling Supervised Methods](https://aclanthology.org/2023.emnlp-main.185.pdf). In EMNLP 2023.
## Training Data & Checkpoints
[GEC training data](https://drive.google.com/drive/folders/1c1xNjD7ORGaY9P3vuy1w-q_f60WkGC_D?usp=sharing);
[GEC model checkpoints](https://drive.google.com/drive/folders/1TZNbuEwjifTVqKldfpkXl264CLkHNlXw?usp=sharing);
## English GEC
### Flan-T5-xxl
1. Store all the downloaded checkpoints and data for Flan-T5-xxl in this folder: en_flan_t5/llm_finetune
2. Install the packages listed in requirement.txt inside the en_flan_t5 folder.

Train:
```
bash train.sh
```
Inference (run from the en_flan_t5/llm_inference folder):
```
bash eval_gec.sh your/ckpt/name
```
### BART-base
1. Store all the downloaded checkpoints and data for BART-base in this folder: en_fairseq_train
2. Install the packages listed in requirement.txt inside the en_fairseq_train folder.

Train:
```
cd gec
bash train.sh path/to/the/model/to/be/restored path/to/data-bin/folder output_path
```
Inference:
```
bash new_generate.sh path/to/model/ckpt testing/input/path
```
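
The two BART-base scripts take positional arguments, so their order matters. The sketch below only assembles the command lines shown above as argument lists, which makes the order explicit and keeps paths with spaces intact. The helper names `train_command` and `inference_command` and the example paths are hypothetical, and running `new_generate.sh` from the `gec` folder is an assumption, not something this README states.

```python
import shlex


def train_command(restore_ckpt, data_bin, output_path):
    """Argument list for: bash train.sh <model to restore> <data-bin folder> <output path>."""
    return ["bash", "train.sh", restore_ckpt, data_bin, output_path]


def inference_command(ckpt, test_input):
    """Argument list for: bash new_generate.sh <model checkpoint> <test input>."""
    return ["bash", "new_generate.sh", ckpt, test_input]


# Hypothetical paths, used only to show the resulting command line.
print(shlex.join(train_command("restore/model.pt", "data-bin", "output")))

# To launch a command instead of printing it (the working directory is an assumption):
# import subprocess
# subprocess.run(inference_command("output/model.pt", "test.src"), cwd="gec", check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when a checkpoint or data path contains spaces.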
## Chinese GEC
1. Store all the downloaded checkpoints and data for BART-base in this folder: chinese_bart_large
2. Install the packages listed in requirement.txt inside the chinese_bart_large folder.

Train:
```
cd gec
bash train_ch.sh
```
Inference:
```
cd gec
bash test_ch.sh
```
## Citation
If you find our paper or code useful, please cite it as:
```
@inproceedings{cao-etal-2023-unsupervised,
title = "Unsupervised Grammatical Error Correction Rivaling Supervised Methods",
author = "Cao, Hannan and
Yuan, Liping and
Zhang, Yuchen and
Ng, Hwee Tou",
editor = "Bouamor, Houda and
Pino, Juan and
Bali, Kalika",
booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
month = dec,
year = "2023",
address = "Singapore",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.emnlp-main.185",
doi = "10.18653/v1/2023.emnlp-main.185",
pages = "3072--3088",
}
```

If you encounter any problem with the code, please contact [email protected].