https://github.com/mattbui/sent-comp
Sentence Compression with deletion, accepted at ICCCI
- Host: GitHub
- URL: https://github.com/mattbui/sent-comp
- Owner: mattbui
- Created: 2021-01-04T12:48:40.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2021-01-04T13:49:36.000Z (about 4 years ago)
- Last Synced: 2024-10-30T02:54:31.816Z (2 months ago)
- Language: Python
- Homepage:
- Size: 9.77 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# ICCCI - Sentence Compression with deletion
## Dataset
The dataset is available at: [https://github.com/google-research-datasets/sentence-compression](https://github.com/google-research-datasets/sentence-compression). Download the `*.gz` files and store them in the `data/` directory.
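Before preprocessing, it can help to confirm that the downloaded archives are in place and readable. The snippet below is a minimal sketch, not part of this repository: `list_gz_files` and `peek_gz` are hypothetical helper names, and the only assumption is that the `*.gz` files are gzip-compressed UTF-8 text stored in `data/`.

```python
import gzip
from pathlib import Path
from typing import List


def list_gz_files(data_dir="data") -> List[Path]:
    """Return the sorted list of *.gz files found directly in data_dir."""
    return sorted(Path(data_dir).glob("*.gz"))


def peek_gz(path, n_lines: int = 5) -> List[str]:
    """Return the first n_lines lines of a gzip-compressed text file."""
    lines = []
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for i, line in enumerate(fh):
            if i >= n_lines:
                break
            lines.append(line.rstrip("\n"))
    return lines


if __name__ == "__main__":
    # Print the name and first three lines of each downloaded archive.
    for gz_path in list_gz_files("data"):
        print(gz_path.name, peek_gz(gz_path, 3))
```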
## Requirements
This project requires Python 3.6+ and PyTorch 1.1+. It uses the models and embeddings from the [FLAIR framework](https://github.com/flairNLP/flair):
```bash
pip install flair
```
## Preprocess data
To train a sequence tagging model, the original data needs to be aligned into a sequence tagging format, with one keep/delete label per token. To align the downloaded data:
```bash
export PRJ_HOME=
bash $PRJ_HOME/runs/preprocess.sh
```
## Training
Training configs for the different settings are available in `runs/`. To start training:
```bash
export PRJ_HOME=
bash $PRJ_HOME/runs/train_.sh
```
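For readers unfamiliar with the sequence tagging format that the preprocessing step produces, the idea can be sketched as follows. This is an illustration only and is not the code in `runs/preprocess.sh`: the function names are hypothetical, tokens are assumed to match exactly (including case), and the tab-separated token/label layout is an assumed column format rather than the one this project necessarily writes. Deletion-based compression keeps a subsequence of the source tokens, so each token can be labelled `1` (keep) or `0` (delete) with a greedy left-to-right match.

```python
from typing import List


def align_deletion_labels(tokens: List[str], kept: List[str]) -> List[int]:
    """Label each source token 1 (keep) or 0 (delete).

    Greedy left-to-right match; `kept` must be a subsequence of `tokens`.
    """
    labels = []
    k = 0  # index of the next compressed token still to be matched
    for tok in tokens:
        if k < len(kept) and tok == kept[k]:
            labels.append(1)
            k += 1
        else:
            labels.append(0)
    if k != len(kept):
        raise ValueError("compression is not a subsequence of the sentence")
    return labels


def to_column_format(tokens: List[str], labels: List[int]) -> str:
    """Render one sentence as 'token<TAB>label' lines (assumed layout)."""
    return "\n".join(f"{t}\t{l}" for t, l in zip(tokens, labels)) + "\n"


if __name__ == "__main__":
    sentence = "The quick brown fox jumps over the lazy dog".split()
    compression = "The fox jumps".split()
    labels = align_deletion_labels(sentence, compression)
    print(to_column_format(sentence, labels))
```

When a word occurs more than once, the greedy match always keeps the earliest occurrence, which may differ from the deletion the annotators intended; a real alignment script may need extra information to resolve such cases.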