https://github.com/rz-zhang/SeqMix

The repository for our EMNLP'20 paper SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup.

Host: GitHub
URL: https://github.com/rz-zhang/SeqMix
Owner: rz-zhang
Created: 2020-09-28T09:42:26.000Z (almost 5 years ago)
Default Branch: master
Last Pushed: 2021-09-05T07:20:20.000Z (almost 4 years ago)
Last Synced: 2024-11-16T07:33:31.371Z (8 months ago)
Language: Python
Size: 1.85 MB
Stars: 43
Watchers: 3
Forks: 6
Open Issues: 3
Metadata Files:
- Readme: README.md

# SeqMix
The repository of our EMNLP'20 paper
**SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup**
[[paper]](https://rongzhizhang.org/pdf/emnlp20_SeqMix.pdf) [[slides]](https://rongzhizhang.org/slides/EMNLP20_SeqMix_Slides.pdf)

![Illustration of the three variants of SeqMix](SeqMix.png)

# Requirements
- pytorch-transformers==1.2.0
- torch==1.2.0
- seqeval==0.0.5
- tqdm==4.31.1
- nltk==3.4.5
- Flask==1.1.1
- Flask-Cors==3.0.8
- pytorch_pretrained_bert==0.6.2

Install the required packages:
```
pip install -r requirements.txt
```

# Key Parameters
- `data_dir`: path to the data; the CoNLL-03 dataset is provided in this repository
- `max_seq_length`: maximum length of each sequence
- `num_train_epochs`: number of training epochs
- `train_batch_size`: batch size during model training
- `active_policy`: query policy for active learning (the commands under Run use `random`, `lc`, and `nte`)
- `augment_method`: augmentation method (the commands under Run use `soft`, `slack`, and `lf`)
- `augment_rate`: augmentation rate
- `hyper_alpha`: parameter of the Beta distribution
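
To make the role of `hyper_alpha` more concrete, here is a minimal sketch of generic mixup, assuming the mixing coefficient is drawn from a symmetric Beta(`hyper_alpha`, `hyper_alpha`) distribution as in standard mixup. The helper names `sample_lambda` and `mixup` are hypothetical and are not taken from this repository; the code in `active_learn.py` may differ.

```python
import random


def sample_lambda(hyper_alpha, rng=random):
    """Draw a mixing coefficient from Beta(hyper_alpha, hyper_alpha).

    Assumption: a symmetric Beta distribution, as in standard mixup.
    """
    return rng.betavariate(hyper_alpha, hyper_alpha)


def mixup(vec_a, vec_b, lam):
    """Return the convex combination lam * vec_a + (1 - lam) * vec_b."""
    if len(vec_a) != len(vec_b):
        raise ValueError("vectors must have the same length")
    return [lam * a + (1.0 - lam) * b for a, b in zip(vec_a, vec_b)]


# Toy example: two token embeddings and their one-hot labels over 3 tags.
emb_a, emb_b = [1.0, 0.0], [0.0, 1.0]
lab_a, lab_b = [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]

lam = sample_lambda(hyper_alpha=8.0, rng=random.Random(0))
mixed_emb = mixup(emb_a, emb_b, lam)
mixed_lab = mixup(lab_a, lab_b, lam)  # soft label whose entries sum to 1
```

With a symmetric Beta distribution, larger values of `hyper_alpha` concentrate the coefficient around 0.5, while values below 1 push it toward 0 or 1.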

# Run
## Active learning part
Random Sampling
```
python active_learn.py --active_policy=random
```
Least Confidence Sampling
```
python active_learn.py --active_policy=lc
```
Normalized Token Entropy Sampling
```
python active_learn.py --active_policy=nte
```
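
The two uncertainty-based policies can be illustrated with a small scoring sketch. It assumes each sequence is represented by per-token probability distributions over tags and, for simplicity, treats tokens as independent when computing least confidence. The function names are hypothetical; the repository's own scoring code may be organized differently.

```python
import math


def least_confidence(token_probs):
    """1 minus the probability of the most likely labeling.

    Assumption: tokens are independent, so the sequence probability is the
    product of the per-token maximum probabilities.
    """
    if not token_probs:
        raise ValueError("sequence must contain at least one token")
    log_p = sum(math.log(max(dist)) for dist in token_probs)
    return 1.0 - math.exp(log_p)


def normalized_token_entropy(token_probs):
    """Sum of per-token entropies divided by the sequence length."""
    if not token_probs:
        raise ValueError("sequence must contain at least one token")
    total = 0.0
    for dist in token_probs:
        total -= sum(p * math.log(p) for p in dist if p > 0.0)
    return total / len(token_probs)


def select_queries(pool, score_fn, k):
    """Return the indices of the k most uncertain sequences in the pool."""
    ranked = sorted(range(len(pool)), key=lambda i: score_fn(pool[i]), reverse=True)
    return ranked[:k]


# Toy pool: one confidently tagged sequence and one uncertain sequence.
confident = [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01]]
uncertain = [[0.4, 0.3, 0.3], [0.34, 0.33, 0.33]]
pool = [confident, uncertain]

picked_lc = select_queries(pool, least_confidence, k=1)
picked_nte = select_queries(pool, normalized_token_entropy, k=1)
```

Both policies rank the unlabeled pool by an uncertainty score and send the top-scoring sequences for annotation; random sampling skips the scoring step and draws sequences uniformly.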

## SeqMix part
Whole sequence mixup
```
python active_learn.py --augment_method=soft
```
Sub-sequence mixup
```
python active_learn.py --augment_method=slack
```
Label-constrained sub-sequence mixup
```
python active_learn.py --augment_method=lf
```
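
The difference between the sub-sequence variants can be sketched as a window search over label sequences. This is only one possible reading: it assumes sub-sequences are fixed-length windows filtered by the fraction of non-`O` labels, and that "label-constrained" means the two windows carry identical labels. The function names, the `window` size, and the `min_density` threshold are all hypothetical and do not come from this repository.

```python
def valid_label_density(labels, outside_tag="O"):
    """Fraction of labels in a window that are not the outside tag."""
    return sum(1 for tag in labels if tag != outside_tag) / len(labels)


def candidate_windows(labels, window, min_density):
    """Start indices of length-`window` spans with enough entity labels."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        start
        for start in range(len(labels) - window + 1)
        if valid_label_density(labels[start:start + window]) >= min_density
    ]


def label_constrained_pairs(labels_a, labels_b, window, min_density):
    """Pairs of window starts whose label sub-sequences are identical."""
    starts_b = candidate_windows(labels_b, window, min_density)
    return [
        (i, j)
        for i in candidate_windows(labels_a, window, min_density)
        for j in starts_b
        if labels_a[i:i + window] == labels_b[j:j + window]
    ]


# Toy label sequences in the BIO scheme used by CoNLL-03.
labels_a = ["O", "B-PER", "I-PER", "O", "O"]
labels_b = ["B-PER", "I-PER", "O", "O", "B-LOC"]

loose = candidate_windows(labels_a, window=2, min_density=0.5)
strict = label_constrained_pairs(labels_a, labels_b, window=2, min_density=1.0)
```

In this sketch, whole-sequence mixup would need no window search at all, sub-sequence mixup would pair any two candidate windows, and the label-constrained variant keeps only pairs whose labels match.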
