Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Official Code for Merging Statistical Feature via Adaptive Gate for Improved Text Classification (AAAI2021)
https://github.com/4ai/agn
bert deep-learning nlp text-classification
- Host: GitHub
- URL: https://github.com/4ai/agn
- Owner: 4AI
- License: mit
- Created: 2020-12-05T02:42:10.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2022-02-05T05:36:27.000Z (almost 3 years ago)
- Last Synced: 2023-10-20T15:48:36.424Z (about 1 year ago)
- Topics: bert, deep-learning, nlp, text-classification
- Language: Python
- Size: 55.7 KB
- Stars: 22
- Watchers: 3
- Forks: 5
- Open Issues: 2
- Metadata Files:
- Readme: README.md
- License: LICENSE
README
# AGN
Official Code for [Merging Statistical Feature via Adaptive Gate for Improved Text Classification](https://ojs.aaai.org/index.php/AAAI/article/view/17569) (AAAI 2021)

## Prepare Data
### Dataset
| Dataset | URL |
|:------------:|:----------------------------------------------------------------:|
| Subj | http://www.cs.cornell.edu/people/pabo/movie-review-data/ |
| SST-1/2 | http://nlp.stanford.edu/sentiment/ |
| TREC | https://trec.nist.gov/data.html |
| AG's News | http://groups.di.unipi.it/~gulli/AG_corpus_of_news_articles |
| Yelp P. / F. | https://www.yelp.com/dataset/ |

You first need to download the datasets from their official sites, then convert the data into `JSONL` format, as follows:
```json
{"label": "0", "text": "hoffman waits too long to turn his movie in an unexpected direction , and even then his tone retains a genteel , prep-school quality that feels dusty and leatherbound ."}
{"label": "1", "text": "if you 're not deeply touched by this movie , check your pulse ."}
```

Each line is a JSON object with two required fields: `text` and `label`.
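For convenience, here is a minimal conversion sketch, assuming a hypothetical tab-separated source file with one `label<TAB>text` pair per line (adapt the parsing to each dataset's actual format):

```python
import json

def convert_to_jsonl(src_path, dst_path):
    """Convert a `label<TAB>text` file into the JSONL format shown above."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            label, text = line.rstrip("\n").split("\t", 1)
            # Both fields are written as strings, matching the examples above.
            dst.write(json.dumps({"label": label, "text": text}) + "\n")

convert_to_jsonl("/path/to/SST-2/train.tsv", "/path/to/SST-2/train.jsonl")
```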
### Pretrained Language Model
We use the pretrained `Uncased BERT-Base` model in this paper; you can download it directly from [this url](https://storage.googleapis.com/bert_models/2020_02_20/uncased_L-12_H-768_A-12.zip).
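For example, this small Python sketch downloads and unpacks the checkpoint (the target directory is illustrative; point `pretrained_model_dir` in your config at the extracted folder):

```python
import urllib.request
import zipfile

URL = ("https://storage.googleapis.com/bert_models/2020_02_20/"
       "uncased_L-12_H-768_A-12.zip")

# Download the checkpoint archive, then extract it to a local directory.
urllib.request.urlretrieve(URL, "uncased_L-12_H-768_A-12.zip")
with zipfile.ZipFile("uncased_L-12_H-768_A-12.zip") as zf:
    zf.extractall("pretrained-bert")
```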
## Setup Environment
We recommend you create a virtual environment to conduct experiments.
```bash
$ python -m venv agn
$ source agn/bin/activate
```

Install TensorFlow according to your environment. Note that we have only tested the code under `tensorflow<2.0`; newer versions may not be compatible.
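A quick sanity check (a small sketch, not part of the repo) that the virtual environment picks up a 1.x TensorFlow:

```python
import tensorflow as tf

# The code is only tested under tensorflow<2.0, so fail fast on newer versions.
assert tf.__version__.startswith("1."), "expected TF 1.x, got %s" % tf.__version__
```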
Our environments:
```bash
$ pip list | egrep "tensorflow|Keras|langml"
Keras 2.3.1
langml 0.1.0
tensorflow 1.15.0
```

Next, install the other Python dependencies:
```bash
$ python -m pip install -r requirements.txt
```

## Train & Evaluate
You first need to prepare a configuration file to set data paths and hyperparameters, for example `sst2.json`:
```json
{
"max_len": 80,
"ae_epochs": 100,
"epochs": 10,
"batch_size": 32,
"learning_rate": 0.00003,
"pretrained_model_dir": "/path/to/pretrained-bert/uncased-bert-base",
"train_path": "/path/to/SST-2/train.jsonl",
"dev_path": "/path/to/SST-2/test.jsonl",
"save_dir": "/dir/to/save",
"epsilon": 0.05,
"dropout": 0.3,
"fgm_epsilon": 0.3,
"iterations": 1,
"verbose": 1
}
```

| Parameter | Description |
|-----------------------|--------------------------------------------------|
| max_len | max length of the input sequence |
| ae_epochs | epochs to train the AutoEncoder |
| epochs | epochs to train the classifier |
| batch_size | batch size |
| learning_rate | learning rate |
| pretrained_model_dir | directory of the pre-trained language model |
| save_dir | directory to save models |
| train_path | data path of the train set |
| dev_path | data path of the dev set / test set |
| epsilon | epsilon size of the valve |
| apply_fgm | whether to apply the FGM attack, default true (see the sketch at the end of this section) |
| fgm_epsilon | epsilon of FGM, default 0.2 |

Then you can train and evaluate with the following shell script:
```bash
export TF_KERAS=1; CUDA_VISIBLE_DEVICES=0 python main.py /path/to/config.json
```

Please set `TF_KERAS=1` to use AdamW.
After training is done, models will be stored in the specified `save_dir` folder.
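For reference, the `apply_fgm` option refers to the Fast Gradient Method (FGM) adversarial perturbation of embeddings. The following framework-agnostic sketch shows the update it performs; the function name and shapes are illustrative, not the repo's API:

```python
import numpy as np

def fgm_perturbation(grad, epsilon=0.2):
    """Fast Gradient Method: perturb embeddings along the L2-normalized
    gradient of the loss with respect to the embeddings, scaled by epsilon."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return np.zeros_like(grad)
    return epsilon * grad / norm

# Usage sketch: embeddings_adv = embeddings + fgm_perturbation(grad, epsilon=0.2)
```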
## Visualize Attention
To visualize attention, first train a model following the instructions above, then run `visualize_attn.py` as follows:
```bash
export TF_KERAS=1; CUDA_VISIBLE_DEVICES=0 python visualize_attn.py /path/to/your_config.json
```

After you input text at the prompt, the code analyzes it and saves the attention figure to `attn_visualize.png`.
Note that in earlier versions we picked the most distinguishing feature dimension from the 2D attention and visualized that selected (1D) feature's attention. In the latest version, we visualize the whole 2D attention rather than the 1D attention.
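To illustrate what the 2D visualization amounts to, here is a minimal matplotlib sketch that renders an attention matrix over (tokens × feature dimensions) as a heatmap; the attention array and token list are placeholders, not the repo's internals:

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder 2D attention: rows are tokens, columns are feature dimensions.
tokens = ["check", "your", "pulse", "."]
attention = np.random.rand(len(tokens), 8)

fig, ax = plt.subplots()
im = ax.imshow(attention, cmap="viridis", aspect="auto")
ax.set_yticks(range(len(tokens)))
ax.set_yticklabels(tokens)
ax.set_xlabel("feature dimension")
fig.colorbar(im, ax=ax)
fig.savefig("attn_visualize.png", bbox_inches="tight")
```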