# R-Net
* A TensorFlow implementation of [R-NET: MACHINE READING COMPREHENSION WITH SELF-MATCHING NETWORKS](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/05/r-net.pdf). This project is designed specifically for the [SQuAD](https://arxiv.org/pdf/1606.05250.pdf) dataset.
* Should you have any questions, please contact Wenxuan Zhou (wzhouad@connect.ust.hk).

## Requirements

Many known problems are caused by mismatched software versions. Please check your versions before opening an issue or emailing me.

#### General
* Python >= 3.4
* unzip, wget
#### Python Packages
* tensorflow-gpu >= 1.5.0
* spaCy >= 2.0.0
* tqdm
* ujson

## Usage

To download and preprocess the data, run

```bash
# download SQuAD and GloVe
sh download.sh
# preprocess the data
python config.py --mode prepro
```

Hyperparameters are stored in `config.py`. To debug, train, or test the model, run one of:

```bash
python config.py --mode debug
python config.py --mode train
python config.py --mode test
```

To get the official score, run

```bash
python evaluate-v1.1.py ~/data/squad/dev-v1.1.json log/answer/answer.json
```

The default directory for TensorBoard log files is `log/event`.
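
To monitor training, point TensorBoard at that directory (assuming TensorBoard is installed alongside tensorflow-gpu):

```bash
tensorboard --logdir log/event
```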

See the releases page for a trained model.

## Detailed Implementation

* The original paper uses additive attention, which consumes a lot of memory. This project adopts the scaled multiplicative (dot-product) attention presented in [Attention Is All You Need](https://arxiv.org/abs/1706.03762) (sketched after this list).
* This project adopts the variational dropout presented in [A Theoretically Grounded Application of Dropout in Recurrent Neural Networks](https://arxiv.org/abs/1512.05287), which reuses one dropout mask across all timesteps (see the sketch below).
* To mitigate the degradation problem in stacked RNNs, the outputs of every layer are concatenated to produce the final output.
* When the loss on the dev set increases over a certain period, the learning rate is halved.
* During prediction, the project adopts the answer-span search method presented in [Machine Comprehension Using Match-LSTM and Answer Pointer](https://arxiv.org/abs/1608.07905) (sketched below).
* To address efficiency issues, this implementation uses a bucketing method (contributed by xiongyifan, sketched below) and CudnnGRU. Bucketing can speed up training but will lower the F1 score by 0.3%.
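
For reference, scaled multiplicative attention replaces additive attention's feed-forward scoring with a dot product scaled by 1/sqrt(d). A minimal NumPy sketch with toy shapes (an illustration, not the project's actual TensorFlow code):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """softmax(Q K^T / sqrt(d)) V -- one attention head, no masking."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                 # (len_q, len_k) similarities
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                            # weighted sum of values

# toy example: 3 queries attend over 4 keys/values of dimension 8
rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(scaled_dot_product_attention(q, k, v).shape)  # (3, 8)
```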
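Variational dropout reuses a single dropout mask at every timestep rather than sampling a fresh one per step. A sketch assuming a `(batch, time, features)` layout; in TensorFlow the same effect comes from the `noise_shape` argument of `tf.nn.dropout`:

```python
import numpy as np

def variational_dropout(x, keep_prob, rng):
    """Inverted dropout with one mask per sequence, reused at every timestep.

    x has shape (batch, time, features) -- an assumed layout. Standard dropout
    samples a fresh mask per timestep; here the mask has shape
    (batch, 1, features) and broadcasts across time, which is what
    tf.nn.dropout(x, keep_prob, noise_shape=[batch, 1, features]) achieves.
    """
    mask = rng.random((x.shape[0], 1, x.shape[2])) < keep_prob
    return x * mask / keep_prob  # rescale so the expected activation is unchanged

# usage: variational_dropout(x, 0.7, np.random.default_rng(0))
```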
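The prediction-time span search picks the start/end pair that maximizes the product of the two pointer distributions, with the end never preceding the start and the span length capped. An illustrative sketch (the `max_len=15` cap is an assumption, not necessarily the repo's value):

```python
import numpy as np

def best_span(p_start, p_end, max_len=15):
    """Return (i, j) maximizing p_start[i] * p_end[j] with i <= j < i + max_len.

    p_start, p_end: 1-D probability vectors over context tokens.
    """
    outer = np.outer(p_start, p_end)       # outer[i, j] = p_start[i] * p_end[j]
    outer = np.triu(outer)                 # zero out spans where the end precedes the start
    outer = np.tril(outer, k=max_len - 1)  # zero out spans longer than max_len tokens
    i, j = np.unravel_index(outer.argmax(), outer.shape)
    return int(i), int(j)
```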
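Bucketing batches together sequences of similar length so less computation is spent on padding. A simplified sketch assuming inputs are lists of token ids; the repo's actual implementation differs in detail:

```python
import random

def bucket_batches(examples, batch_size, num_buckets=4):
    """Group examples of similar length into the same batch to reduce padding.

    examples: list of token-id lists. num_buckets and the 0 padding value
    are illustrative choices.
    """
    examples = sorted(examples, key=len)    # similar lengths become adjacent
    bucket_size = max(1, len(examples) // num_buckets)
    batches = []
    for start in range(0, len(examples), bucket_size):
        bucket = examples[start:start + bucket_size]
        random.shuffle(bucket)              # keep randomness within each bucket
        for i in range(0, len(bucket), batch_size):
            batch = bucket[i:i + batch_size]
            max_len = max(len(x) for x in batch)
            batches.append([x + [0] * (max_len - len(x)) for x in batch])  # pad
    random.shuffle(batches)                 # shuffle batch order across buckets
    return batches
```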

## Performance

#### Score

||EM|F1|
|---|---|---|
|original paper|71.1|79.5|
|this project|71.07|79.51|

#### Training Time (s/it)

||Native|Native + Bucket|Cudnn|Cudnn + Bucket|
|---|---|---|---|---|
|E5-2640|6.21|3.56|-|-|
|TITAN X|2.56|1.31|0.41|0.28|

## Extensions

These settings may increase the score but are not used by default. You can turn them on in `config.py`.

* [Pretrained GloVe character embedding](https://github.com/minimaxir/char-embeddings). Contributed by yanghanxy.
* [FastText embeddings](https://fasttext.cc/docs/en/english-vectors.html). Contributed by xiongyifan. May increase the F1 by 1% (reported by xiongyifan).