This file is best viewed in a Markdown reader (e.g. https://jbt.github.io/markdown-editor/)

# Overview

You will implement a bi-directional GRU as well as an original model of your own for Relation Extraction.

The GRU is loosely based on the approach in *Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification* (Zhou et al., 2016).

You will need to implement the following (a rough sketch of how the pieces fit together is shown after this list):

1. Bidirectional GRU
2. Attention layer
3. L2 Regularization
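
To make these pieces concrete, here is a minimal, hypothetical sketch of a bi-directional GRU with the word-level attention of Zhou et al. (2016) in TensorFlow 2.0 Keras. The class and parameter names are illustrative assumptions, not the assignment's required interface, and the L2 regularization piece is omitted here (see the Expectations section):

```python
import tensorflow as tf

class AttentiveBiGRUSketch(tf.keras.Model):
    """Illustrative bi-GRU + attention, loosely after Zhou et al. (2016)."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, num_classes):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embed_dim)
        self.bigru = tf.keras.layers.Bidirectional(
            tf.keras.layers.GRU(hidden_dim, return_sequences=True))
        # A learned vector scores every timestep (the attention layer).
        self.attn_score = tf.keras.layers.Dense(1, use_bias=False)
        self.classifier = tf.keras.layers.Dense(num_classes)

    def call(self, tokens, training=False):
        h = self.bigru(self.embedding(tokens))       # (batch, seq_len, 2*hidden)
        scores = self.attn_score(tf.tanh(h))         # (batch, seq_len, 1)
        alpha = tf.nn.softmax(scores, axis=1)        # attention weights over time
        context = tf.reduce_sum(alpha * h, axis=1)   # weighted sum: (batch, 2*hidden)
        return self.classifier(tf.tanh(context))     # relation logits
```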

Additionally, you will design your own network architecture to solve this task. You can design it yourself from scratch or base it on a known approach from a paper.
More details are given in the assignment PDF.

# Installation

The environment is the same as in past assignments, apart from a few extra requirements in `requirements.txt`. Even so, we would *strongly* encourage you to create a new environment for assignment 4.

This assignment is implemented in Python 3.6 and TensorFlow 2.0. Follow these steps to set up your environment:

1. [Download and install Conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html "Download and install Conda")
2. Create a Conda environment with Python 3.6

```
conda create -n nlp-hw4 python=3.6
```

3. Activate the Conda environment. You will need to activate the Conda environment in each terminal in which you want to use this code.
```
conda activate nlp-hw4
```
4. Install the requirements:
```
pip install -r requirements.txt
```

5. Download spacy model
```
python -m spacy download en_core_web_sm
```

6. Download glove wordvectors:
```
./download_glove.sh
```

**NOTE:** We will be using this environment to check your code, so please don't work in your default or any other Python environment.

# Data

You are given training and validation data in the form of text files. The training data can be found in `data/train.txt` and the validation data in `data/val.txt`. We are using data from a previous SemEval shared task, which had 8,000 training examples in total; your train/validation examples are a 90/10 split of those 8,000. More details on the data can be found in the overview paper *SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals* (Hendrickx et al., 2009), as well as in the extra PDFs explaining the details of each relation in the dataset directory.
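
For orientation, examples in the SemEval-2010 Task 8 files typically look like the sample below, with the two nominals marked by `<e1>`/`<e2>` tags and the relation label on the following line (check the actual files for the precise layout):

```
1	"The system as described above has its greatest application in an arrayed <e1>configuration</e1> of antenna <e2>elements</e2>."
Component-Whole(e2,e1)
Comment:
```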

# Code Overview

This repository largely follows the same interface as assignments 2 and 3. Currently, the only thing missing is TensorBoard support.

## Train and Predict

You have four main scripts in the repository: `train_basic.py`, `train_advanced.py`, `predict.py` and `evaluate.pl`.

- The training scripts do as their names suggest and save your model to be used later for prediction. The basic training script trains the basic `MyBasicAttentiveBiGRU` model which you are supposed to implement (Bi-RNN + attention); you should not need to change this script. The advanced training script, on the other hand, is template/starter code which you can adapt based on your `MyAdvancedModel` architecture design.

- The predict script generates predictions on the test set `test.txt` and saves your output to a file. You will submit your predictions to be scored against the hidden test-set labels. Both scripts come with reasonable argument defaults, but example commands are shown below.

- The evaluation script, unlike the others, is a Perl script. You can use it to see a detailed report of your predictions file against the gold labels; a hypothetical invocation is sketched below.
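
Assuming `evaluate.pl` follows the convention of the official SemEval scorer (predictions file first, gold-label file second), an invocation might look like the following; verify the actual argument order against the script itself:

```
perl evaluate.pl <predictions_file> <gold_file>
```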

#### Train a model
```
python train_basic.py --embed-file embeddings/glove.6B.100d.txt --embed-dim 100 --batch-size 10 --num_epochs 5

# stores the model by default at: serialization_dirs/basic/
```

#### Predict with model
```
python predict.py --prediction-file my_predictions.txt --batch-size 10
```

**NOTE:** These scripts will not work until you fill in the placeholders (TODOs) left as part of the assignment.

## Extra Scripts

As usual you are given `download_glove.sh`; however, this is a slightly edited version that does not remove the larger-dimension embeddings. For this assignment I personally used the 100d embeddings. I encourage everyone to do the same for their GRU so you can confirm your results are around the expected range.
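
If it helps to see what the embedding file contains, here is a minimal, hypothetical sketch of reading GloVe vectors into an embedding matrix. The function name, the `vocab` mapping (token to row index), and the random initialization for out-of-vocabulary words are illustrative assumptions, not the repo's actual loader:

```python
import numpy as np

def load_glove_matrix(path, vocab, embed_dim=100):
    # Random init covers words missing from the GloVe file.
    matrix = np.random.normal(scale=0.1, size=(len(vocab), embed_dim)).astype("float32")
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            if word in vocab and len(values) == embed_dim:
                matrix[vocab[word]] = np.asarray(values, dtype="float32")
    return matrix

# e.g. load_glove_matrix("embeddings/glove.6B.100d.txt", vocab)
```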

# General results

To help everyone confirm their model is on the right track: my bi-GRU with attention scored a macro-averaged F1 in the high 0.5x range on the validation set after 10 epochs with the default configuration.

# Expectations

## What to write in code:

Like assignments 2 & 3, you have `TODO(Students) Start` and `TODO(Students) End` annotations. You are expected to write your code between those comments/annotations, as illustrated below.
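
For example, the marked regions in the stubs look roughly like this (the surrounding function is illustrative, not a specific stub from the repo):

```python
def call(self, inputs, training=False):
    # TODO(Students) Start
    ...  # your implementation goes here
    # TODO(Students) End
```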

1. Implement a bi-directional GRU with attention (`MyBasicAttentiveBiGRU`) in `model.py`.
2. Include L2 regularization on all trainable variables during training (see the sketch after this list).
3. Implement a new model (`MyAdvancedModel`) in `model.py` that out-performs the GRU. Adapt `train_advanced.py` as required.
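
As a rough illustration of point 2, one common way to add an L2 penalty over all trainable variables in TensorFlow 2.0 is sketched below; the function name and `reg_lambda` value are assumptions, and where exactly the penalty enters this codebase's training loop is left to you:

```python
import tensorflow as tf

def loss_with_l2(model, task_loss, reg_lambda=1e-4):
    # Sum the L2 norms of every trainable variable in the model.
    l2_penalty = tf.add_n([tf.nn.l2_loss(v) for v in model.trainable_variables])
    return task_loss + reg_lambda * l2_penalty
```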

Among these, `MyBasicAttentiveBiGRU` and L2 regularization are well-defined tasks, whereas `MyAdvancedModel` is an open-ended part of the assignment where you can design your own model as you deem fit.

## What experiments to try with models

This assignment is much more open-ended than the others. What we're looking for is that you've learned enough from the course to tackle a problem on your own. However, we do suggest a couple of experiments with the GRU:

1. Run with only word embeddings (remove `pos_inputs` and the dependency structure; the dependency structure can be removed by setting `shortest_path = []` in `data.py`).
2. Run with only word + POS embeddings.
3. Run with only word embeddings + dependency structure.

## What to turn in?

A single zip file containing the following files:

1. model.py
2. train_advanced.py
3. train_lib.py
4. basic_test_prediction.txt
5. advanced_test_prediction_1.txt
6. advanced_test_prediction_2.txt
7. advanced_test_prediction_3.txt

In addition, `gdrive_link.txt` should contain a link to the `serialization_dirs.zip` of your trained models.

We will release the exact zip format on Piazza in a couple of days, but it should largely be the same as assignment 3.

### Good Luck!