Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rmodi6/relation-extraction
RNN and CNN based Relation Extraction models in tensorflow 2.0
- Host: GitHub
- URL: https://github.com/rmodi6/relation-extraction
- Owner: rmodi6
- License: MIT
- Created: 2019-11-16T18:46:06.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2024-08-27T17:03:38.000Z (4 months ago)
- Last Synced: 2024-08-27T18:49:20.544Z (4 months ago)
- Topics: cnn, functional-api, gru, information-extraction, keras, natural-language-processing, relation-extraction, rnn, tensorflow2
- Language: Python
- Size: 19.2 MB
- Stars: 3
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
This file is best viewed in a Markdown reader (e.g. https://jbt.github.io/markdown-editor/)
# Overview
You will implement a bi-directional GRU as well as an original model for Relation Extraction.
The GRU is loosely based on the approach of Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification (Zhou et al., 2016).
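For reference, given the matrix $H$ of hidden vectors produced by the bidirectional RNN, the attention mechanism in that paper computes

$$M = \tanh(H), \qquad \alpha = \operatorname{softmax}\left(w^{\top} M\right), \qquad r = H \alpha^{\top}, \qquad h^{*} = \tanh(r),$$

where $w$ is a learned parameter vector and $h^{*}$ is the sentence representation fed to the classifier.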
You will need to implement:
1. Bidirectional GRU
2. Attention layer
3. L2 regularization (a minimal sketch of how these three pieces might fit together is shown below)
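As a rough illustration of how these pieces might fit together, here is a minimal Keras sketch. It is *not* the starter code's interface: the names, dimensions, and the 19-class output (the SemEval-2010 Task 8 label count) are assumptions, and padding/masking and the POS/dependency inputs are omitted.
```
import tensorflow as tf
from tensorflow.keras import layers

class AttentiveBiGRU(tf.keras.Model):
    """Bi-GRU + word-level attention, loosely following Zhou et al. (2016)."""

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_classes=19):
        super().__init__()
        self.embed = layers.Embedding(vocab_size, embed_dim)
        self.bi_gru = layers.Bidirectional(layers.GRU(hidden_dim, return_sequences=True))
        # w scores each time step's 2*hidden_dim output vector.
        self.att_w = self.add_weight(name="att_w", shape=(2 * hidden_dim, 1),
                                     initializer="glorot_uniform")
        self.out = layers.Dense(num_classes)

    def call(self, token_ids):
        H = self.bi_gru(self.embed(token_ids))      # (batch, time, 2*hidden)
        M = tf.tanh(H)
        scores = tf.einsum("btd,do->bto", M, self.att_w)
        alpha = tf.nn.softmax(scores, axis=1)       # attention over time steps
        r = tf.reduce_sum(alpha * H, axis=1)        # weighted sum of GRU outputs
        return self.out(tf.tanh(r))                 # logits over relation classes
```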
Additionally, you will design your own network architecture to solve this task. You can think of it yourself or base it on a known solution from a paper.
More details are given in the assignment PDF.
# Installation
The environment is the same as in past assignments, apart from extra requirements in `requirements.txt`. But we would *strongly* encourage you to make a new environment for assignment 4.
This assignment is implemented in Python 3.6 and TensorFlow 2.0. Follow these steps to set up your environment:
1. [Download and install Conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html "Download and install Conda")
2. Create a Conda environment with Python 3.6:
```
conda create -n nlp-hw4 python=3.6
```
3. Activate the Conda environment. You will need to activate the Conda environment in each terminal in which you want to use this code.
```
conda activate nlp-hw4
```
4. Install the requirements:
```
pip install -r requirements.txt
```
5. Download the spaCy model:
```
python -m spacy download en_core_web_sm
```
6. Download the GloVe word vectors:
```
./download_glove.sh
```
**NOTE:** We will be using this environment to check your code, so please don't work in your default or any other Python environment.
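Once the steps above are done, a quick optional sanity check (plain one-liners, not assignment scripts) can confirm the installs worked:
```
python -c "import tensorflow as tf; print(tf.__version__)"   # expect 2.0.x
python -c "import spacy; spacy.load('en_core_web_sm')"       # should not raise
```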
# Data
You are given training and validation data in the form of text files. Training data can be found in `data/train.txt` and validation data in `data/val.txt`. We are using data from a previous SemEval shared task, which in total had 8,000 training examples; your train/validation sets are a 90/10 split of those 8,000 examples. More details on the data can be found in the overview paper SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals (Hendrickx et al., 2009), as well as in the extra PDFs in the dataset directory explaining the details of each relation.
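Before writing any parsing code, it can help to eyeball the raw layout of these files; a few throwaway lines of Python are enough:
```
# Print the first few raw lines of data/train.txt to inspect the format.
with open("data/train.txt", encoding="utf-8") as f:
    for i, line in enumerate(f):
        print(repr(line))   # repr() makes tabs and newlines visible
        if i >= 5:
            break
```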
# Code Overview
This repository largely follows the same interface as assignments 2 and 3. Currently, the only thing missing is TensorBoard support.
## Train and Predict
You have four main scripts in the repository: `train_basic.py`, `train_advanced.py`, `predict.py`, and `evaluate.pl`.
- The train scripts do as described and save your model to be used later for prediction. The basic training script trains the basic `MyBasicAttentiveBiGRU` model, which you are supposed to implement (Bi-RNN + attention); you should not need to change this script. The advanced training script, on the other hand, is template/starter code which you can adapt to your `MyAdvancedModel` architecture design.
- The predict script generates predictions on the test set `test.txt` and saves your output to a file. You will submit your predictions to be scored against the hidden test-set labels. Both scripts have reasonable defaults for their arguments, but example commands are shown below.
- The evaluation script, unlike the others, is a Perl script. You can use it to see a detailed report of your predictions file against the gold labels.
#### Train a model
```
python train_basic.py --embed-file embeddings/glove.6B.100d.txt --embed-dim 100 --batch-size 10 --num_epochs 5
# stores the model by default at: serialization_dirs/basic/
```
#### Predict with a model
```
python predict.py --prediction-file my_predictions.txt --batch-size 10
```
**NOTE:** These scripts will not work until you fill in the placeholders (TODOs) left as part of the assignment.
## Extra Scripts
As usual, you are given `download_glove.sh`; however, this is a slightly edited version that does not remove the larger-dimension embeddings. For this assignment I personally used the 100d embeddings, and I encourage everyone to do the same for their GRU so you can confirm your results are around the expected range. A rough sketch of loading the vectors is shown below.
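For reference, a common way to turn a GloVe text file into an embedding matrix looks roughly like this; `word_index` (a token-to-id dict) is a placeholder for whatever vocabulary your pipeline builds, not a name from the starter code:
```
import numpy as np

def load_glove(path, word_index, embed_dim=100):
    # Rows for unseen words stay small random vectors; id 0 is assumed
    # to be reserved for padding.
    matrix = np.random.normal(scale=0.1, size=(len(word_index) + 1, embed_dim))
    matrix[0] = 0.0
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if parts[0] in word_index:
                matrix[word_index[parts[0]]] = np.asarray(parts[1:], dtype="float32")
    return matrix

# embedding_matrix = load_glove("embeddings/glove.6B.100d.txt", word_index)
```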
# General results
To help everyone confirm their model is on the right track: my bi-GRU with attention scored a macro-averaged F1 in the high 0.5x range on the validation set after 10 epochs with the default configuration.
# Expectations
## What to write in code:
Like assignments 2 and 3, you have `TODO(Students) Start` and `TODO(Students) End` annotations. You are expected to write your code between those comments/annotations.
1. Implement a bi-directional GRU with attention (`MyBasicAttentiveBiGRU`) in `model.py`.
2. Include L2 regularization on all trainable variables during training (a sketch of the L2 term is shown after this list).
3. Implement a new model (`MyAdvancedModel`) in `model.py` that out-performs the GRU. Adapt `train_advanced.py` as required.

Among these, `MyBasicAttentiveBiGRU` and L2 regularization are well-defined tasks, whereas `MyAdvancedModel` is an open-ended part of the assignment where you can design your own model as you see fit.
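A minimal sketch of the L2 piece, assuming a custom training step (the model, optimizer, and loss objects come from your own code, and `l2_coef` is a hypothetical hyperparameter, not a starter-code flag):
```
import tensorflow as tf

def train_step(model, optimizer, loss_fn, inputs, labels, l2_coef=1e-4):
    """One gradient step with an L2 penalty over all trainable variables."""
    with tf.GradientTape() as tape:
        logits = model(inputs, training=True)
        loss = loss_fn(labels, logits)
        # L2 regularization: sum of l2_loss over every trainable variable.
        loss += l2_coef * tf.add_n(
            [tf.nn.l2_loss(v) for v in model.trainable_variables])
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```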
## What experiments to try with models
This assignment is much more open-ended than the others. What we're looking for is that you've learned enough from the course to tackle a problem on your own. However, we do suggest a couple of experiments with the GRU:
1. Run with only word embeddings (remove `pos_inputs` and the dependency structure; the dependency structure can be removed by setting `shortest_path = []` in `data.py`).
2. Run with only word + POS embeddings.
3. Run with only word + dependency structure.

## What to turn in?
A single zip file containing the following files:
1. model.py
2. train_advanced.py
3. train_lib.py
4. basic_test_prediction.txt
5. advanced_test_prediction_1.txt
6. advanced_test_prediction_2.txt
7. advanced_test_prediction_3.txt

`gdrive_link.txt` should have a link to the `serialization_dirs.zip` of your trained models.
We will release the exact zip format on Piazza in a couple of days, but it should be largely the same as assignment 3.
### Good Luck!