Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Named Entity Recognition (LSTM + CRF) - Tensorflow
- Host: GitHub
- URL: https://github.com/guillaumegenthial/sequence_tagging
- Owner: guillaumegenthial
- License: apache-2.0
- Created: 2017-03-29T15:53:29.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2020-10-16T09:18:22.000Z (over 4 years ago)
- Last Synced: 2025-02-01T17:28:32.428Z (9 days ago)
- Topics: bi-lstm, characters-embeddings, conditional-random-fields, crf, glove, named-entity-recognition, ner, state-of-art, tensorflow
- Language: Python
- Homepage: https://guillaumegenthial.github.io/sequence-tagging-with-tensorflow.html
- Size: 54.7 KB
- Stars: 1,946
- Watchers: 73
- Forks: 701
- Open Issues: 22
- Metadata Files:
  - Readme: README.md
  - License: LICENSE.txt
README
# Named Entity Recognition with Tensorflow
This repo implements a NER model using Tensorflow (LSTM + CRF + character embeddings).
__A [better implementation is available here, using `tf.data` and `tf.estimator`, and achieves an F1 of 91.21](https://github.com/guillaumegenthial/tf_ner)__
State-of-the-art performance (F1 score between 90 and 91).
Check the [blog post](https://guillaumegenthial.github.io/sequence-tagging-with-tensorflow.html)
## Task
Given a sentence, assign a tag to each word. A classic application is Named Entity Recognition (NER). Here is an example:
```
John lives in New York
B-PER O O B-LOC I-LOC
```
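The tags follow the IOB scheme: `B-` opens an entity, `I-` continues it, and `O` marks tokens outside any entity. As a purely illustrative aside (not code from this repo), a few lines of Python can turn a tag sequence back into entity spans:

```
def iob_to_spans(words, tags):
    """Group IOB tags back into (entity_type, text) spans.

    Illustrative helper, not part of this repository.
    """
    spans, current = [], None
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            if current:
                spans.append(current)
            current = (tag[2:], [word])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(word)
        else:  # "O", or an I- tag without a matching B-
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(tokens)) for label, tokens in spans]


print(iob_to_spans("John lives in New York".split(),
                   ["B-PER", "O", "O", "B-LOC", "I-LOC"]))
# [('PER', 'John'), ('LOC', 'New York')]
```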
## Model

Similar to [Lample et al.](https://arxiv.org/abs/1603.01360) and [Ma and Hovy](https://arxiv.org/pdf/1603.01354.pdf).
- concatenate final states of a bi-lstm on character embeddings to get a character-based representation of each word
- concatenate this representation to a standard word vector representation (GloVe here)
- run a bi-lstm on each sentence to extract contextual representation of each word
- decode with a linear chain CRF
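A minimal sketch of this architecture in `tf.keras` (TensorFlow 2 style). All sizes are placeholder assumptions, masking and the CRF layer are omitted, and the repo itself predates TF 2 and uses the lower-level TensorFlow 1 API, so treat this as an illustration of the shapes rather than the actual model:

```
import tensorflow as tf

# Toy sizes; the real values come from the data and from model/config.py.
n_words, n_chars, n_tags = 10000, 100, 9
max_len, max_word_len = 50, 20
dim_word, dim_char, hidden_char, hidden_lstm = 300, 100, 100, 300

# Inputs: word ids (batch, max_len) and char ids (batch, max_len, max_word_len).
word_ids = tf.keras.Input(shape=(max_len,), dtype="int32")
char_ids = tf.keras.Input(shape=(max_len, max_word_len), dtype="int32")

# 1. Bi-LSTM over the characters of each word -> one vector per word.
char_emb = tf.keras.layers.Embedding(n_chars, dim_char)(char_ids)
char_repr = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(hidden_char)))(char_emb)

# 2. Concatenate with word embeddings (GloVe would be loaded as initial weights).
word_emb = tf.keras.layers.Embedding(n_words, dim_word)(word_ids)
x = tf.keras.layers.Concatenate()([word_emb, char_repr])

# 3. Bi-LSTM over the sentence for contextual word representations.
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(hidden_lstm, return_sequences=True))(x)

# 4. Per-word scores over the tag set; a linear-chain CRF decodes these jointly
#    (the CRF and padding/masking are left out of this sketch).
logits = tf.keras.layers.Dense(n_tags)(x)

model = tf.keras.Model(inputs=[word_ids, char_ids], outputs=logits)
model.summary()
```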
## Getting started

1. Download the GloVe vectors with
```
make glove
```

Alternatively, you can download them manually [here](https://nlp.stanford.edu/projects/glove/) and update the `glove_filename` entry in `config.py`. You can also choose not to load pretrained word vectors by changing the entry `use_pretrained` to `False` in `model/config.py`.
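For orientation, the entries mentioned above are plain Python attributes in the config module. The excerpt below is illustrative only: the names follow the paragraph above, while the value and path are assumptions, so check `model/config.py` for the real definitions.

```
# Illustrative excerpt of a config; names follow the README text above,
# the path is an assumption. See model/config.py for the real definitions.
use_pretrained = True                                 # set to False to train without GloVe
glove_filename = "data/glove.6B/glove.6B.300d.txt"    # assumed download location
```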
2. Build the training data, train and evaluate the model with
```
make run
```

## Details
Here is the breakdown of the commands executed in `make run`:
1. [DO NOT MISS THIS STEP] Build vocab from the data and extract trimmed GloVe vectors according to the config in `model/config.py` (see the sketch below).
```
python build_data.py
```

2. Train the model with
```
python train.py
```

3. Evaluate and interact with the model with
```
python evaluate.py
```

Data iterators and utils are in `model/data_utils.py`, and the model with training/test procedures is in `model/ner_model.py`.
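To make step 1 concrete: "trimmed" GloVe vectors means keeping only the embedding rows for words that occur in the dataset vocabulary and saving them in a compact NumPy file. The sketch below is illustrative only and is not the repo's actual `build_data.py` logic:

```
import numpy as np

def trim_glove(vocab, glove_path, trimmed_path, dim=300):
    """Keep only the GloVe rows for words in `vocab` and save them as .npz.

    Illustrative sketch of what such a build step does; the real
    implementation lives in build_data.py / model/data_utils.py.
    """
    word_to_id = {w: i for i, w in enumerate(vocab)}
    embeddings = np.zeros((len(vocab), dim), dtype=np.float32)
    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split(" ")
            word, vector = parts[0], parts[1:]
            if word in word_to_id:
                embeddings[word_to_id[word]] = np.asarray(vector, dtype=np.float32)
    np.savez_compressed(trimmed_path, embeddings=embeddings)

# e.g. trim_glove(vocab_words, "data/glove.6B.300d.txt", "data/glove.trimmed.npz")
```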
Training time on an NVIDIA Tesla K80 is 110 seconds per epoch on the CoNLL train set using character embeddings and CRF.
## Training Data
The training data must be in the following format (identical to the CoNLL 2003 dataset).
A default test file is provided to help you get started.
```
John B-PER
lives O
in O
New B-LOC
York I-LOC
. O

This O
is O
another O
sentence O
```
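Each non-empty line holds a token and its tag separated by whitespace, and a blank line marks a sentence boundary. The reader below is an illustrative sketch of how such a file can be parsed; the repo's own iterator lives in `model/data_utils.py`:

```
def read_conll(path):
    """Yield (words, tags) pairs from a CoNLL-style file.

    Illustrative reader, not the repo's actual data iterator.
    """
    words, tags = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:                 # blank line = sentence boundary
                if words:
                    yield words, tags
                    words, tags = [], []
            else:
                token, tag = line.split()[:2]
                words.append(token)
                tags.append(tag)
    if words:                            # last sentence without trailing blank line
        yield words, tags
```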
Once you have produced your data files, change the dataset parameters in `config.py`, for example:

```
# dataset
dev_filename = "data/coNLL/eng/eng.testa.iob"
test_filename = "data/coNLL/eng/eng.testb.iob"
train_filename = "data/coNLL/eng/eng.train.iob"
```

## License
This project is licensed under the terms of the Apache 2.0 license (as are TensorFlow and derivatives). If you use it for research, a citation would be appreciated.