https://github.com/amundfr/bert_ned
Named Entity Disambiguation using pretrained BERT word embeddings. Uses spaCy for Named Entity Recognition and candidate generation.
entity-linking named-entity-disambiguation nlp
- Host: GitHub
- URL: https://github.com/amundfr/bert_ned
- Owner: amundfr
- Created: 2020-11-23T12:12:51.000Z (over 5 years ago)
- Default Branch: main
- Last Pushed: 2021-05-26T09:28:47.000Z (about 5 years ago)
- Last Synced: 2024-03-15T14:29:26.437Z (over 2 years ago)
- Topics: entity-linking, named-entity-disambiguation, nlp
- Language: Python
- Homepage:
- Size: 250 KB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 0
# BERT for Named Entity Disambiguation — bert_ned
## TL;DR
To build and run the container:
```bash
make wharfer-build && make wharfer-run
```
To run the full pipeline (data generation, training, evaluation) inside the container:
```bash
make full
```
## Makefile for Docker/Wharfer
The Makefile provides **shorthands** for the Docker and Wharfer commands described below. Navigate to the directory containing this repository and run `make help` for a list of targets.
## Docker
To build and run containers with custom volumes or names, use the Docker commands directly instead of the `make` targets.
Run the image on a machine with a CUDA-enabled GPU to train the model or to use it for inference.
Build the Docker image with GPU support (7.27 GB):
```bash
docker build -t bert_ned .
```
For completeness, there is a CPU version of the Dockerfile (`Dockerfile.CPU`). The CPU container works for the data preparation scripts, but the model runs poorly on a CPU.
Build the image without GPU support (3.19 GB):
```bash
docker build -f Dockerfile.CPU -t bert_ned_cpu .
```
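One way to choose between the two builds is to check whether an NVIDIA GPU is visible on the host. The following is a minimal sketch, assuming `nvidia-smi` is installed wherever a usable driver is present; the helper name `pick_build` is hypothetical and not part of this repository. It only prints the matching build command and does not run it.

```bash
# Hypothetical helper (not part of this repository): report which build fits the host.
pick_build() {
    # Assumption: `nvidia-smi -L` succeeds only when a usable NVIDIA driver and GPU exist.
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
        echo "gpu"
    else
        echo "cpu"
    fi
}

# Print the build command from this README that matches the detected hardware.
case "$(pick_build)" in
    gpu) echo "docker build -t bert_ned ." ;;
    cpu) echo "docker build -f Dockerfile.CPU -t bert_ned_cpu ." ;;
esac
```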
## Docker run
To run the Docker image on an AD machine using files from /nfs/:
```bash
docker run -v /nfs/students/amund-faller-raheim/master_project_bert_ned/data:/bert_ned/data \
-v /nfs/students/amund-faller-raheim/master_project_bert_ned/ex_data:/bert_ned/ex_data \
-v /nfs/students/amund-faller-raheim/master_project_bert_ned/models:/bert_ned/models \
-it --name bert_ned bert_ned
```
Please note: accessing files over NFS makes some operations quite slow. You can instead copy the directories in `/nfs/students/amund-faller-raheim/master_project_bert_ned` to a local directory and mount the local copies into the Docker container.
E.g. `cp -r /nfs/students/amund-faller-raheim/master_project_bert_ned/. /local/data/$(whoami)/` and
```bash
docker run -v /local/data/$(whoami)/data:/bert_ned/data \
-v /local/data/$(whoami)/ex_data:/bert_ned/ex_data \
-v /local/data/$(whoami)/models:/bert_ned/models \
-it --name bert_ned bert_ned
```
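Before starting the container from a local copy, it can help to confirm that the three mounted directories exist. The following is a minimal sketch; the helper name `check_mount_dirs` is hypothetical, and the directory names are taken from the `-v` arguments above.

```bash
# Hypothetical helper (not part of this repository): verify that data, ex_data
# and models exist under the given base directory.
check_mount_dirs() {
    local base="$1" missing=0
    for d in data ex_data models; do
        if [ ! -d "${base}/${d}" ]; then
            echo "missing: ${base}/${d}" >&2
            missing=1
        fi
    done
    # Exit status 0 only when all three directories were found.
    return "${missing}"
}

# Example: check the local copy before running the container.
if check_mount_dirs "/local/data/$(whoami)"; then
    echo "all mount directories present"
else
    echo "copy the directories first"
fi
```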
## Run scripts in container
Instructions should appear **once the container is running**. Run `make help` to see a list of actions. (These come from a second makefile, `Makefile_scripts`.)
You can use `make` to **run scripts** in the container. For example:
```bash
make full
```
This runs the script `bert_ned_full_pipeline.py`, which can perform the full process, from data generation to training and evaluation.
When running the scripts, make sure that the settings in `config.ini` are correct for your environment (or the container, by default). **Leave the settings as provided to reproduce results.**
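To confirm that the settings are still as provided before a run meant to reproduce results, you can compare `config.ini` against an untouched copy. The following is a minimal sketch; the helper name `config_is_pristine` and the backup file name `config.ini.orig` are assumptions, not part of this repository.

```bash
# Hypothetical helper (not part of this repository): succeed only if the
# current config is byte-identical to a pristine copy.
config_is_pristine() {
    local pristine="$1" current="$2"
    # cmp -s prints nothing and exits 0 only when both files are identical.
    cmp -s "${pristine}" "${current}"
}

# Assumed workflow: run `cp config.ini config.ini.orig` once, before any edits.
if [ -f config.ini.orig ] && [ -f config.ini ]; then
    if config_is_pristine config.ini.orig config.ini; then
        echo "config.ini unchanged"
    else
        # Show what differs from the provided settings.
        diff -u config.ini.orig config.ini
    fi
fi
```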