https://github.com/mberr/ea-sota-comparison
Code for paper "A Critical Assessment of State-of-the-Art in Entity Alignment" (https://arxiv.org/abs/2010.16314)
- Host: GitHub
- URL: https://github.com/mberr/ea-sota-comparison
- Owner: mberr
- License: mit
- Created: 2020-10-30T14:44:54.000Z (over 5 years ago)
- Default Branch: main
- Last Pushed: 2023-04-01T10:06:19.000Z (about 3 years ago)
- Last Synced: 2025-04-10T14:35:34.189Z (about 1 year ago)
- Topics: entity-alignment, graph-neural-network, knowledge-graph, word-embedding-evaluation
- Language: Python
- Size: 107 KB
- Stars: 16
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# A Critical Assessment of State-of-the-Art in Entity Alignment
[arXiv](https://arxiv.org/abs/2010.16314)
[Python 3.8](https://docs.python.org/3.8/)
[PyTorch](https://pytorch.org/docs/stable/index.html)
[License: MIT](https://opensource.org/licenses/MIT)
This repository contains the source code for the paper
```
A Critical Assessment of State-of-the-Art in Entity Alignment
Max Berrendorf, Ludwig Wacker, and Evgeniy Faerman
https://arxiv.org/abs/2010.16314
```
# Installation
Set up and activate a virtual environment:
```shell script
python3.8 -m venv ./venv
source ./venv/bin/activate
```
Install requirements (in this virtual environment):
```shell script
pip install -U pip
pip install -U -r requirements.txt
```
In order to run the DGMC scripts, you additionally need to set up
its requirements as described in the corresponding GitHub repository's
[README](https://github.com/rusty1s/deep-graph-matching-consensus/blob/a25f89751f4a3a0d509baa6bbada8b4153c635f6/README.md).
We do not include them in [`requirements.txt`](./requirements.txt),
since their installation is a bit more involved and includes non-Python dependencies.
# Preparation
## MLflow
In order to track results to an MLflow server, start it first by running
```shell script
mlflow server
```
_Note: When storing the results for many configurations, we recommend setting up a
database backend following the [instructions](https://mlflow.org/docs/latest/tracking.html)._
For the following examples, we assume that the server is running at
```shell script
TRACKING_URI=http://localhost:5000
```
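Before launching long-running experiments, it can help to confirm that the tracking server is actually reachable. The following is a minimal sketch and not part of this repository: the helper name `is_tracking_server_reachable` is hypothetical, and it assumes that the MLflow server answers on a `/health` endpoint, which may differ between MLflow versions.

```python
import urllib.error
import urllib.request


def is_tracking_server_reachable(tracking_uri: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET on `<tracking_uri>/health` answers with status 200.

    Hypothetical helper; the `/health` path is an assumption about the MLflow server.
    """
    url = tracking_uri.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URI.
        return False
```

Calling `is_tracking_server_reachable("http://localhost:5000")` from the activated virtual environment returns `False` if no server is listening at that address.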
## OpenEA RDGCN embeddings
Please download the RDGCN embeddings extracted with the [OpenEA codebase](https://github.com/nju-websoft/OpenEA/tree/2a6e0b03ec8cdcad4920704d1c38547a3ad72abe)
from [here](https://doi.org/10.5281/zenodo.6985518)
and place them in `~/.kgm/openea_rdgcn_embeddings`.
Their file names match the pattern `*_*_15K_V2.pt`, and they require around 160 MiB of storage in total.
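To verify that the download ended up in the expected place, the directory can be scanned for files matching the pattern given above. This is a sketch under stated assumptions: the function `find_embedding_files` is a hypothetical helper, not part of the repository, and it only checks file names, not file contents.

```python
from pathlib import Path
from typing import List

# Directory and file name pattern as stated in the text above.
EMBEDDING_DIR = Path("~/.kgm/openea_rdgcn_embeddings").expanduser()
EMBEDDING_PATTERN = "*_*_15K_V2.pt"


def find_embedding_files(
    directory: Path = EMBEDDING_DIR,
    pattern: str = EMBEDDING_PATTERN,
) -> List[Path]:
    """Return the sorted files in `directory` whose names match `pattern`.

    Hypothetical helper; returns an empty list if the directory does not exist.
    """
    if not directory.is_dir():
        return []
    return sorted(path for path in directory.glob(pattern) if path.is_file())
```

An empty result from `find_embedding_files()` suggests that the files are missing or were placed in a different directory.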
## BERT initialization
To generate data for the BERT-based initialization, run
```shell script
(venv) PYTHONPATH=./src python3 executables/prepare_bert.py
```
We also provide preprocessed files at [this URL](https://doi.org/10.5281/zenodo.6985518).
If you prefer to use those, please download them and place them in `~/.kgm/bert_prepared`.
Their file names match the pattern `*_bert-base-multilingual-cased_*`, and they require around 6.1 GiB of storage in total.
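Given the size of the preprocessed files, it may be worth checking the available disk space before downloading. The sketch below uses only the Python standard library; `has_free_space` is a hypothetical helper and not part of the repository.

```python
import shutil
from pathlib import Path

GIB = 1024 ** 3  # bytes per GiB


def has_free_space(target: Path, required_gib: float) -> bool:
    """Check whether the filesystem holding `target` has at least `required_gib` GiB free.

    Hypothetical helper; `target` does not need to exist yet.
    """
    # Walk up to the nearest existing ancestor, since `target` may not exist yet.
    probe = target.expanduser().resolve()
    while not probe.exists():
        probe = probe.parent
    return shutil.disk_usage(probe).free >= required_gib * GIB
```

For the directory named above, the check would be `has_free_space(Path("~/.kgm/bert_prepared"), 6.1)`.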
# Experiments
For all experiments, the results are logged to the running MLflow instance.
_Note: The hyperparameter searches take a significant amount of time (multiple days)
and require access to GPU(s). You can abort a script at any time and inspect the
current results via the web interface of MLflow._
## Zero-Shot
To run the zero-shot evaluation, run
```shell script
(venv) PYTHONPATH=./src python3 executables/zero_shot.py --tracking_uri=${TRACKING_URI}
```
## GCN-Align
To run the hyperparameter search, run
```shell script
(venv) PYTHONPATH=./src python3 executables/tune_gcn_align.py --tracking_uri=${TRACKING_URI}
```
## RDGCN
To run the hyperparameter search, run
```shell script
(venv) PYTHONPATH=./src python3 executables/tune_rdgcn.py --tracking_uri=${TRACKING_URI}
```
## DGMC
To run the hyperparameter search, run
```shell script
(venv) PYTHONPATH=./src python3 executables/tune_dgmc.py --tracking_uri=${TRACKING_URI}
```
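The three hyperparameter searches above share the same invocation pattern, so they can also be launched one after another from a small driver script. The following is a sketch and not part of the repository: the function names are hypothetical, and it assumes that it is started from the repository root inside the activated virtual environment, with the DGMC requirements installed.

```python
import os
import subprocess
import sys
from typing import Dict, List

# Script names taken from the sections above.
TUNING_SCRIPTS = ("tune_gcn_align.py", "tune_rdgcn.py", "tune_dgmc.py")


def build_command(script: str, tracking_uri: str) -> List[str]:
    """Build the argument list for one executable, mirroring the shell commands above."""
    return [
        sys.executable,  # the interpreter of the active virtual environment
        os.path.join("executables", script),
        f"--tracking_uri={tracking_uri}",
    ]


def build_environment() -> Dict[str, str]:
    """Copy the current environment and set PYTHONPATH=./src, as in the shell commands."""
    env = dict(os.environ)
    env["PYTHONPATH"] = "./src"
    return env


def run_all(tracking_uri: str = "http://localhost:5000") -> None:
    """Run the searches sequentially; `check=True` stops at the first failing script."""
    for script in TUNING_SCRIPTS:
        subprocess.run(build_command(script, tracking_uri), env=build_environment(), check=True)
```

Running the searches sequentially keeps GPU usage predictable, at the cost of a longer total runtime; since each search may take days, starting them separately on different machines may be preferable.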
# Evaluation
To summarize the dataset statistics, run
```shell script
(venv) PYTHONPATH=./src python3 executables/summarize.py --target datasets --force
```
To summarize all experiments, run
```shell script
(venv) PYTHONPATH=./src python3 executables/summarize.py --target results --tracking_uri=${TRACKING_URI} --force
```
To generate the ablation study table, run
```shell script
(venv) PYTHONPATH=./src python3 executables/summarize.py --target ablation --tracking_uri=${TRACKING_URI} --force
```