Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mberr/ea-sota-comparison
Code for paper "A Critical Assessment of State-of-the-Art in Entity Alignment" (https://arxiv.org/abs/2010.16314)
- Host: GitHub
- URL: https://github.com/mberr/ea-sota-comparison
- Owner: mberr
- License: mit
- Created: 2020-10-30T14:44:54.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2023-04-01T10:06:19.000Z (over 1 year ago)
- Last Synced: 2024-05-02T02:50:35.447Z (7 months ago)
- Topics: entity-alignment, graph-neural-network, knowledge-graph, word-embedding-evaluation
- Language: Python
- Homepage:
- Size: 107 KB
- Stars: 16
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# A Critical Assessment of State-of-the-Art in Entity Alignment
[![Arxiv](https://img.shields.io/badge/arXiv-2010.16314-b31b1b)](https://arxiv.org/abs/2010.16314)
[![Python 3.8](https://img.shields.io/badge/Python-3.8-2d618c?logo=python)](https://docs.python.org/3.8/)
[![PyTorch](https://img.shields.io/badge/Made%20with-PyTorch-ee4c2c?logo=pytorch)](https://pytorch.org/docs/stable/index.html)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

This repository contains the source code for the paper
```
A Critical Assessment of State-of-the-Art in Entity Alignment
Max Berrendorf, Ludwig Wacker, and Evgeniy Faerman
https://arxiv.org/abs/2010.16314
```

# Installation
Set up and activate a virtual environment:
```shell script
python3.8 -m venv ./venv
source ./venv/bin/activate
```

Install requirements (in this virtual environment):
```shell script
pip install -U pip
pip install -U -r requirements.txt
```

To run the DGMC scripts, you additionally need to set up
its requirements as described in the corresponding GitHub repository's
[README](https://github.com/rusty1s/deep-graph-matching-consensus/blob/a25f89751f4a3a0d509baa6bbada8b4153c635f6/README.md).
We do not include them in [`requirements.txt`](./requirements.txt),
since their installation is more involved and includes non-Python dependencies.

# Preparation
## MLflow

To track results on an MLflow server, first start it by running
```shell script
mlflow server
```
_Note: When storing results for many configurations, we recommend setting up a
database backend following the [instructions](https://mlflow.org/docs/latest/tracking.html)._
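For example, a SQLite-backed store can be selected via MLflow's `--backend-store-uri` flag (a sketch; the file name `mlflow.db` is an arbitrary choice):

```shell script
mlflow server --backend-store-uri sqlite:///mlflow.db --port 5000
```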
For the following examples, we assume that the server is running at
```shell script
TRACKING_URI=http://localhost:5000
```

## OpenEA RDGCN embeddings
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6985518.svg)](https://doi.org/10.5281/zenodo.6985518)

Please download the RDGCN embeddings extracted with the [OpenEA codebase](https://github.com/nju-websoft/OpenEA/tree/2a6e0b03ec8cdcad4920704d1c38547a3ad72abe)
from [here](https://doi.org/10.5281/zenodo.6985518)
and place them in `~/.kgm/openea_rdgcn_embeddings`.
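The placement step can be sketched as follows (only the target directory from above is assumed; the layout of the Zenodo archive itself is not described here):

```shell script
# create the directory in which the code looks for the embeddings
TARGET="${HOME}/.kgm/openea_rdgcn_embeddings"
mkdir -p "${TARGET}"
# after downloading from Zenodo, move the *.pt files directly into it
ls "${TARGET}"
```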
The file names match the pattern `*_*_15K_V2.pt`, and the files require around 160 MiB of storage in total.

## BERT initialization
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6985518.svg)](https://doi.org/10.5281/zenodo.6985518)

To generate the data for the BERT-based initialization, run
```shell script
(venv) PYTHONPATH=./src python3 executables/prepare_bert.py
```

We also provide preprocessed files at [this URL](https://doi.org/10.5281/zenodo.6985518).
If you prefer to use those, please download and place them in `~/.kgm/bert_prepared`.
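A quick presence check (a sketch using only the path and file pattern stated in this section) could look like:

```shell script
# check whether preprocessed BERT files are already in place
PREP_DIR="${HOME}/.kgm/bert_prepared"
mkdir -p "${PREP_DIR}"
if ls "${PREP_DIR}"/*_bert-base-multilingual-cased_* >/dev/null 2>&1; then
    echo "preprocessed files found in ${PREP_DIR}"
else
    echo "no preprocessed files yet: run executables/prepare_bert.py or download them"
fi
```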
The file names match `*_bert-base-multilingual-cased_*`, and the files require around 6.1 GiB of storage in total.

# Experiments
For all experiments, the results are logged to the running MLflow instance.

_Note: The hyperparameter searches take a significant amount of time (multiple days)
and require access to GPU(s). You can abort the scripts at any time and inspect the
current results via the web interface of MLflow._

## Zero-Shot

For the zero-shot evaluation, run
```shell script
(venv) PYTHONPATH=./src python3 executables/zero_shot.py --tracking_uri=${TRACKING_URI}
```

## GCN-Align

To run the hyperparameter search, run
```shell script
(venv) PYTHONPATH=./src python3 executables/tune_gcn_align.py --tracking_uri=${TRACKING_URI}
```

## RDGCN

To run the hyperparameter search, run
```shell script
(venv) PYTHONPATH=./src python3 executables/tune_rdgcn.py --tracking_uri=${TRACKING_URI}
```

## DGMC

To run the hyperparameter search, run
```shell script
(venv) PYTHONPATH=./src python3 executables/tune_dgmc.py --tracking_uri=${TRACKING_URI}
```

# Evaluation

To summarize the dataset statistics, run
```shell script
(venv) PYTHONPATH=./src python3 executables/summarize.py --target datasets --force
```

To summarize all experiments, run
```shell script
(venv) PYTHONPATH=./src python3 executables/summarize.py --target results --tracking_uri=${TRACKING_URI} --force
```

To generate the ablation study table, run
```shell script
(venv) PYTHONPATH=./src python3 executables/summarize.py --target ablation --tracking_uri=${TRACKING_URI} --force
```
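The three summarize calls can also be chained. This sketch only prints the commands (reusing the `TRACKING_URI` from the Preparation section; drop the `echo` to actually execute them):

```shell script
TRACKING_URI=http://localhost:5000
# dataset statistics do not need the tracking server
echo "PYTHONPATH=./src python3 executables/summarize.py --target datasets --force"
# results and ablation read from the MLflow instance
for TARGET in results ablation; do
    echo "PYTHONPATH=./src python3 executables/summarize.py --target ${TARGET} --tracking_uri=${TRACKING_URI} --force"
done
```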