Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mberr/ea-sota-comparison
Code for paper "A Critical Assessment of State-of-the-Art in Entity Alignment" (https://arxiv.org/abs/2010.16314)
- Host: GitHub
- URL: https://github.com/mberr/ea-sota-comparison
- Owner: mberr
- License: mit
- Created: 2020-10-30T14:44:54.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2023-04-01T10:06:19.000Z (over 1 year ago)
- Last Synced: 2024-05-02T02:50:35.447Z (7 months ago)
- Topics: entity-alignment, graph-neural-network, knowledge-graph, word-embedding-evaluation
- Language: Python
- Homepage:
- Size: 107 KB
- Stars: 16
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# A Critical Assessment of State-of-the-Art in Entity Alignment
[![Arxiv](https://img.shields.io/badge/arXiv-2010.16314-b31b1b)](https://arxiv.org/abs/2010.16314)
[![Python 3.8](https://img.shields.io/badge/Python-3.8-2d618c?logo=python)](https://docs.python.org/3.8/)
[![PyTorch](https://img.shields.io/badge/Made%20with-PyTorch-ee4c2c?logo=pytorch)](https://pytorch.org/docs/stable/index.html)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

This repository contains the source code for the paper
```
A Critical Assessment of State-of-the-Art in Entity Alignment
Max Berrendorf, Ludwig Wacker, and Evgeniy Faerman
https://arxiv.org/abs/2010.16314
```

# Installation
Set up and activate a virtual environment:
```shell script
python3.8 -m venv ./venv
source ./venv/bin/activate
```

Install requirements (in this virtual environment):
```shell script
pip install -U pip
pip install -U -r requirements.txt
```

To run the DGMC scripts, you additionally need to set up
its requirements as described in the corresponding GitHub repository's
[README](https://github.com/rusty1s/deep-graph-matching-consensus/blob/a25f89751f4a3a0d509baa6bbada8b4153c635f6/README.md).
We do not include them in [`requirements.txt`](./requirements.txt),
since their installation is more involved and includes non-Python dependencies.

# Preparation
## MLflow

To track results on an MLflow server, first start it by running
```shell script
mlflow server
```
_Note: When storing results for many configurations, we recommend setting up a
database backend following the [instructions](https://mlflow.org/docs/latest/tracking.html)._
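For example, a SQLite-backed store can be selected via MLflow's `--backend-store-uri` flag (a sketch; the file name `mlflow.db` is an arbitrary choice):

```shell script
mlflow server --backend-store-uri sqlite:///mlflow.db --port 5000
```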
For the following examples, we assume that the server is running at
```shell script
TRACKING_URI=http://localhost:5000
```

## OpenEA RDGCN embeddings
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6985518.svg)](https://doi.org/10.5281/zenodo.6985518)

Please download the RDGCN embeddings extracted with the [OpenEA codebase](https://github.com/nju-websoft/OpenEA/tree/2a6e0b03ec8cdcad4920704d1c38547a3ad72abe)
from [here](https://doi.org/10.5281/zenodo.6985518)
and place them in `~/.kgm/openea_rdgcn_embeddings`.
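The placement step can be sketched as follows (only the target directory from above is assumed; the layout of the Zenodo archive itself is not described here):

```shell script
# create the directory in which the code looks for the embeddings
TARGET="${HOME}/.kgm/openea_rdgcn_embeddings"
mkdir -p "${TARGET}"
# after downloading from Zenodo, move the *.pt files directly into it
ls "${TARGET}"
```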
The file names match the pattern `*_*_15K_V2.pt`, and the files require around 160 MiB of storage in total.

## BERT initialization
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6985518.svg)](https://doi.org/10.5281/zenodo.6985518)

To generate the data for the BERT-based initialization, run
```shell script
(venv) PYTHONPATH=./src python3 executables/prepare_bert.py
```

We also provide preprocessed files at [this URL](https://doi.org/10.5281/zenodo.6985518).
If you prefer to use those, please download and place them in `~/.kgm/bert_prepared`.
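A quick presence check (a sketch using only the path and file pattern stated in this section) could look like:

```shell script
# check whether preprocessed BERT files are already in place
PREP_DIR="${HOME}/.kgm/bert_prepared"
mkdir -p "${PREP_DIR}"
if ls "${PREP_DIR}"/*_bert-base-multilingual-cased_* >/dev/null 2>&1; then
    echo "preprocessed files found in ${PREP_DIR}"
else
    echo "no preprocessed files yet: run executables/prepare_bert.py or download them"
fi
```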
The file names match `*_bert-base-multilingual-cased_*`, and the files require around 6.1 GiB of storage in total.

# Experiments
For all experiments, the results are logged to the running MLflow instance.

_Note: The hyperparameter searches take a significant amount of time (multiple days)
and require access to GPU(s). You can abort the scripts at any time and inspect the
current results via the web interface of MLflow._

## Zero-Shot

For the zero-shot evaluation, run
```shell script
(venv) PYTHONPATH=./src python3 executables/zero_shot.py --tracking_uri=${TRACKING_URI}
```

## GCN-Align

To run the hyperparameter search, run
```shell script
(venv) PYTHONPATH=./src python3 executables/tune_gcn_align.py --tracking_uri=${TRACKING_URI}
```

## RDGCN

To run the hyperparameter search, run
```shell script
(venv) PYTHONPATH=./src python3 executables/tune_rdgcn.py --tracking_uri=${TRACKING_URI}
```

## DGMC

To run the hyperparameter search, run
```shell script
(venv) PYTHONPATH=./src python3 executables/tune_dgmc.py --tracking_uri=${TRACKING_URI}
```

# Evaluation

To summarize the dataset statistics, run
```shell script
(venv) PYTHONPATH=./src python3 executables/summarize.py --target datasets --force
```

To summarize all experiments, run
```shell script
(venv) PYTHONPATH=./src python3 executables/summarize.py --target results --tracking_uri=${TRACKING_URI} --force
```

To generate the ablation study table, run
```shell script
(venv) PYTHONPATH=./src python3 executables/summarize.py --target ablation --tracking_uri=${TRACKING_URI} --force
```
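The three summarize calls can also be chained. This sketch only prints the commands (reusing the `TRACKING_URI` from the Preparation section; drop the `echo` to actually execute them):

```shell script
TRACKING_URI=http://localhost:5000
# dataset statistics do not need the tracking server
echo "PYTHONPATH=./src python3 executables/summarize.py --target datasets --force"
# results and ablation read from the MLflow instance
for TARGET in results ablation; do
    echo "PYTHONPATH=./src python3 executables/summarize.py --target ${TARGET} --tracking_uri=${TRACKING_URI} --force"
done
```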