https://github.com/mberr/ea-active-learning

Code for paper "Active Learning for Entity Alignment" (https://arxiv.org/abs/2001.08943)

Host: GitHub
URL: https://github.com/mberr/ea-active-learning
Owner: mberr
License: mit
Created: 2020-12-28T11:00:21.000Z (over 5 years ago)
Default Branch: main
Last Pushed: 2023-05-01T14:12:15.000Z (about 3 years ago)
Last Synced: 2025-04-10T14:35:33.829Z (about 1 year ago)
Topics: active-learning, entity-alignment, knowledge-graph
Language: Python
Homepage:
Size: 81.1 KB
Stars: 6
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Active Learning for Entity Alignment

[![Arxiv](https://img.shields.io/badge/arXiv-2001.08943-b31b1b)](https://arxiv.org/abs/2001.08943)
[![Python 3.8](https://img.shields.io/badge/Python-3.8-2d618c?logo=python)](https://docs.python.org/3.8/)
[![PyTorch](https://img.shields.io/badge/Made%20with-PyTorch-ee4c2c?logo=pytorch)](https://pytorch.org/docs/stable/index.html)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

This repository contains the source code for the paper

```
Active Learning for Entity Alignment
Max Berrendorf*, Evgeniy Faerman*, and Volker Tresp
https://arxiv.org/abs/2001.08943
```

# Installation

Set up and activate a virtual environment:

```shell script
python3.8 -m venv ./venv
source ./venv/bin/activate
```

Install requirements (in this virtual environment):

```shell script
pip install -U pip
pip install -U -r requirements.txt
```

# Preparation

To track results on an MLflow server, first start one by running

```shell script
mlflow server
```

_Note: When storing the results for many configurations, we recommend setting up a database backend following the [instructions](https://mlflow.org/docs/latest/tracking.html)._
For the following examples, we assume that the server is running at

```shell script
TRACKING_URI=http://localhost:5000
```
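Before launching experiments, it can help to confirm that something is listening at `TRACKING_URI`. The following is a minimal sketch using only the Python standard library; `server_is_reachable` is a hypothetical helper and not part of this repository. A successful TCP connection only shows that the port is open, not that the process behind it is MLflow.

```python
import socket
from urllib.parse import urlparse


def server_is_reachable(tracking_uri: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URI's host and port succeeds.

    Hypothetical helper for illustration; not part of the repository.
    """
    parsed = urlparse(tracking_uri)
    if parsed.hostname is None:
        raise ValueError(f"no host in tracking URI: {tracking_uri!r}")
    # Fall back to the scheme's default port when the URI names none.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `server_is_reachable("http://localhost:5000")` should return `True` once `mlflow server` is running with its default settings.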

# Experiments

For all experiments, the results are logged to the running MLflow instance. You can inspect the results during training by opening the `TRACKING_URI` in a browser.
Moreover, all experiments are synced via the MLflow instance.
Thus, you can start multiple instances of each command on different worker machines to parallelize the experiment.

## Random Baseline

To run the random baseline use

```shell script
PYTHONPATH=./src python3 executables/evaluate_active_learning_heuristic.py --phase=random --tracking_uri=${TRACKING_URI}
```

## Hyperparameter Search

To run the hyperparameter search use

```shell script
PYTHONPATH=./src python3 executables/evaluate_active_learning_heuristic.py --phase=hpo --tracking_uri=${TRACKING_URI}
```

_Note: The hyperparameter search takes a significant amount of time (multiple days) and requires access to GPU(s). You can abort the script at any time and inspect the current results via the MLflow web interface._

## Best Configurations

To rerun the best configurations we found in our hyperparameter search use

```shell script
PYTHONPATH=./src python3 executables/evaluate_active_learning_heuristic.py --phase=best --tracking_uri=${TRACKING_URI}
```
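The three experiment commands above differ only in the `--phase` value. As a rough sketch, they can also be assembled and started from Python, for example to launch several worker processes at once. The helpers `build_command` and `launch_workers` are hypothetical and not part of this repository; the script path, flags, and `PYTHONPATH` value are copied from the commands above, and the sketch assumes it is run from the repository root inside the activated virtual environment.

```python
import os
import subprocess
import sys
from typing import Dict, List, Tuple

# Phases documented in this README.
PHASES = ("random", "hpo", "best")
SCRIPT = "executables/evaluate_active_learning_heuristic.py"


def build_command(phase: str, tracking_uri: str) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) mirroring the shell commands shown above."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase {phase!r}; expected one of {PHASES}")
    # sys.executable is the interpreter of the currently active environment.
    argv = [sys.executable, SCRIPT, f"--phase={phase}", f"--tracking_uri={tracking_uri}"]
    # Same effect as prefixing the shell command with PYTHONPATH=./src.
    env = dict(os.environ, PYTHONPATH="./src")
    return argv, env


def launch_workers(phase: str, tracking_uri: str, num_workers: int) -> List[subprocess.Popen]:
    """Start num_workers local processes for one phase (illustrative only)."""
    argv, env = build_command(phase, tracking_uri)
    return [subprocess.Popen(argv, env=env) for _ in range(num_workers)]
```

For instance, `launch_workers("hpo", "http://localhost:5000", num_workers=2)` would start two local processes. To spread work across machines, run the same command on each machine against the same tracking URI.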

# Evaluation

To reproduce the tables and numbers of the paper use

```bash
PYTHONPATH=./src python3 executables/collate_results.py --tracking_uri=${TRACKING_URI}
```

To avoid re-downloading data from a remote MLflow instance, the metrics and parameters are buffered locally. To force a re-download, e.g., because you conducted additional runs, use `--force`.
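The buffering behaviour described here follows a common cache-or-download pattern. The sketch below illustrates that general pattern only; it is not the code in `collate_results.py`, and the function name `load_buffered`, the JSON file format, and the `download` callback are assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict


def load_buffered(
    cache_path: Path,
    download: Callable[[], Dict[str, Any]],
    force: bool = False,
) -> Dict[str, Any]:
    """Return buffered data if present; otherwise download and buffer it.

    Hypothetical illustration of the pattern, not the repository's code.
    """
    # Reuse the local copy unless the caller explicitly asks for fresh data.
    if cache_path.is_file() and not force:
        return json.loads(cache_path.read_text())
    data = download()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(data))
    return data
```

With this pattern, a second call with the same `cache_path` does not invoke `download` again, while `force=True` always does and overwrites the buffer.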
