https://github.com/tshu-w/ember

Code and data for the paper "Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction" (IJCAI 2022)
benchmark entity-matching entity-resolution ijcai2022

Host: GitHub
URL: https://github.com/tshu-w/ember
Owner: tshu-w
Created: 2021-07-28T11:34:49.000Z
Default Branch: main
Last Pushed: 2025-07-22T08:19:09.000Z
Last Synced: 2025-10-13T03:34:34.440Z
Topics: benchmark, entity-matching, entity-resolution, ijcai2022
Language: Python
Homepage: https://tshu-w.github.io/ember/
Size: 29.8 MB
Stars: 6
Watchers: 1
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md
- Citation: CITATION.bib

README

# Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction

## Description
Code and data for the paper:

*Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction*

## Data
Details of the released data are in the data [README](./data/ali/README.md).

## How to run
First, install the dependencies:
```console
# clone project
git clone https://github.com/tshu-w/EMBer
cd EMBer

# [SUGGESTED] use conda environment
conda env create -n ember -f environment.yaml
conda activate ember

# [ALTERNATIVE] install requirements directly
pip install -r requirements.txt
```

Next, to obtain the main results of the paper:
```console
bash scripts/download_images.sh

python scripts/run_ali.py --gpus 0 1 2 3
python scripts/test_ali.py --gpus 0 1 2 3
python scripts/run_dm_ali.py --gpus 0 1 2 3
python scripts/test_dm_ali.py --gpus 0 1 2 3

python scripts/print_results results/test -k test/f1 test/prc test/rec
```
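
The keys `test/f1`, `test/prc`, and `test/rec` passed to the results script presumably stand for F1, precision, and recall on the test split. That reading is an assumption based on the key names; the README does not define them. As a reminder of how the three metrics relate for binary match/non-match labels, here is a minimal sketch in plain Python. It is independent of the repository's code, and `precision_recall_f1` is a hypothetical helper name.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    labels: Sequence[int], preds: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for binary labels where 1 means "match"."""
    if len(labels) != len(preds):
        raise ValueError("labels and preds must have the same length")
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    # Fall back to 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Toy example: 3 true matches, 3 predicted matches, 2 of them correct.
labels = [1, 1, 1, 0, 0]
preds = [1, 1, 0, 1, 0]
prc, rec, f1 = precision_recall_f1(labels, preds)
print(f"prc={prc:.3f} rec={rec:.3f} f1={f1:.3f}")
```

In the toy example there are two true positives, one false positive, and one false negative, so all three metrics come out at 2/3.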

You can also run experiments with the `run` script.
```console
# fit with the TextMatcher config
./run fit --config configs/ali_tm.yaml
# or specific command line arguments
./run fit --model TextMatcher --data AliDataModule --data.batch_size 32 --trainer.gpus 0,

# evaluate with the checkpoint
./run test --config configs/ali_tm.yaml --ckpt_path ckpt_path

# get the script help
./run --help
./run fit --help
```
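
If you want to script these invocations from Python, the argument lists can be assembled first and then handed to `subprocess.run`. The sketch below uses only the subcommands and flags shown above. `build_run_command` is a hypothetical helper, not part of the repository, and `ckpt_path` remains a placeholder for a real checkpoint file.

```python
import shlex
from typing import List, Optional


def build_run_command(
    subcommand: str, config: str, ckpt_path: Optional[str] = None
) -> List[str]:
    """Assemble an argument list for the `run` script (hypothetical helper)."""
    # Only the two subcommands documented in this README are accepted here.
    if subcommand not in {"fit", "test"}:
        raise ValueError(f"unsupported subcommand: {subcommand}")
    cmd = ["./run", subcommand, "--config", config]
    if ckpt_path is not None:
        cmd += ["--ckpt_path", ckpt_path]
    return cmd


fit_cmd = build_run_command("fit", "configs/ali_tm.yaml")
# "ckpt_path" is the README's placeholder, not a real checkpoint.
test_cmd = build_run_command("test", "configs/ali_tm.yaml", ckpt_path="ckpt_path")
print(shlex.join(fit_cmd))
print(shlex.join(test_cmd))
# To execute from the repository root: subprocess.run(fit_cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when a config or checkpoint path contains spaces.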

## Citation
```
@inproceedings{ijcai2022p0552,
title = {Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction},
author = {Wang, Tianshu and Lin, Hongyu and Fu, Cheng and Han, Xianpei and Sun, Le and Xiong, Feiyu and Chen, Hui and Lu, Minlong and Zhu, Xiuwen},
booktitle = {Proceedings of the Thirty-First International Joint Conference on
Artificial Intelligence, {IJCAI-22}},
publisher = {International Joint Conferences on Artificial Intelligence Organization},
editor = {Luc De Raedt},
pages = {3978--3984},
year = {2022},
month = {7},
note = {Main Track},
doi = {10.24963/ijcai.2022/552},
url = {https://doi.org/10.24963/ijcai.2022/552},
}
```
