{"id":28098968,"url":"https://github.com/thu-keg/eakit","last_synced_at":"2025-05-13T17:59:20.340Z","repository":{"id":96296385,"uuid":"256448870","full_name":"THU-KEG/EAkit","owner":"THU-KEG","description":"Entity Alignment toolkit (EAkit), a lightweight, easy-to-use and highly extensible PyTorch implementation of many entity alignment algorithms.","archived":false,"fork":false,"pushed_at":"2022-10-24T07:15:21.000Z","size":32272,"stargazers_count":166,"open_issues_count":0,"forks_count":22,"subscribers_count":12,"default_branch":"master","last_synced_at":"2023-10-20T23:28:58.055Z","etag":null,"topics":["entity-alignment","knowledge-embedding","knowledge-graph"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/THU-KEG.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2020-04-17T08:41:24.000Z","updated_at":"2023-10-20T23:28:58.638Z","dependencies_parsed_at":"2023-04-18T08:04:09.615Z","dependency_job_id":null,"html_url":"https://github.com/THU-KEG/EAkit","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THU-KEG%2FEAkit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THU-KEG%2FEAkit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THU-KEG%2FEAkit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THU-KEG%2FEAkit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/THU-KEG","download_url":"https://codeload.github.com/
THU-KEG/EAkit/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254000169,"owners_count":21997400,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["entity-alignment","knowledge-embedding","knowledge-graph"],"created_at":"2025-05-13T17:59:18.987Z","updated_at":"2025-05-13T17:59:20.310Z","avatar_url":"https://github.com/THU-KEG.png","language":"Python","readme":"# EAkit\n*Entity Alignment toolkit* (EAkit), a lightweight, easy-to-use and highly extensible PyTorch implementation of many entity alignment algorithms. The algorithm list is from [Entity_Alignment_Papers](https://github.com/THU-KEG/Entity_Alignment_Papers).\n\n**Table of Contents**\n1. [Design](#Design)\n2. [Organization](#Organization)\n3. [Usage](#Usage)\n    1. [Run an implemented model](#Run-an-implemented-model)\n        1. [Semantic Matching Models](#Semantic-Matching-Models)\n        2. [GNN-based Models](#GNN-based-Models)\n        3. [KE-based Models](#KE-based-Models)\n        4. [Results](#Results)\n    2. [Write a new model](#Write-a-new-model)\n4. [Dataset](#Dataset)\n5. [Reqirements](#Reqirements)\n6. [TODO](#TODO)\n7. 
[Acknowledgement](#Acknowledgement)\n\n\n## Design\nWe survey the existing entity alignment algorithms, modularize their components, and define an abstract structure, **1 Encoder - N Decoder(s)**, in which each module is a specific implementation of an encoder or a decoder, so that the structure of each algorithm can be reconstructed.\n\n![Framework of EAkit](examples/EAkit_framework.png)\n\n\n\n## Organization\n```\n./EAkit\n├── README.md                           # Doc of EAkit\n├── _runs                               # Tensorboard log dir\n├── data                                # Datasets. (unzip data.zip)\n│   └── DBP15K\n├── examples                            # Shell scripts of implemented algorithms\n│   ├── Tensorboard.sh                  # Start Tensorboard visualization\n│   ├── run_BootEA.sh\n│   ├── run_ComplEx.sh\n│   ├── run_ConvE.sh\n│   ├── run_DistMult.sh\n│   ├── run_GCN-Align.sh\n│   ├── run_HAKE.sh\n│   ├── run_KECG.sh\n│   ├── run_MMEA.sh\n│   ├── run_MTransE.sh\n│   ├── run_NAEA.sh\n│   ├── run_RotatE.sh\n│   ├── run_TransE.sh\n│   ├── run_TransEdge.sh\n│   ├── run_TransH.sh\n│   └── run_TransR.sh\n├── load_data.py                        # Load datasets. (data adapter)\n├── models.py                           # Encoders \u0026 Decoders\n├── run.py                              # Main\n├── semi_utils.py                       # Bootstrap strategy\n└── utils.py                            # Sampling methods, ...\n```\n\n\n\n## Usage\n\n### Run an implemented model\n\n1. Start TensorBoard for metrics visualization (run under `examples/`):\n```\n./Tensorboard.sh\n```\n\n2. 
Modify and run a script as follows (examples are under `examples/`):\n```\nCUDA_VISIBLE_DEVICES=0 python3 run.py --log gcnalign \\\n                                    --data_dir \"data/DBP15K/zh_en\" \\\n                                    --rate 0.3 \\\n                                    --epoch 1000 \\\n                                    --check 10 \\\n                                    --update 10 \\\n                                    --train_batch_size -1 \\\n                                    --encoder \"GCN-Align\" \\\n                                    --hiddens \"100,100,100\" \\\n                                    --decoder \"Align\" \\\n                                    --sampling \"N\" \\\n                                    --k \"25\" \\\n                                    --margin \"1\" \\\n                                    --alpha \"1\" \\\n                                    --feat_drop 0.0 \\\n                                    --lr 0.005 \\\n                                    --train_dist \"euclidean\" \\\n                                    --test_dist \"euclidean\"\n```\n\nIn detail, the following methods are currently implemented:\n\n\n#### Semantic Matching Models\n\n| Method |  | Encoder | Decoder |\n| ---- | ---- | ---- | ---- |\n| **[MTransE](https://www.ijcai.org/proceedings/2017/0209.pdf)** from Chen *et al.* (IJCAI 2017) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_MTransE.sh)\\], \\[[origin](https://github.com/muhaochen/MTransE)\\] | None | TransE, MTransE_Align |\n| **[BootEA](https://www.ijcai.org/proceedings/2018/0611.pdf)** from Sun *et al.* (IJCAI 2018) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_BootEA.sh)\\], \\[[origin](https://github.com/nju-websoft/BootEA)\\] | None | AlignEA |\n| **[TransEdge](https://link.springer.com/chapter/10.1007/978-3-030-30793-6_35)** from Sun *et al.* (ISWC 2019) | 
\\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_TransEdge.sh)\\], \\[[origin](https://github.com/nju-websoft/TransEdge)\\] | None | TransEdge |\n| **[MMEA](https://www.aclweb.org/anthology/D19-1075.pdf)** from Shi *et al.* (EMNLP 2019) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_MMEA.sh)\\], [origin] | None | MMEA |\n\n\n#### GNN-based Models\n\n| Method |  | Encoder | Decoder |\n| ---- | ---- | ---- | ---- |\n| **[GCN-Align](https://www.aclweb.org/anthology/D18-1032.pdf)** from Wang *et al.* (EMNLP 2018)  | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_GCN-Align.sh)\\], \\[[origin](https://github.com/1049451037/GCN-Align)\\] | GCN-Align | Align |\n| **[NAEA](https://www.ijcai.org/proceedings/2019/0269.pdf)** from Zhu *et al.* (IJCAI 2019) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_NAEA.sh)\\], [origin] | NAEA | \\[N_TransE\\], N_TransE, N_R_Align |\n| **[KECG](https://www.aclweb.org/anthology/D19-1274.pdf)** from Li *et al.* (EMNLP 2019) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_KECG.sh)\\], \\[[origin](https://github.com/THU-KEG/KECG)\\] | KECG | TransE, Align |\n\n\n#### KE-based Models\n\n| Method |  | Encoder | Decoder |\n| ---- | ---- | ---- | ---- |\n| **[TransE](https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/viewFile/9571/9523)** from Bordes *et al.* (NIPS 2013) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_TransE.sh)\\],  | None | TransE |\n| **[TransH](https://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/view/8531/8546)** from Wang *et al.* (AAAI 2014) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_TransH.sh)\\],  | None | TransH |\n| **[TransR](https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/viewFile/9571/9523)** from Lin *et al.* (AAAI 2015) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_TransR.sh)\\],  | None | TransR |\n| 
**[RotatE](https://openreview.net/pdf?id=HkgEQnRqYQ)** from Sun *et al.* (ICLR 2019) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_RotatE.sh)\\],  | None | RotatE |\n| **[HAKE](https://arxiv.org/pdf/1911.09419)** from Zhang *et al.* (AAAI 2020) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_HAKE.sh)\\],  | None | HAKE |\n| **[DistMult](https://arxiv.org/pdf/1412.6575)** from Yang *et al.* (ICLR 2015) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_DistMult.sh)\\],  | None | DistMult |\n| **[ComplEx](http://proceedings.mlr.press/v48/trouillon16.pdf)** from Trouillon *et al.* (ICML 2016) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_ComplEx.sh)\\],  | None | ComplEx |\n| **[ConvE](https://arxiv.org/pdf/1707.01476)** from Dettmers *et al.* (AAAI 2018) | \\[[sh](https://github.com/THU-KEG/EAkit/blob/master/examples/run_ConvE.sh)\\],  | None | ConvE |\n\n\n#### Results\nResults on DBP15K (zh_en, ja_en, fr_en). The three column groups are assumed to follow the order in which the language pairs are listed.\n\n| Method | zh_en Hits@1 | zh_en Hits@10 | zh_en MRR | ja_en Hits@1 | ja_en Hits@10 | ja_en MRR | fr_en Hits@1 | fr_en Hits@10 | fr_en MRR |\n|-|-|-|-|-|-|-|-|-|-|\n| **MTransE** | 0.419 | 0.753 | 0.535 | 0.433 | 0.773 | 0.549 | 0.407 | 0.751 | 0.526 |\n| **BootEA** | 0.490 | 0.793 | 0.593 | 0.499 | 0.813 | 0.605 | 0.515 | 0.838 | 0.623 |\n| **TransEdge** | 0.519 | 0.813 | 0.621 | 0.526 | 0.825 | 0.632 | 0.397 | 0.824 | 0.543 |\n| **MMEA** | 0.405 | 0.672 | 0.499 | 0.397 | 0.680 | 0.496 | 0.442 | 0.749 | 0.550 |\n| **GCN-Align** | 0.410 | 0.756 | 0.527 | 0.442 | 0.810 | 0.566 | 0.430 | 0.813 | 0.557 |\n| **NAEA** | 0.323 | 0.481 | 0.381 | 0.311 | 0.457 | 0.363 | 0.307 | 0.460 | 0.362 |\n| **KECG** | 0.467 | 0.815 | 0.586 | 0.485 | 0.843 | 0.605 | 0.479 | 0.844 | 0.602 |\n| **TransE** | 0.343 | 0.634 | 0.441 | 0.365 | 0.710 | 0.480 | 0.374 | 0.735 | 0.493 |\n| **TransH** | 0.436 | 0.735 | 0.540 | 0.450 | 0.778 | 0.561 | 0.485 | 0.821 | 0.599 |\n| **TransR** | 0.371 | 0.697 | 0.481 | 0.368 | 0.709 | 0.484 | 0.378 | 0.741 | 0.497 |\n| 
**RotatE** | 0.423 | 0.754 | 0.534 | 0.448 | 0.785 | 0.561 | 0.439 | 0.800 | 0.560 |\n| **HAKE** | 0.288 | 0.588 | 0.391 | 0.319 | 0.607 | 0.421 | 0.319 | 0.638 | 0.428 |\n| **DistMult** | 0.180 | 0.400 | 0.255 | 0.058 | 0.179 | 0.099 | 0.095 | 0.285 | 0.157 |\n| **ComplEx** | 0.115 | 0.265 | 0.166 | 0.063 | 0.251 | 0.146 | 0.141 | 0.332 | 0.206 |\n| **ConvE** | 0.210 | 0.466 | 0.299 | 0.339 | 0.556 | 0.415 | 0.350 | 0.602 | 0.439 |\n\n\n### Write a new model\n1. Divide the algorithm at the abstract level to obtain the structure of 1 (or 0) Encoder and 1 (or more) Decoder(s).\n2. Register the modules and add extra parameters in the top-level encoder (class Encoder) and top-level decoder (class Decoder) in `models.py`.\n3. Implement the concrete encoding module (class Encoder_Instance) and decoding module(s) (class Decoder_Instance) according to the given template.\n4. Write an execution script (XXX.sh) with parameter settings to run the new model.\n5. (Optional) Adapt `load_data.py` to a new dataset and add a new sampling strategy in `utils.py`.\n\n![Example of writing a new model](examples/EAkit_eg.png)\n\n\n\n## Dataset\n(Currently, EAkit only supports DBP15K, but it is easy to adapt to other datasets.)\n\n- **DBP15K** is from the \"mapping\" folder of [JAPE](https://github.com/nju-websoft/JAPE) (you need to combine \"ref_ent_ids\" and \"sup_ent_ids\" into a single file named \"ill_ent_ids\").\n\nAlternatively, [download](https://1drv.ms/u/s!AmQC2vZKsxjzhyCstNUSt2QVQgzi?e=WRE0cA) the data file and unpack it directly:\n```\nunzip data.zip\n```\n\n\n\n## Reqirements\n- Python3 (tested on 3.7.7)\n- [PyTorch](https://pytorch.org/) (tested on 1.4.0)\n- PyTorch Geometric ([PyG](https://github.com/rusty1s/pytorch_geometric)) (tested on 1.4.3)\n- [TensorBoard](https://www.tensorflow.org/tensorboard/) (tested on 2.0.2)\n- NumPy\n- SciPy\n- scikit-learn\n- [Graph-tool](https://git.skewed.de/count0/graph-tool/wikis/installation-instructions) (only if you use bootstrapping)\n\n\n\n## TODO\n- 
[ ] Results of BootEA, TransEdge, MMEA, and NAEA are not satisfactory and need debugging (possibly in the bootstrapping process).\n\nThere are still many algorithms that need to be implemented (integrated):\n- **Semantic Matching Models**: NTAM, AttrE, CEAFF, ...\n- **GNN-based Models**: AVR-GCN, AliNet, MRAEA, CG-MuAlign, RDGCN, HGCN, GMNN, ...\n- **KE-based Models**: TransD, CapsE, ...\n- **GAN-based Models**: SEA, AKE, ...\n- **Other Models**: OTEA, ...\n\nFind algorithms from [Entity_Alignment_Papers](https://github.com/THU-KEG/Entity_Alignment_Papers).\n\n[Pull requests](https://github.com/THU-KEG/EAkit/pulls) for **implementing algorithms** \u0026 **updating (reproducible) results with shell scripts** are welcome!\n\n\n\n\n## Acknowledgement\nWe referred to code from the following repos and appreciate their great contributions: [PyTorch Geometric](https://github.com/rusty1s/pytorch_geometric), [BootEA](https://github.com/nju-websoft/BootEA), [TransEdge](https://github.com/nju-websoft/TransEdge), [AliNet](https://github.com/nju-websoft/AliNet), [TuckER](https://github.com/ibalazevic/TuckER). 
If we have missed any, please let us know in [Issues](https://github.com/THU-KEG/EAkit/issues).\n\nThe main contributors to this project are [Chengjiang Li](https://github.com/iamlockelightning), [Kaisheng Zeng](https://github.com/alpc43), [Lei Hou](https://github.com/HLGreener), and [Juanzi Li](http://keg.cs.tsinghua.edu.cn/persons/ljz/).\n\n## Citation\n\nIf you use the code, please cite the following [paper](https://www.sciencedirect.com/science/article/pii/S2666651021000036):\n\n```\n@article{zeng2021comprehensive,\n  title={A comprehensive survey of entity alignment for knowledge graphs},\n  author={Zeng, Kaisheng and Li, Chengjiang and Hou, Lei and Li, Juanzi and Feng, Ling},\n  journal={AI Open},\n  volume={2},\n  pages={1--13},\n  year={2021},\n  publisher={Elsevier}\n}\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthu-keg%2Feakit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthu-keg%2Feakit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthu-keg%2Feakit/lists"}