{"id":28676584,"url":"https://github.com/zjunlp/adakgc","last_synced_at":"2025-07-17T09:36:28.390Z","repository":{"id":178335292,"uuid":"586435525","full_name":"zjunlp/AdaKGC","owner":"zjunlp","description":"Code for the EMNLP2023 (Findings) paper \"Schema-adaptable Knowledge Graph Construction\"","archived":false,"fork":false,"pushed_at":"2024-01-28T16:57:42.000Z","size":222,"stargazers_count":14,"open_issues_count":0,"forks_count":2,"subscribers_count":6,"default_branch":"master","last_synced_at":"2024-01-28T17:40:41.395Z","etag":null,"topics":["adakgc","chatgpt","event-extraction","information-extraction","knowledge-graph","large-language-models","natural-language-processing","prefix","prompt-engineering","relational-triple-extraction","schema"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zjunlp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-01-08T05:43:52.000Z","updated_at":"2024-01-24T20:04:31.000Z","dependencies_parsed_at":null,"dependency_job_id":"ee80c215-11d3-485d-8d96-3ef612751050","html_url":"https://github.com/zjunlp/AdaKGC","commit_stats":null,"previous_names":["zjunlp/adakgc"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/zjunlp/AdaKGC","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FAdaKGC","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FAdaKGC/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FAdaKGC/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/repositories/zjunlp%2FAdaKGC/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zjunlp","download_url":"https://codeload.github.com/zjunlp/AdaKGC/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FAdaKGC/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259732770,"owners_count":22903087,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adakgc","chatgpt","event-extraction","information-extraction","knowledge-graph","large-language-models","natural-language-processing","prefix","prompt-engineering","relational-triple-extraction","schema"],"created_at":"2025-06-13T23:05:14.328Z","updated_at":"2025-06-13T23:05:15.964Z","avatar_url":"https://github.com/zjunlp.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\u003ch1 align=\"center\"\u003e 🎇AdaKGC \n\u003c/h1\u003e\n\u003cdiv align=\"center\"\u003e\n     \n   [![Awesome](https://awesome.re/badge.svg)]() \n   [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)\n   ![](https://img.shields.io/github/last-commit/zjunlp/AdaKGC?color=green) \n   ![](https://img.shields.io/badge/PRs-Welcome-red) \n\u003c/div\u003e\n\n## *👋 News!*\n\n- Code for the paper [`Schema-adaptable Knowledge Graph Construction`](https://arxiv.org/abs/2305.08703).\n\n- Our work has been accepted to EMNLP 2023 Findings.\n\n\n## 🎉 Quick Links\n\n- [*👋 News!*](#-news)\n- [🎉 Quick Links](#-quick-links)\n- [🎈 Requirements](#-requirements)\n- [🪄 Model](#-model)\n- [🎏 Datasets](#-datasets)\n- [⚾ Run](#-run)\n- [🎰 Inference](#-inference)\n- [🏳‍🌈 
Acknowledgment](#-acknowledgment)\n- [🚩 Papers for the Project \\\u0026 How to Cite](#-papers-for-the-project--how-to-cite)\n\n## 🎈 Requirements\n\n\u003ca id=\"requirements\"\u003e\u003c/a\u003e\n\nTo run the code, install the following requirements:\n\n```bash\nconda create -n adakgc python=3.8\nconda activate adakgc\npip install torch==1.8.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html\npip install -r requirements.txt\n```\n\n## 🪄 Model\n\nThe tokenizer of our model comes from UIE and the remaining parts come from T5, so the checkpoint is a hybrid of the two. Download it from the link below and make sure you use this model: [hf_models/mix](https://drive.google.com/file/d/1CI66LlwTWI3qCUCh6InutmrcTxCRrFiK/view?usp=sharing)\n\n\n## 🎏 Datasets\n\n\u003ca id=\"datasets-of-extraction-tasks\"\u003e\u003c/a\u003e\n\nFor details on how the datasets were constructed, see [Data Construction](./dataset_construct/README.md).\n\nThe datasets are available at the following Google Drive links.\n\nDatasets: [ACE05](https://drive.google.com/file/d/14ESd_mjx8PG6E7ls3bxWYuNiPhYWBqlJ/view?usp=sharing), [Few-NERD](https://drive.google.com/file/d/1K6ZZoJj_FofdqZSLgE6mlHHS3bLWM90Z/view?usp=sharing), [NYT](https://drive.google.com/file/d/1_x8efbnt5ljaAtUIlqi3T_AVT3nZqoKT/view?usp=sharing)\n\n## ⚾ Run\n\n\u003ca id=\"how-to-run\"\u003e\u003c/a\u003e\n\n```bash\nmkdir hf_models\ncd hf_models\ngit lfs install\ngit clone https://huggingface.co/google/t5-v1_1-base\ncd ..\n\nmkdir output           # */AdaKGC/output\n```\n\n+ ### Entity Recognition Task\n\n  \u003ca id=\"ner\"\u003e\u003c/a\u003e\n\n```bash\n# Current path:  */AdaKGC\nmode=H\ndata_name=Few-NERD\ntask=entity\ndevice=0\nratio=0.8\nbash scripts/fine_prompt.bash --model=hf_models/mix --data=data/${data_name}_${mode}/iter_1 --output=output/${data_name}_${mode}_${ratio} --config=config/prompt_conf/Few-NERD.ini --device=${device} --negative_ratio=${ratio} --record2=data/${data_name}_${mode}/iter_7/record.schema  --use_prompt=True --init_prompt=True\n```\n\n`model`: Name or path of the pretrained model.\n\n`data`: Path to the dataset.\n\n`output`: Path where the fine-tuned checkpoint is saved. The final output path is generated automatically, such as `AdaKGC/output/ace05_event_H_e30_lr1e-4_b14_n0`.\n\n`config`: Default configuration file, located in the `config/prompt_conf` directory. Each task has its own configuration.\n\n`mode`: Dataset mode (`H`, `V`, `M`, or `R`).\n\n`device`: 
CUDA_VISIBLE_DEVICES.\n\n`batch`: Batch size.\n\n(For the full list of command-line arguments, see the bash scripts and Python files.)\n\n+ ### Relation Extraction Task\n\n  \u003ca id=\"re\"\u003e\u003c/a\u003e\n\n```bash\nmode=H\ndata_name=NYT\ntask=relation\ndevice=0\nratio=0.8\nbash scripts/fine_prompt.bash --model=hf_models/mix --data=data/${data_name}_${mode}/iter_1 --output=output/${data_name}_${mode}_${ratio} --config=config/prompt_conf/NYT.ini --device=${device} --negative_ratio=${ratio} --record2=data/${data_name}_${mode}/iter_7/record.schema  --use_prompt=True --init_prompt=True\n```\n\n+ ### Event Extraction Task\n\n  \u003ca id=\"ee\"\u003e\u003c/a\u003e\n\n```bash\nmode=H\ndata_name=ace05_event\ntask=event\ndevice=0\nratio=0.8\nbash scripts/fine_prompt.bash --model=hf_models/mix --data=data/${data_name}_${mode}/iter_1 --output=output/${data_name}_${mode}_${ratio} --config=config/prompt_conf/ace05_event.ini --device=${device} --negative_ratio=${ratio} --record2=data/${data_name}_${mode}/iter_7/record.schema  --use_prompt=True --init_prompt=True\n```\n\n## 🎰 Inference\n\n\u003ca id=\"inference\"\u003e\u003c/a\u003e\n\n* Inference on a single dataset only (e.g. `data/ace05_event_H/iter_1`)\n\n```bash\nmode=H\ndata_name=ace05_event\ntask=event\ndevice=0\nratio=0.8\npython3 inference.py --dataname=data/${data_name}/${data_name}_${mode}/iter_2 --t5_path=hf_models/mix --model=output/${data_name}_${mode}_${ratio} --task=${task} --cuda=${device} --mode=${mode} --use_prompt --use_ssi --prompt_len=80 --prompt_dim=512\n```\n\n`dataname`: Path of the dataset to run prediction on (`ace05_event`, `NYT`, or `Few-NERD`).\n\n`model`: Path to the model produced by the training step above (the `output` of the training stage).\n\n`t5_path`: The base T5 model (the `model` of the training stage).\n\n`task`: Task type (`entity`, `relation`, or `event`).\n\n`cuda`: CUDA_VISIBLE_DEVICES.\n\n`mode`: Dataset mode (`H`, `V`, `M`, or `R`).\n\n`use_ssi`, `use_prompt`, `prompt_len`, and `prompt_dim` must match the values used during training. You can check and set them in the corresponding configuration file, e.g. `config/prompt_conf/ace05_event.ini`.\n\n\n* Automatic inference on all iteration datasets (i.e. `data/iter_1/ace05_event_H` through `data/iter_7/ace05_event_H`)\n\n```bash\nmode=H\ndata_name=ace05_event\ntask=event\ndevice=0\nratio=0.8\npython3 inference_mul.py 
--dataname=data/${data_name}/${data_name}_${mode} --t5_path=hf_models/mix --model=output/${data_name}_${mode}_${ratio} --task=${task} --cuda=${device} --mode=${mode} --use_prompt --use_ssi --prompt_len=80 --prompt_dim=512\n```\n\n`use_ssi`, `use_prompt`, `prompt_len`, and `prompt_dim` must match the values used during training.\n\nThe complete process, including fine-tuning and inference (in \"scripts/run.bash\"):\n\n```bash\nmode=H\ndata_name=ace05_event\ntask=event\ndevice=0\nratio=0.8\nbash scripts/run_prompt.bash --model=hf_models/mix --data=data/${data_name}_${mode}/iter_1 --output=output/${data_name}_${mode}_${ratio} --config=config/prompt_conf/ace05_event.ini --device=${device} --negative_ratio=${ratio} --record2=data/${data_name}_${mode}/iter_7/record.schema --use_prompt=True --init_prompt=True\npython3 inference_mul.py --dataname=data/${data_name}/${data_name}_${mode} --t5_path=hf_models/mix --model=output/${data_name}_${mode}_${ratio} --task=${task} --cuda=${device} --mode=${mode} --use_prompt --use_ssi --prompt_len=80 --prompt_dim=512\n```\n\n| Metric                | Definition                                                                                                   | F1                                          |\n| --------------------- | ------------------------------------------------------------------------------------------------------------ | ------------------------------------------- |\n| ent-(P/R/F1)          | Micro-F1 score for entities (Entity Type, Entity Span)                                                       | spot-F1                                     |\n| rel-strict-(P/R/F1)   | Micro-F1 score for relations in strict mode (Relation Type, Arg1 Span, Arg1 Type, Arg2 Span, Arg2 Type)      | asoc-F1 for relations, spot-F1 for entities |\n| evt-trigger-(P/R/F1)  | Micro-F1 score for event triggers (Event Type, Trigger Span)                                                 | spot-F1                                     |\n| evt-role-(P/R/F1)     | Micro-F1 score for event roles (Event Type, Arg Role, Arg Span)                                              | asoc-F1                                     |\n\noverall-F1 is the sum of spot-F1 and asoc-F1, so it may exceed 100.\n\n## 🏳‍🌈 Acknowledgment\n\n\u003ca id=\"acknowledgment\"\u003e\u003c/a\u003e\n\nPart of our code is borrowed from 
[UIE](https://github.com/universal-ie/UIE) and [UnifiedSKG](https://github.com/hkunlp/unifiedskg), many thanks.\n\n## 🚩 Papers for the Project \u0026 How to Cite\n\nIf you use or extend our work, please cite the paper as follows:\n\n```bibtex\n@article{DBLP:journals/corr/abs-2305-08703,\n  author       = {Hongbin Ye and\n                  Honghao Gui and\n                  Xin Xu and\n                  Huajun Chen and\n                  Ningyu Zhang},\n  title        = {Schema-adaptable Knowledge Graph Construction},\n  journal      = {CoRR},\n  volume       = {abs/2305.08703},\n  year         = {2023},\n  url          = {https://doi.org/10.48550/arXiv.2305.08703},\n  doi          = {10.48550/arXiv.2305.08703},\n  eprinttype    = {arXiv},\n  eprint       = {2305.08703},\n  timestamp    = {Wed, 17 May 2023 15:47:36 +0200},\n  biburl       = {https://dblp.org/rec/journals/corr/abs-2305-08703.bib},\n  bibsource    = {dblp computer science bibliography, https://dblp.org}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzjunlp%2Fadakgc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzjunlp%2Fadakgc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzjunlp%2Fadakgc/lists"}