{"id":18832937,"url":"https://github.com/declare-lab/relationprompt","last_synced_at":"2025-04-14T04:31:46.685Z","repository":{"id":38394726,"uuid":"433641930","full_name":"declare-lab/RelationPrompt","owner":"declare-lab","description":"This repository implements our ACL Findings 2022 research paper RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction. The goal of Zero-Shot Relation Triplet Extraction (ZeroRTE) is to extract relation triplets of the format (head entity, tail entity, relation), despite not having annotated data for the test relation labels.","archived":false,"fork":false,"pushed_at":"2023-06-22T11:34:49.000Z","size":68,"stargazers_count":131,"open_issues_count":11,"forks_count":16,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-27T18:21:35.740Z","etag":null,"topics":["data-augumentation","generative-models","relation-extraction"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/declare-lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-12-01T01:23:01.000Z","updated_at":"2025-03-18T13:47:01.000Z","dependencies_parsed_at":"2022-07-14T03:20:42.643Z","dependency_job_id":null,"html_url":"https://github.com/declare-lab/RelationPrompt","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/declare-lab%2FRelationPrompt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/declare-lab%2FRelationPrompt/tags","releases_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/declare-lab%2FRelationPrompt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/declare-lab%2FRelationPrompt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/declare-lab","download_url":"https://codeload.github.com/declare-lab/RelationPrompt/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248821742,"owners_count":21166948,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-augumentation","generative-models","relation-extraction"],"created_at":"2024-11-08T01:59:34.261Z","updated_at":"2025-04-14T04:31:46.665Z","avatar_url":"https://github.com/declare-lab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction\n\n[![PWC](https://img.shields.io/badge/PapersWithCode-Benchmark-%232cafb1)](https://paperswithcode.com/paper/relationprompt-leveraging-prompts-to-generate)\n[![Colab](https://img.shields.io/badge/Colab-Code%20Demo-%23fe9f00)](https://colab.research.google.com/drive/18lrKD30kxEUolQ61o5nzUJM0rvWgpbFK?usp=sharing)\n[![Jupyter](https://img.shields.io/badge/Jupyter-Notebook%20Demo-important)](https://github.com/declare-lab/RelationPrompt/blob/main/demo.ipynb)\n\nThis repository implements our ACL Findings 2022 research paper [RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet 
Extraction](https://aclanthology.org/2022.findings-acl.5/).\nThe goal of Zero-Shot Relation Triplet Extraction (ZeroRTE) is to extract relation triplets of the format `(head entity, tail entity, relation)`, despite not having annotated data for the test relation labels.\n\n![diagram](https://github.com/declare-lab/RelationPrompt/releases/download/v1.0.0/diagram.png)\n\n### Installation\n\n- Python 3.7\n- If your GPU uses CUDA 11, first install the matching PyTorch build: `pip install torch==1.10.0 --extra-index-url https://download.pytorch.org/whl/cu113`\n- Install requirements: `pip install -r requirements.txt` or `conda env create --file environment.yml`\n- Download and extract the [datasets here](https://github.com/declare-lab/RelationPrompt/releases/download/v1.0.0/zero_rte_data.zip) to `outputs/data/splits/zero_rte`\n- [FewRel Pretrained Model](https://github.com/declare-lab/RelationPrompt/releases/download/v1.0.0/model_fewrel_unseen_10_seed_0.tar) (unseen=10, seed=0)\n- [Wiki-ZSL Pretrained Model](https://github.com/declare-lab/RelationPrompt/releases/download/v1.0.0/model_wiki_unseen_10_seed_0.tar) (unseen=10, seed=0)\n\n\n### Data Exploration | [![Colab](https://img.shields.io/badge/Colab-Code%20Demo-%23fe9f00)](https://colab.research.google.com/drive/18lrKD30kxEUolQ61o5nzUJM0rvWgpbFK#scrollTo=vw3NlKDddMIP\u0026line=2\u0026uniqifier=1)\n\n```\nfrom wrapper import Dataset\n\ndata = Dataset.load(path)\nfor s in data.sents:\n    print(s.tokens)\n    for t in s.triplets:\n        print(t.head, t.tail, t.label)\n```\n\n### Generate with Pretrained Model | [![Colab](https://img.shields.io/badge/Colab-Code%20Demo-%23fe9f00)](https://colab.research.google.com/drive/18lrKD30kxEUolQ61o5nzUJM0rvWgpbFK#scrollTo=tUFis82oGUAS\u0026line=1\u0026uniqifier=1)\n\n```\nfrom wrapper import Generator\n\nmodel = Generator(load_dir=\"gpt2\", save_dir=\"outputs/wrapper/fewrel/unseen_10_seed_0/generator\")\nmodel.generate(labels=[\"location\", \"religion\"], 
path_out=\"synthetic.jsonl\")\n```\n\n### Extract with Pretrained Model | [![Colab](https://img.shields.io/badge/Colab-Code%20Demo-%23fe9f00)](https://colab.research.google.com/drive/18lrKD30kxEUolQ61o5nzUJM0rvWgpbFK#scrollTo=eGxP3vVmID9W\u0026line=1\u0026uniqifier=1)\n\n```\nfrom wrapper import Extractor\n\nmodel = Extractor(load_dir=\"facebook/bart-base\", save_dir=\"outputs/wrapper/fewrel/unseen_10_seed_0/extractor_final\")\nmodel.predict(path_in=path_test, path_out=\"pred.jsonl\")\n```\n\n### Model Training | [![Colab](https://img.shields.io/badge/Colab-Code%20Demo-%23fe9f00)](https://colab.research.google.com/drive/18lrKD30kxEUolQ61o5nzUJM0rvWgpbFK#scrollTo=qi5PAW5ocjfj\u0026line=1\u0026uniqifier=1)\n\nTrain the Generator and Extractor models:\n```\nfrom pathlib import Path\nfrom wrapper import Generator, Extractor\n\ngenerator = Generator(\n    load_dir=\"gpt2\",\n    save_dir=str(Path(save_dir) / \"generator\"),\n)\nextractor = Extractor(\n    load_dir=\"facebook/bart-base\",\n    save_dir=str(Path(save_dir) / \"extractor\"),\n)\ngenerator.fit(path_train, path_dev)\nextractor.fit(path_train, path_dev)\n```\n\nGenerate synthetic data with relation triplets for test labels:\n```\ngenerator.generate(labels_test, path_out=path_synthetic)\n```\n\nTrain the final Extractor model using the synthetic data and predict on test sentences:\n```\nextractor_final = Extractor(\n    load_dir=str(Path(save_dir) / \"extractor\" / \"model\"),\n    save_dir=str(Path(save_dir) / \"extractor_final\"),\n)\nextractor_final.fit(path_synthetic, path_dev)\nextractor_final.predict(path_in=path_test, path_out=path_pred)\n```\n\n### Experiment Scripts\n\nRun training in [wrapper.py](https://github.com/declare-lab/RelationPrompt/blob/783f33c301813368a5a6e3bdbbe50c47df7647bf/wrapper.py#L370) (you can change \"fewrel\" to \"wiki\", unseen to 5/10/15, or seed to 0/1/2/3/4):\n```\npython wrapper.py main \\\n--path_train outputs/data/splits/zero_rte/fewrel/unseen_10_seed_0/train.jsonl \\
\n--path_dev outputs/data/splits/zero_rte/fewrel/unseen_10_seed_0/dev.jsonl \\\n--path_test outputs/data/splits/zero_rte/fewrel/unseen_10_seed_0/test.jsonl \\\n--save_dir outputs/wrapper/fewrel/unseen_10_seed_0\n```\n\nRun evaluation (single-triplet setting):\n```\npython wrapper.py run_eval \\\n--path_model outputs/wrapper/fewrel/unseen_10_seed_0/extractor_final \\\n--path_test outputs/data/splits/zero_rte/fewrel/unseen_10_seed_0/test.jsonl \\\n--mode single\n```\n\nRun evaluation (multi-triplet setting):\n```\npython wrapper.py run_eval \\\n--path_model outputs/wrapper/fewrel/unseen_10_seed_0/extractor_final \\\n--path_test outputs/data/splits/zero_rte/fewrel/unseen_10_seed_0/test.jsonl \\\n--mode multi\n```\n\n### Research Citation\nIf the code is useful for your research project, we would appreciate it if you cite the following [paper](https://aclanthology.org/2022.findings-acl.5/):\n```\n@inproceedings{chia-etal-2022-relationprompt,\n    title = \"{R}elation{P}rompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction\",\n    author = \"Chia, Yew Ken  and\n      Bing, Lidong  and\n      Poria, Soujanya  and\n      Si, Luo\",\n    booktitle = \"Findings of the Association for Computational Linguistics: ACL 2022\",\n    month = may,\n    year = \"2022\",\n    address = \"Dublin, Ireland\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://aclanthology.org/2022.findings-acl.5\",\n    doi = \"10.18653/v1/2022.findings-acl.5\",\n    pages = \"45--57\",\n    abstract = \"Despite the importance of relation 
extraction in building and representing knowledge, less research is focused on generalizing to unseen relations types. We introduce the task setting of Zero-Shot Relation Triplet Extraction (ZeroRTE) to encourage further research in low-resource relation extraction methods. Given an input sentence, each extracted triplet consists of the head entity, relation label, and tail entity where the relation label is not seen at the training stage. To solve ZeroRTE, we propose to synthesize relation examples by prompting language models to generate structured texts. Concretely, we unify language model prompts and structured text approaches to design a structured prompt template for generating synthetic relation samples when conditioning on relation label prompts (RelationPrompt). To overcome the limitation for extracting multiple relation triplets in a sentence, we design a novel Triplet Search Decoding method. Experiments on FewRel and Wiki-ZSL datasets show the efficacy of RelationPrompt for the ZeroRTE task and zero-shot relation classification. Our code and data are available at github.com/declare-lab/RelationPrompt.\",\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeclare-lab%2Frelationprompt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeclare-lab%2Frelationprompt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeclare-lab%2Frelationprompt/lists"}