{"id":19467484,"url":"https://github.com/thunlp/jointnre","last_synced_at":"2025-05-12T03:32:03.328Z","repository":{"id":89616743,"uuid":"111351969","full_name":"thunlp/JointNRE","owner":"thunlp","description":"Joint Neural Relation Extraction with Text and KGs","archived":false,"fork":false,"pushed_at":"2022-11-03T06:52:50.000Z","size":265,"stargazers_count":187,"open_issues_count":1,"forks_count":35,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-03-31T23:51:08.550Z","etag":null,"topics":["relation-extraction"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thunlp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2017-11-20T02:25:47.000Z","updated_at":"2025-02-01T17:10:05.000Z","dependencies_parsed_at":"2024-01-27T18:22:38.711Z","dependency_job_id":"5ca8db64-0a52-4fba-98de-bebdaa7216c4","html_url":"https://github.com/thunlp/JointNRE","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FJointNRE","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FJointNRE/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FJointNRE/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FJointNRE/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/thunlp","download_url":"https://codeload.github.com/thunlp/JointNRE/tar.gz/refs/heads/master","host":{"name":"
GitHub","url":"https://github.com","kind":"github","repositories_count":253668087,"owners_count":21944977,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["relation-extraction"],"created_at":"2024-11-10T18:35:24.599Z","updated_at":"2025-05-12T03:32:02.985Z","avatar_url":"https://github.com/thunlp.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# JointNRE\n\nThis repository is a subproject of THU-OpenSK. All subprojects of THU-OpenSK are listed below.\n\n- [OpenNE](https://www.github.com/thunlp/OpenNE)\n- [OpenKE](https://www.github.com/thunlp/OpenKE)\n  - [KB2E](https://www.github.com/thunlp/KB2E)\n  - [TensorFlow-Transx](https://www.github.com/thunlp/TensorFlow-Transx)\n  - [Fast-TransX](https://www.github.com/thunlp/Fast-TransX)\n- [OpenNRE](https://www.github.com/thunlp/OpenNRE)\n  - [JointNRE](https://www.github.com/thunlp/JointNRE)\n\nCode and datasets for our paper \"Neural Knowledge Acquisition via Mutual Attention between Knowledge Graph and Text\".\n\n\nIntroduction\n===\n\nThis implementation is a fast and stable version. 
\n\nWe have simplified the original model so that training a joint model takes only around 15 minutes.\n\nWe have also added more neural architectures to the framework for encoding sentences.\n\nThe code and datasets are mainly intended for the relation extraction task.\n\nData\n==========\n\nWe provide the datasets used for the relation extraction task.\n\nNew York Times Corpus: The data used for relation extraction from text was published with \"Modeling relations and their mentions without labeled text\". The data should first be obtained from [[LDC]](https://catalog.ldc.upenn.edu/LDC2008T19).\n\nDatasets must be placed in the folder data/ in the following format, with at least 4 files:\n\n+ kg/train.txt: the knowledge graph for training, format (e1, e2, rel).\n\n+ text/relation2id.txt: the relations to be predicted for RE, format (rel, id).\n\n+ text/train.txt: the text for training, format (e1, e2, name1, name2, rel, sentence).\n\n+ text/vec.txt: the initial word embeddings.\n\n+ [[Download (Baidu Cloud)]](https://pan.baidu.com/s/1q7rctsoJ_YdlLa55yckwbQ)\n+ [[Download (Tsinghua Cloud)]](https://cloud.tsinghua.edu.cn/f/28ba8ac5262349dd9622/?dl=1)\n\nFor FB15K-NYT, we directly provide the data for our code [[Download (Tsinghua Cloud)]](https://cloud.tsinghua.edu.cn/f/384836aacb1f4aee9fa3/?dl=1), as the LDC license does not allow us to release the original data.\n\nRun the experiments\n==========\n\n### To run the experiments, unpack the datasets first:\n\n```\nunzip origin_data.zip -d origin_data/\nmkdir data/\npython initial.py\n```\n\n### Run the corresponding Python scripts to train models:\n\n```\ncd jointE\nbash make.sh\npython train.py\n```\n\n### Change the corresponding Python code to set hyperparameters:\n\n```\ntf.app.flags.DEFINE_float('nbatch_kg',100,'entity numbers used each training time')\ntf.app.flags.DEFINE_float('margin',1.0,'entity numbers used each training time')\ntf.app.flags.DEFINE_float('learning_rate_kg',0.001,'learning rate for 
kg')\ntf.app.flags.DEFINE_float('ent_total',lib.getEntityTotal(),'total of entities')\ntf.app.flags.DEFINE_float('rel_total',lib.getRelationTotal(),'total of relations')\ntf.app.flags.DEFINE_float('tri_total',lib.getTripleTotal(),'total of triples')\ntf.app.flags.DEFINE_float('katt_flag', 1, '1 for katt, 0 for att')\n\ntf.app.flags.DEFINE_string('model', 'cnn', 'neural models to encode sentences')\ntf.app.flags.DEFINE_float('max_length',config['fixlen'],'maximum of number of words in one sentence')\ntf.app.flags.DEFINE_float('pos_num', config['maxlen'] * 2 + 1,'number of position embedding vectors')\ntf.app.flags.DEFINE_float('num_classes', config['textual_rel_total'],'maximum of relations')\n\ntf.app.flags.DEFINE_float('hidden_size',230,'hidden feature size')\ntf.app.flags.DEFINE_float('pos_size',5,'position embedding size')\n\ntf.app.flags.DEFINE_float('max_epoch',20,'maximum of training epochs')\ntf.app.flags.DEFINE_float('batch_size',160,'entity numbers used each training time')\ntf.app.flags.DEFINE_float('learning_rate',0.5,'learning rate for nn')\ntf.app.flags.DEFINE_float('weight_decay',0.00001,'weight_decay')\ntf.app.flags.DEFINE_float('keep_prob',0.5,'dropout rate')\n\ntf.app.flags.DEFINE_string('model_dir','./model/','path to store model')\ntf.app.flags.DEFINE_string('summary_dir','./summary','path to store summary_dir')\n```\n\n### Run the corresponding python scripts to test models:\n\n```\ncd jointE\nbash make.sh\npython test.py\n```\n\nNote that the hyperparameters in the train.py and the test.py must be the same.\n\n### Run the corresponding python script to get PR-curve results:\n\n```\ncd jointE\npython pr_plot.py\n```\n\nCitation\n===\n\n```\n @inproceedings{han2018neural,\n   title={Neural Knowledge Acquisition via Mutual Attention between Knowledge Graph and Text},\n   author={Han, Xu and Liu, Zhiyuan and Sun, Maosong},\n   booktitle={Proceedings of AAAI},\n   year={2018}\n }\n```\n\n\n\n 
\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthunlp%2Fjointnre","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthunlp%2Fjointnre","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthunlp%2Fjointnre/lists"}