{"id":13754306,"url":"https://github.com/zjunlp/deepke","last_synced_at":"2025-05-12T13:20:36.417Z","repository":{"id":37276339,"uuid":"143090423","full_name":"zjunlp/DeepKE","owner":"zjunlp","description":"[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction","archived":false,"fork":false,"pushed_at":"2025-04-22T07:35:10.000Z","size":127075,"stargazers_count":3871,"open_issues_count":0,"forks_count":716,"subscribers_count":45,"default_branch":"main","last_synced_at":"2025-04-23T17:13:27.663Z","etag":null,"topics":["attribute-extraction","chinese","deep-learning","deepke","document-level","few-shot","information-extraction","instructie","kg","knowledge-graph","knowprompt","lightner","low-resource","multi-modal","named-entity-recognition","ner","nlp","prompt","pytorch","relation-extraction"],"latest_commit_sha":null,"homepage":"http://deepke.zjukg.cn/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zjunlp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-08-01T01:54:52.000Z","updated_at":"2025-04-23T17:09:17.000Z","dependencies_parsed_at":"2023-02-19T05:01:26.428Z","dependency_job_id":"b0930023-9d66-4e24-8f4a-a185fe003148","html_url":"https://github.com/zjunlp/DeepKE","commit_stats":{"total_commits":940,"total_committers":20,"mean_commits":47.0,"dds":0.7925531914893618,"last_synced_commit":"018ab988735153c8dee50d20cc6f0b3fe7c31d99"},"previous_names":[],"tags_count":14,"template":false,"template_full_name":null,"reposi
tory_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FDeepKE","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FDeepKE/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FDeepKE/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FDeepKE/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zjunlp","download_url":"https://codeload.github.com/zjunlp/DeepKE/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253745197,"owners_count":21957320,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["attribute-extraction","chinese","deep-learning","deepke","document-level","few-shot","information-extraction","instructie","kg","knowledge-graph","knowprompt","lightner","low-resource","multi-modal","named-entity-recognition","ner","nlp","prompt","pytorch","relation-extraction"],"created_at":"2024-08-03T09:01:53.884Z","updated_at":"2025-05-12T13:20:36.391Z","avatar_url":"https://github.com/zjunlp.png","language":"Python","readme":"\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://github.com/zjunlp/deepke\"\u003e \u003cimg src=\"pics/logo.png\" width=\"400\"/\u003e\u003c/a\u003e\n\u003cp\u003e\n\u003cp align=\"center\"\u003e  \n    \u003ca href=\"http://deepke.zjukg.cn\"\u003e\n        \u003cimg alt=\"Documentation\" src=\"https://img.shields.io/badge/demo-website-blue\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://pypi.org/project/deepke/#files\"\u003e\n      
  \u003cimg alt=\"PyPI\" src=\"https://img.shields.io/pypi/v/deepke\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/zjunlp/DeepKE/blob/master/LICENSE\"\u003e\n        \u003cimg alt=\"GitHub\" src=\"https://img.shields.io/github/license/zjunlp/deepke\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"http://zjunlp.github.io/DeepKE\"\u003e\n        \u003cimg alt=\"Documentation\" src=\"https://img.shields.io/badge/doc-website-red\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://colab.research.google.com/drive/1vS8YJhJltzw3hpJczPt24O0Azcs3ZpRi?usp=sharing\"\u003e\n        \u003cimg alt=\"Open In Colab\" src=\"https://colab.research.google.com/assets/colab-badge.svg\"\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\n\n\u003cp align=\"center\"\u003e\n    \u003cb\u003e English | \u003ca href=\"https://github.com/zjunlp/DeepKE/blob/main/README_CN.md\"\u003e简体中文\u003c/a\u003e \u003c/b\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003e\n    \u003cp\u003eA Deep Learning Based Knowledge Extraction Toolkit\u003cbr\u003efor Knowledge Graph Construction\u003c/p\u003e\n\u003c/h1\u003e\n\n\n[DeepKE](https://arxiv.org/pdf/2201.03335.pdf) is a knowledge extraction toolkit for knowledge graph construction supporting **cnSchema**，**low-resource**, **document-level** and **multimodal** scenarios for *entity*, *relation* and *attribute* extraction. We provide [documents](https://zjunlp.github.io/DeepKE/), [online demo](http://deepke.zjukg.cn/), [paper](https://arxiv.org/pdf/2201.03335.pdf), [slides](https://drive.google.com/file/d/1IIeIZAbVduemqXc4zD40FUMoPHCJinLy/view?usp=sharing) and [poster](https://drive.google.com/file/d/1vd7xVHlWzoAxivN4T5qKrcqIGDcSM1_7/view?usp=sharing) for beginners.\n\n- ❗Want to use **Large Language Models** with DeepKE? Try [DeepKE-LLM](https://github.com/zjunlp/DeepKE/tree/main/example/llm) and [OneKE](https://github.com/zjunlp/DeepKE/blob/main/example/llm/OneKE.md), have fun!\n- ❗Want to train supervised models? 
Try [Quick Start](#quick-start). We provide NER models (e.g., [LightNER(COLING'22)](https://github.com/zjunlp/DeepKE/tree/main/example/ner/few-shot), [W2NER(AAAI'22)](https://github.com/zjunlp/DeepKE/tree/main/example/ner/standard/w2ner)), relation extraction models (e.g., [KnowPrompt(WWW'22)](https://github.com/zjunlp/DeepKE/tree/main/example/re/few-shot)), and relational triple extraction models (e.g., [ASP(EMNLP'22)](https://github.com/zjunlp/DeepKE/tree/main/example/triple/ASP), [PRGC(ACL'21)](https://github.com/zjunlp/DeepKE/tree/main/example/triple/PRGC), [PURE(NAACL'21)](https://github.com/zjunlp/DeepKE/tree/main/example/triple/PURE)), and release off-the-shelf models at [DeepKE-cnSchema](https://github.com/zjunlp/DeepKE/tree/main/example/triple/cnschema). Have fun!\n- We recommend using Linux; if using Windows, please use `\\\\` in file paths.\n- If HuggingFace is inaccessible, please consider using `wisemodel` or `modelscope`.\n\n**If you encounter any issues during the installation of DeepKE and DeepKE-LLM, please check [Tips](https://github.com/zjunlp/DeepKE#tips) or promptly submit an [issue](https://github.com/zjunlp/DeepKE/issues), and we will assist you with resolving the problem!**\n\n\n# Table of Contents\n\n- [Table of Contents](#table-of-contents)\n- [What's New](#whats-new)\n- [Prediction Demo](#prediction-demo)\n- [Model Framework](#model-framework)\n- [Quick Start](#quick-start)\n  - [DeepKE-LLM](#deepke-llm)\n  - [DeepKE](#deepke)\n      - [🔧Manual Environment Configuration](#manual-environment-configuration)\n      - [🐳Building With Docker Images](#building-with-docker-images)\n  - [Requirements](#requirements)\n    - [DeepKE](#deepke-1)\n  - [Introduction of Three Functions](#introduction-of-three-functions)\n    - [1. Named Entity Recognition](#1-named-entity-recognition)\n    - [2. Relation Extraction](#2-relation-extraction)\n    - [3. Attribute Extraction](#3-attribute-extraction)\n    - [4. 
Event Extraction](#4-event-extraction)\n- [Tips](#tips)\n- [To do](#to-do)\n- [Reading Materials](#reading-materials)\n- [Related Toolkit](#related-toolkit)\n- [Citation](#citation)\n- [Contributors](#contributors)\n- [Other Knowledge Extraction Open-Source Projects](#other-knowledge-extraction-open-source-projects)\n\n\u003cbr\u003e\n\n# What's New\n* `December, 2024` We open source the [OneKE](https://github.com/zjunlp/OneKE/tree/main) knowledge extraction framework, supporting multi-agent knowledge extraction across various scenarios.\n* `April, 2024` We release a new bilingual (Chinese and English) schema-based information extraction model called [OneKE](https://huggingface.co/zjunlp/OneKE) based on Chinese-Alpaca-2-13B.\n* `Feb, 2024` We release a large-scale (0.32B tokens) high-quality bilingual (Chinese and English) Information Extraction (IE) instruction dataset named [IEPile](https://huggingface.co/datasets/zjunlp/iepie), along with two models trained with `IEPile`, [baichuan2-13b-iepile-lora](https://huggingface.co/zjunlp/baichuan2-13b-iepile-lora) and [llama2-13b-iepile-lora](https://huggingface.co/zjunlp/llama2-13b-iepile-lora).\n* `Sep 2023` a bilingual Chinese English Information Extraction (IE) instruction dataset called  `InstructIE` was released for the Instruction based Knowledge Graph Construction Task (Instruction based KGC), as detailed in [here](./example/llm/README.md/#data).\n* `June, 2023` We update [DeepKE-LLM](https://github.com/zjunlp/DeepKE/tree/main/example/llm) to support **knowledge extraction** with [KnowLM](https://github.com/zjunlp/KnowLM), [ChatGLM](https://github.com/THUDM/ChatGLM-6B), LLaMA-series, GPT-series etc.\n* `Apr, 2023` We have added new models, including [CP-NER(IJCAI'23)](https://github.com/zjunlp/DeepKE/blob/main/example/ner/cross), [ASP(EMNLP'22)](https://github.com/zjunlp/DeepKE/tree/main/example/triple/ASP), [PRGC(ACL'21)](https://github.com/zjunlp/DeepKE/tree/main/example/triple/PRGC), 
[PURE(NAACL'21)](https://github.com/zjunlp/DeepKE/tree/main/example/triple/PURE), provided [event extraction](https://github.com/zjunlp/DeepKE/tree/main/example/ee/standard) capabilities (Chinese and English), and offered compatibility with higher versions of Python packages (e.g., Transformers).\n* `Feb, 2023` We have supported using [LLM](https://github.com/zjunlp/DeepKE/tree/main/example/llm) (GPT-3) with in-context learning (based on [EasyInstruct](https://github.com/zjunlp/EasyInstruct)) \u0026 data generation, added a NER model [W2NER(AAAI'22)](https://github.com/zjunlp/DeepKE/tree/main/example/ner/standard/w2ner).\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003ePrevious News\u003c/b\u003e\u003c/summary\u003e\n\n* `Nov, 2022` Add data [annotation instructions](https://github.com/zjunlp/DeepKE/blob/main/README_TAG.md) for entity recognition and relation extraction, automatic labelling of weakly supervised data ([entity extraction](https://github.com/zjunlp/DeepKE/tree/main/example/ner/prepare-data) and [relation extraction](https://github.com/zjunlp/DeepKE/tree/main/example/re/prepare-data)), and optimize [multi-GPU training](https://github.com/zjunlp/DeepKE/tree/main/example/re/standard).\n  \n* `Sept, 2022` The paper [DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population](https://arxiv.org/abs/2201.03335) has been accepted by the EMNLP 2022 System Demonstration Track.\n\n* `Aug, 2022` We have added [data augmentation](https://github.com/zjunlp/DeepKE/tree/main/example/re/few-shot/DA) (Chinese, English) support for [low-resource relation extraction](https://github.com/zjunlp/DeepKE/tree/main/example/re/few-shot).\n\n* `June, 2022` We have added multimodal support for [entity](https://github.com/zjunlp/DeepKE/tree/main/example/ner/multimodal) and [relation extraction](https://github.com/zjunlp/DeepKE/tree/main/example/re/multimodal).\n\n* `May, 2022` We have released 
[DeepKE-cnschema](https://github.com/zjunlp/DeepKE/blob/main/README_CNSCHEMA.md) with off-the-shelf knowledge extraction models.\n\n* `Jan, 2022` We have released a paper [DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population](https://arxiv.org/abs/2201.03335)\n\n* `Dec, 2021` We have added `dockerfile` to create the environment automatically. \n\n* `Nov, 2021` The demo of DeepKE, supporting real-time extraction without deploying and training, has been released.\n* The documentation of DeepKE, containing the details of DeepKE such as source code and datasets, has been released.\n\n* `Oct, 2021` `pip install deepke`\n* The code of deepke-v2.0 has been released.\n\n* `Aug, 2019` The code of deepke-v1.0 has been released.\n\n* `Aug, 2018` The DeepKE project was started and the code of deepke-v0.1 was released.\n  \n\n\u003c/details\u003e\n\n# Prediction Demo\n\nBelow is a demonstration of prediction. The GIF file is created by [Terminalizer](https://github.com/faressoft/terminalizer). Get the [code](https://drive.google.com/file/d/1r4tWfAkpvynH3CBSgd-XG79rf-pB-KR3/view?usp=share_link).\n\u003cimg src=\"pics/demo.gif\" width=\"636\" height=\"494\" align=center\u003e\n\n\u003cbr\u003e\n\n# Model Framework\n\n\u003ch3 align=\"center\"\u003e\n    \u003cimg src=\"pics/architectures.png\"\u003e\n\u003c/h3\u003e\n\n\n- DeepKE contains a unified framework for the three knowledge extraction functions: **named entity recognition**, **relation extraction** and **attribute extraction**.\n- Each task can be implemented in different scenarios. For example, we can achieve relation extraction in **standard**, **low-resource (few-shot)**, **document-level** and **multimodal** settings.\n- Each application scenario comprises three components: **Data** including Tokenizer, Preprocessor and Loader, **Model** including Module, Encoder and Forwarder, **Core** including Training, Evaluation and Prediction. 
\n\n\u003cbr\u003e\n\n# Quick Start\n\n## DeepKE-LLM\n\nIn the era of large models, DeepKE-LLM uses a completely separate set of environment dependencies.\n\n```\nconda create -n deepke-llm python=3.9\nconda activate deepke-llm\n\ncd example/llm\npip install -r requirements.txt\n```\n\nPlease note that the `requirements.txt` file is located in the `example/llm` folder.\n\n## DeepKE\n- *DeepKE* supports `pip install deepke`. \u003cbr\u003eThe steps below take fully supervised relation extraction as an example.\n- *DeepKE* supports both **manual** and **docker image** environment configuration; choose whichever suits you.\n- We highly recommend installing DeepKE in a Linux environment.\n#### 🔧Manual Environment Configuration\n\n**Step1** Download the basic code\n\n```bash\ngit clone --depth 1 https://github.com/zjunlp/DeepKE.git\n```\n\n**Step2** Create a virtual environment using `Anaconda` and enter it.\u003cbr\u003e\n\n```bash\nconda create -n deepke python=3.8\n\nconda activate deepke\n```\n\n1. Install *DeepKE* with source code\n\n   ```bash\n   pip install -r requirements.txt\n   \n   python setup.py install\n   \n   python setup.py develop\n   ```\n\n2. Install *DeepKE* with `pip` (**NOT recommended!**)\n\n   ```bash\n   pip install deepke\n   ```\n   - Please make sure that pip version \u003c= 24.0\n\n**Step3** Enter the task directory\n\n```bash\ncd DeepKE/example/re/standard\n```\n\n**Step4** Download the dataset, or follow the [annotation instructions](https://github.com/zjunlp/DeepKE/blob/main/README_TAG.md) to obtain data\n\n```bash\nwget 121.41.117.246:8080/Data/re/standard/data.tar.gz\n\ntar -xzvf data.tar.gz\n```\n\nMany data formats are supported; details are given in each task's section. 
\n\n**Step5** Training (Parameters for training can be changed in the `conf` folder)\n\nWe support visual parameter tuning by using *[wandb](https://docs.wandb.ai/quickstart)*.\n\n```bash\npython run.py\n```\n\n**Step6** Prediction (Parameters for prediction can be changed in the `conf` folder)\n\nModify the path of the trained model in `predict.yaml`. The absolute path of the model must be used, such as `xxx/checkpoints/2019-12-03_ 17-35-30/cnn_ epoch21.pth`.\n\n```bash\npython predict.py\n```\n\n - **❗NOTE: if you encounter any errors, please refer to the [Tips](#tips) or submit a GitHub issue.**\n\n\n\n#### 🐳Building With Docker Images\n**Step1** Install the Docker client\n\nInstall Docker and start the Docker service.\n\n**Step2** Pull the docker image and run the container\n\n```bash\ndocker pull zjunlp/deepke:latest\ndocker run -it zjunlp/deepke:latest /bin/bash\n```\n\nThe remaining steps are the same as **Step 3 and onwards** in **Manual Environment Configuration**.\n\n - **❗NOTE: You can refer to the [Tips](#tips) to speed up installation**\n\n## Requirements\n\n\n### DeepKE\n\u003e python == 3.8\n\n- torch\u003e=1.5,\u003c=1.11\n- hydra-core==1.0.6\n- tensorboard==2.4.1\n- matplotlib==3.4.1\n- transformers==4.26.0\n- jieba==0.42.1\n- scikit-learn==0.24.1\n- seqeval==1.2.2\n- opt-einsum==3.3.0\n- wandb==0.12.7\n- ujson==5.6.0\n- huggingface_hub==0.11.0\n- tensorboardX==2.5.1\n- nltk==3.8\n- protobuf==3.20.1\n- numpy==1.21.0\n- ipdb==0.13.11\n- pytorch-crf==0.7.2\n- tqdm==4.66.1\n- openai==0.28.0\n- Jinja2==3.1.2\n- datasets==2.13.2\n- pyhocon==0.3.60\n\n\u003cbr\u003e\n\n## Introduction of Three Functions\n\n### 1. Named Entity Recognition\n\n- Named entity recognition seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, etc.\n\n- The data is stored in `.txt` files. 
Some instances as following (Users can label data based on the tools [Doccano](https://github.com/doccano/doccano), [MarkTool](https://github.com/FXLP/MarkTool), or they can use the [Weak Supervision](https://github.com/zjunlp/DeepKE/blob/main/example/ner/prepare-data) with DeepKE to obtain data automatically):\n\n  |                           Sentence                           |           Person           |    Location    |          Organization          |\n  | :----------------------------------------------------------: | :------------------------: | :------------: | :----------------------------: |\n  | 本报北京9月4日讯记者杨涌报道：部分省区人民日报宣传发行工作座谈会9月3日在4日在京举行。 |            杨涌            |      北京      |            人民日报            |\n  | 《红楼梦》由王扶林导演，周汝昌、王蒙、周岭等多位专家参与制作。 | 王扶林，周汝昌，王蒙，周岭 |            |  |\n  | 秦始皇兵马俑位于陕西省西安市,是世界八大奇迹之一。 |           秦始皇           | 陕西省，西安市 |                          |\n\n- Read the detailed process in specific README\n  - **[STANDARD (Fully Supervised)](https://github.com/zjunlp/DeepKE/tree/main/example/ner/standard)**\n    \n    ***We [support LLM](https://github.com/zjunlp/DeepKE/tree/main/example/llm) and provide the off-the-shelf model, [DeepKE-cnSchema-NER](https://github.com/zjunlp/DeepKE/blob/main/README_CNSCHEMA_CN.md), which will extract entities in cnSchema without training.***\n\n    **Step1** Enter  `DeepKE/example/ner/standard`.  Download the dataset.\n\n    ```bash\n    wget 121.41.117.246:8080/Data/ner/standard/data.tar.gz\n    \n    tar -xzvf data.tar.gz\n    ```\n\n    **Step2** Training\u003cbr\u003e\n\n    The dataset and parameters can be customized in the `data` folder and `conf` folder respectively.\n  \n    ```bash\n    python run.py\n    ```\n\n    **Step3** Prediction\n\n    ```bash\n    python predict.py\n    ```\n  \n  - **[FEW-SHOT](https://github.com/zjunlp/DeepKE/tree/main/example/ner/few-shot)**\n\n    **Step1** Enter  `DeepKE/example/ner/few-shot`.  
Download the dataset.\n\n    ```bash\n    wget 121.41.117.246:8080/Data/ner/few_shot/data.tar.gz\n    \n    tar -xzvf data.tar.gz\n    ```\n  \n    **Step2** Training in the low-resource setting \u003cbr\u003e\n  \n    The directory where the model is loaded and saved and the configuration parameters can be customized in the `conf` folder.\n  \n    ```bash\n    python run.py +train=few_shot\n    ```\n    \n    Users can modify `load_path` in `conf/train/few_shot.yaml` to load an existing model.\u003cbr\u003e\n    \n    **Step3** Add `- predict` to `conf/config.yaml`, set `load_path` to the model path and `write_path` to the path where the predicted results are saved in `conf/predict.yaml`, and then run `python predict.py`\n    \n    ```bash\n    python predict.py\n    ```\n\n  - **[MULTIMODAL](https://github.com/zjunlp/DeepKE/tree/main/example/ner/multimodal)**\n\n    **Step1** Enter  `DeepKE/example/ner/multimodal`.  Download the dataset.\n\n    ```bash\n    wget 121.41.117.246:8080/Data/ner/multimodal/data.tar.gz\n    \n    tar -xzvf data.tar.gz\n    ```\n\n    We use RCNN detected objects and visual grounding objects from original images as visual local information, where RCNN detection uses [faster_rcnn](https://github.com/pytorch/vision/blob/main/torchvision/models/detection/faster_rcnn.py) and visual grounding uses [onestage_grounding](https://github.com/zyang-ur/onestage_grounding).\n\n    **Step2** Training in the multimodal setting \u003cbr\u003e\n\n    - The dataset and parameters can be customized in the `data` folder and `conf` folder respectively.\n    - To resume from a previously trained model, set `load_path` in `conf/train.yaml` to the path where that model was saved. The directory for logs generated during training can be set with `log_dir`.\n\n    ```bash\n    python run.py\n    ```\n\n    **Step3** Prediction\n\n    ```bash\n    python predict.py\n    ```\n\n### 2. 
Relation Extraction\n\n- Relation extraction is the task of extracting semantic relations between entities from unstructured text.\n\n- The data is stored in `.csv` files. Some instances are as follows (Users can label data based on the tools [Doccano](https://github.com/doccano/doccano), [MarkTool](https://github.com/FXLP/MarkTool), or they can use the [Weak Supervision](https://github.com/zjunlp/DeepKE/blob/main/example/re/prepare-data) with DeepKE to obtain data automatically). In these examples, `Head_offset` and `Tail_offset` are 0-based character positions of the entity within the sentence:\n\n  |                        Sentence                        | Relation |    Head    | Head_offset |    Tail    | Tail_offset |\n  | :----------------------------------------------------: | :------: | :--------: | :---------: | :--------: | :---------: |\n  | 《岳父也是爹》是王军执导的电视剧，由马恩然、范明主演。 |   导演   | 岳父也是爹 |      1      |    王军    |      8      |\n  |  《九玄珠》是在纵横中文网连载的一部小说，作者是龙马。  | 连载网站 |   九玄珠   |      1      | 纵横中文网 |      7      |\n  |     提起杭州的美景，西湖总是第一个映入脑海的词语。     | 所在城市 |    西湖    |      8      |    杭州    |      2      |\n\n- **!NOTE: If there are multiple entity types for one relation, entity types can be prefixed with the relation as inputs.**\n- Read the detailed process in specific README\n\n  - **[STANDARD (Fully Supervised)](https://github.com/zjunlp/DeepKE/tree/main/example/re/standard)** \n\n    ***We [support LLM](https://github.com/zjunlp/DeepKE/tree/main/example/llm) and provide the off-the-shelf model, [DeepKE-cnSchema-RE](https://github.com/zjunlp/DeepKE/blob/main/README_CNSCHEMA_CN.md), which will extract relations in cnSchema without training.***\n\n    **Step1** Enter the `DeepKE/example/re/standard` folder.  
Download the dataset.\n\n    ```bash\n    wget 121.41.117.246:8080/Data/re/standard/data.tar.gz\n    \n    tar -xzvf data.tar.gz\n    ```\n\n    **Step2** Training\u003cbr\u003e\n\n    The dataset and parameters can be customized in the `data` folder and `conf` folder respectively.\n  \n    ```bash\n    python run.py\n    ```\n\n    **Step3** Prediction\n\n    ```bash\n    python predict.py\n    ```\n  \n  - **[FEW-SHOT](https://github.com/zjunlp/DeepKE/tree/main/example/re/few-shot)**\n\n    **Step1** Enter `DeepKE/example/re/few-shot`. Download the dataset.\n\n    ```bash\n    wget 121.41.117.246:8080/Data/re/few_shot/data.tar.gz\n    \n    tar -xzvf data.tar.gz\n    ```\n\n    **Step2** Training\u003cbr\u003e\n\n    - The dataset and parameters can be customized in the `data` folder and `conf` folder respectively.\n    - To resume from a previously trained model, set `train_from_saved_model` in `conf/train.yaml` to the path where that model was saved. The directory for logs generated during training can be set with `log_dir`. \n  \n    ```bash\n    python run.py\n    ```\n  \n    **Step3** Prediction\n  \n    ```bash\n    python predict.py\n    ```\n  \n  - **[DOCUMENT](https://github.com/zjunlp/DeepKE/tree/main/example/re/document)**\u003cbr\u003e\n  \n    **Step1** Enter `DeepKE/example/re/document`.  Download the dataset.\n  \n    ```bash\n    wget 121.41.117.246:8080/Data/re/document/data.tar.gz\n    \n    tar -xzvf data.tar.gz\n    ```\n    \n    **Step2** Training\u003cbr\u003e\n  \n    - The dataset and parameters can be customized in the `data` folder and `conf` folder respectively.\n    - To resume from a previously trained model, set `train_from_saved_model` in `conf/train.yaml` to the path where that model was saved. The directory for logs generated during training can be set with `log_dir`. 
\n    \n    ```bash\n    python run.py\n    ```\n    \n    **Step3** Prediction\n    \n    ```bash\n    python predict.py\n    ```\n\n  - **[MULTIMODAL](https://github.com/zjunlp/DeepKE/tree/main/example/re/multimodal)**\n\n    **Step1** Enter  `DeepKE/example/re/multimodal`.  Download the dataset.\n\n    ```bash\n    wget 121.41.117.246:8080/Data/re/multimodal/data.tar.gz\n    \n    tar -xzvf data.tar.gz\n    ```\n\n    We use RCNN detected objects and visual grounding objects from original images as visual local information, where RCNN detection uses [faster_rcnn](https://github.com/pytorch/vision/blob/main/torchvision/models/detection/faster_rcnn.py) and visual grounding uses [onestage_grounding](https://github.com/zyang-ur/onestage_grounding).\n\n    **Step2** Training\u003cbr\u003e\n\n    - The dataset and parameters can be customized in the `data` folder and `conf` folder respectively.\n    - To resume from a previously trained model, set `load_path` in `conf/train.yaml` to the path where that model was saved. The directory for logs generated during training can be set with `log_dir`.\n\n    ```bash\n    python run.py\n    ```\n\n    **Step3** Prediction\n\n    ```bash\n    python predict.py\n    ```\n\n### 3. Attribute Extraction\n\n- Attribute extraction is the task of extracting attributes of entities from unstructured text.\n\n- The data is stored in `.csv` files. 
Some instances as following:\n\n  |                           Sentence                           |   Att    |   Ent    | Ent_offset |      Val      | Val_offset |\n  | :----------------------------------------------------------: | :------: | :------: | :--------: | :-----------: | :--------: |\n  |          张冬梅，女，汉族，1968年2月生，河南淇县人           |   民族   |  张冬梅  |     0      |     汉族      |     6      |\n  |诸葛亮，字孔明，三国时期杰出的军事家、文学家、发明家。|   朝代   |   诸葛亮   |     0      |     三国时期      |     8     |\n  |        2014年10月1日许鞍华执导的电影《黄金时代》上映         | 上映时间 | 黄金时代 |     19     | 2014年10月1日 |     0      |\n\n- Read the detailed process in specific README\n  - **[STANDARD (Fully Supervised)](https://github.com/zjunlp/DeepKE/tree/main/example/ae/standard)**\n\n    **Step1** Enter the `DeepKE/example/ae/standard` folder. Download the dataset.\n\n    ```bash\n    wget 121.41.117.246:8080/Data/ae/standard/data.tar.gz\n    \n    tar -xzvf data.tar.gz\n    ```\n\n    **Step2** Training\u003cbr\u003e\n\n    The dataset and parameters can be customized in the `data` folder and `conf` folder respectively.\n    \n    ```bash\n    python run.py\n    ```\n    \n    **Step3** Prediction\n    \n    ```bash\n    python predict.py\n    ```\n\n\u003cbr\u003e\n\n### 4. 
Event Extraction\n\n* Event extraction is the task to extract event type, event trigger words, event arguments from a unstructed text.\n* The data is stored in `.tsv` files, some instances are as follows:\n\n\u003ctable h style=\"text-align:center\"\u003e\n    \u003ctr\u003e\n        \u003cth colspan=\"2\"\u003e Sentence \u003c/th\u003e\n        \u003cth\u003e Event type \u003c/th\u003e\n        \u003cth\u003e Trigger \u003c/th\u003e\n        \u003cth\u003e Role \u003c/th\u003e\n        \u003cth\u003e Argument \u003c/th\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e \n        \u003ctd rowspan=\"3\" colspan=\"2\"\u003e 据《欧洲时报》报道，当地时间27日，法国巴黎卢浮宫博物馆员工因不满工作条件恶化而罢工，导致该博物馆也因此闭门谢客一天。 \u003c/td\u003e\n      \t\u003ctd rowspan=\"3\"\u003e 组织行为-罢工 \u003c/td\u003e\n    \t\t\u003ctd rowspan=\"3\"\u003e 罢工 \u003c/td\u003e\n    \t\t\u003ctd\u003e 罢工人员 \u003c/td\u003e\n    \t\t\u003ctd\u003e 法国巴黎卢浮宫博物馆员工 \u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e \n        \u003ctd\u003e 时间 \u003c/td\u003e\n        \u003ctd\u003e 当地时间27日 \u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e \n        \u003ctd\u003e 所属组织 \u003c/td\u003e\n        \u003ctd\u003e 法国巴黎卢浮宫博物馆 \u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e \n        \u003ctd rowspan=\"3\" colspan=\"2\"\u003e 中国外运2019年上半年归母净利润增长17%：收购了少数股东股权 \u003c/td\u003e\n      \t\u003ctd rowspan=\"3\"\u003e 财经/交易-出售/收购 \u003c/td\u003e\n    \t\t\u003ctd rowspan=\"3\"\u003e 收购 \u003c/td\u003e\n    \t\t\u003ctd\u003e 出售方 \u003c/td\u003e\n    \t\t\u003ctd\u003e 少数股东 \u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e \n        \u003ctd\u003e 收购方 \u003c/td\u003e\n        \u003ctd\u003e 中国外运 \u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e \n        \u003ctd\u003e 交易物 \u003c/td\u003e\n        \u003ctd\u003e 股权 \u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e \n        \u003ctd rowspan=\"3\" colspan=\"2\"\u003e 美国亚特兰大航展13日发生一起表演机坠机事故，飞行员弹射出舱并安全着陆，事故没有造成人员伤亡。 \u003c/td\u003e\n      \t\u003ctd 
rowspan=\"3\"\u003e 灾害/意外-坠机 \u003c/td\u003e\n    \t\t\u003ctd rowspan=\"3\"\u003e 坠机 \u003c/td\u003e\n    \t\t\u003ctd\u003e 时间 \u003c/td\u003e\n    \t\t\u003ctd\u003e 13日 \u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e \n        \u003ctd\u003e 地点 \u003c/td\u003e\n        \u003ctd\u003e 美国亚特兰 \u003c/td\u003e\n  \t\u003c/tr\u003e\n\u003c/table\u003e\n\n* Read the detailed process in specific README\n\n  * [STANDARD(Fully Supervised)](./example/ee/standard/README.md)\n\n    **Step1** Enter the `DeepKE/example/ee/standard` folder. Download the dataset.\n\n    ```bash\n    wget 121.41.117.246:8080/Data/ee/DuEE.zip\n    unzip DuEE.zip\n    ```\n\n    **Step 2** Training\n\n    The dataset and parameters can be customized in the `data` folder and `conf` folder respectively.\n\n    ```bash\n    python run.py\n    ```\n\n    **Step 3** Prediction\n\n    ```bash\n    python predict.py\n    ```\n\n\u003cbr\u003e\n\n# Tips\n\n1.```Using nearest mirror```, **[THU](https://mirrors.tuna.tsinghua.edu.cn/help/anaconda/) in China, will speed up the installation of *Anaconda*; [aliyun](http://mirrors.aliyun.com/pypi/simple/) in China, will speed up `pip install XXX`**.\n\n2.When encountering `ModuleNotFoundError: No module named 'past'`，run `pip install future` .\n\n3.It's slow to install the pretrained language models online. Recommend download pretrained models before use and save them in the `pretrained` folder. Read `README.md` in every task directory to check the specific requirement for saving pretrained models.\n\n4.The old version of *DeepKE* is in the [deepke-v1.0](https://github.com/zjunlp/DeepKE/tree/deepke-v1.0) branch. Users can change the branch to use the old version. The old version has been totally transfered to the standard relation extraction ([example/re/standard](https://github.com/zjunlp/DeepKE/blob/main/example/re/standard/README.md)).\n\n5.If you want to modify the source code, it's recommended to install *DeepKE* with source codes. 
If not, the modification will not work. See [issue](https://github.com/zjunlp/DeepKE/issues/117)\n\n6.More work on low-resource knowledge extraction can be found in [Knowledge Extraction in Low-Resource Scenarios: Survey and Perspective](https://arxiv.org/pdf/2202.08063.pdf).\n\n7.Make sure the installed packages match the exact versions listed in `requirements.txt`.\n\n# To do\nIn the next version, we plan to release a stronger LLM for KE. \n\nMeanwhile, we will offer long-term maintenance to **fix bugs**, **solve issues** and meet **new requests**. If you run into any problems, please open an issue.\n\n# Reading Materials\n\nData-Efficient Knowledge Graph Construction, 高效知识图谱构建 ([Tutorial on CCKS 2022](http://sigkg.cn/ccks2022/?page_id=24)) \\[[slides](https://drive.google.com/drive/folders/1xqeREw3dSiw-Y1rxLDx77r0hGUvHnuuE)\\] \n\nEfficient and Robust Knowledge Graph Construction ([Tutorial on AACL-IJCNLP 2022](https://www.aacl2022.org/Program/tutorials)) \\[[slides](https://github.com/NLP-Tutorials/AACL-IJCNLP2022-KGC-Tutorial)\\] \n\nPromptKG Family: a Gallery of Prompt Learning \u0026 KG-related Research Works, Toolkits, and Paper-list \\[[Resources](https://github.com/zjunlp/PromptKG)\\] \n\nKnowledge Extraction in Low-Resource Scenarios: Survey and Perspective \\[[Survey](https://arxiv.org/abs/2202.08063)\\]\\[[Paper-list](https://github.com/zjunlp/Low-resource-KEPapers)\\]\n\n\n# Related Toolkit\n\n[Doccano](https://github.com/doccano/doccano), [MarkTool](https://github.com/FXLP/MarkTool), [LabelStudio](https://labelstud.io/): Data Annotation Toolkits\n\n[LambdaKG](https://github.com/zjunlp/PromptKG/tree/main/lambdaKG): A library and benchmark for PLM-based KG embeddings\n\n[EasyInstruct](https://github.com/zjunlp/EasyInstruct): An easy-to-use framework to instruct Large Language Models\n\n# Citation\n\nPlease cite our paper if you use DeepKE in your work\n\n```bibtex\n@inproceedings{EMNLP2022_Demo_DeepKE,\n  author    = {Ningyu Zhang and\n               Xin Xu and\n               Liankuan Tao and\n               Haiyang Yu and\n               Hongbin Ye and\n               Shuofei Qiao and\n               Xin Xie and\n               Xiang Chen and\n               Zhoubo Li and\n               Lei Li},\n  editor    = {Wanxiang Che and\n               Ekaterina Shutova},\n  title     = {DeepKE: {A} Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population},\n  booktitle = {{EMNLP} (Demos)},\n  pages     = {98--108},\n  publisher = {Association for Computational Linguistics},\n  year      = {2022},\n  url       = {https://aclanthology.org/2022.emnlp-demos.10}\n}\n```\n\u003cbr\u003e\n\n# Contributors\n\n[Ningyu Zhang](https://person.zju.edu.cn/en/ningyu), [Haofen 
Wang](https://tjdi.tongji.edu.cn/TeacherDetail.do?id=4991\u0026lang=_en), Fei Huang, Feiyu Xiong, Liankuan Tao, Xin Xu, Honghao Gui, Zhenru Zhang, Chuanqi Tan, Qiang Chen, Xiaohan Wang, Zekun Xi, Xinrong Li, Haiyang Yu, Hongbin Ye, Shuofei Qiao, Peng Wang, Yuqi Zhu, Xin Xie, Xiang Chen, Zhoubo Li, Lei Li, Xiaozhuan Liang, Yunzhi Yao, Jing Chen, Shumin Deng, Wen Zhang, Guozhou Zheng, Huajun Chen\n\nCommunity Contributors: [thredreams](https://github.com/thredreams), [eltociear](https://github.com/eltociear), Ziwen Xu, Rui Huang, Xiaolong Weng\n\n# Other Knowledge Extraction Open-Source Projects\n\n- [CogIE](https://github.com/jinzhuoran/CogIE)\n- [OpenNRE](https://github.com/thunlp/OpenNRE)\n- [OmniEvent](https://github.com/THU-KEG/OmniEvent)\n- [OpenUE](https://github.com/zjunlp/OpenUE)\n- [OpenIE](https://stanfordnlp.github.io/CoreNLP/openie.html)\n- [RESIN](https://github.com/RESIN-KAIROS/RESIN-pipeline-public)\n- [ZShot](https://github.com/IBM/zshot)\n- [ZS4IE](https://github.com/BBN-E/ZS4IE)\n","funding_links":[],"categories":["知识图谱"],"sub_categories":["其他_文本生成、文本对话"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzjunlp%2Fdeepke","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzjunlp%2Fdeepke","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzjunlp%2Fdeepke/lists"}