https://github.com/zjunlp/docunet

[IJCAI 2021] Document-level Relation Extraction as Semantic Segmentation
https://github.com/zjunlp/docunet


Host: GitHub
URL: https://github.com/zjunlp/docunet
Owner: zjunlp
License: mit
Created: 2021-05-07T06:03:52.000Z (about 5 years ago)
Default Branch: main
Last Pushed: 2022-12-06T19:39:55.000Z (over 3 years ago)
Last Synced: 2025-06-13T23:05:18.142Z (about 1 year ago)
Topics: docred, document, document-level, document-level-relation-extraction, docunet, information-extraction, pytorch, pytorch-implementation, re, relation-extraction, segmentation, semantic-segmentation
Language: Python
Homepage:
Size: 260 KB
Stars: 145
Watchers: 4
Forks: 22
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# DocuNet

This repository is the official implementation of [**DocuNet**](https://github.com/zjunlp/DocRE/), the model proposed in the paper **[Document-level Relation Extraction as Semantic Segmentation](https://www.ijcai.org/proceedings/2021/551)**, accepted to the **IJCAI 2021** main conference.

- ❗NOTE: DocuNet is integrated into the knowledge extraction toolkit [DeepKE](https://github.com/zjunlp/DeepKE).

# Brief Introduction

The paper proposes DocuNet, which it presents as the first model to treat document-level relation extraction as a semantic segmentation task, borrowing the formulation from computer vision.
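
To make the analogy concrete, here is a toy sketch in plain Python. It is not the code in this repository: the entity vectors, the dot-product cell feature, and the fixed 3x3 mean filter are illustrative assumptions standing in for learned entity representations and a learned segmentation network. It only shows the data layout, an N x N grid of entity pairs handled like image pixels, in which each cell's output depends on its neighboring pairs.

```python
from typing import List

Matrix = List[List[float]]


def pair_grid(entities: List[List[float]]) -> Matrix:
    """Build an N x N 'image': cell (i, j) scores head entity i against tail entity j."""
    return [[sum(a * b for a, b in zip(h, t)) for t in entities] for h in entities]


def mean_filter_3x3(grid: Matrix) -> Matrix:
    """Average each cell with its in-bounds neighbors (a fixed stand-in for a learned convolution)."""
    n = len(grid)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            window = [
                grid[r][c]
                for r in range(max(0, i - 1), min(n, i + 2))
                for c in range(max(0, j - 1), min(n, j + 2))
            ]
            out[i][j] = sum(window) / len(window)
    return out


# Hypothetical 2-d embeddings for three entities in one document.
entities = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
grid = pair_grid(entities)        # one cell per (head, tail) pair
smoothed = mean_filter_3x3(grid)  # each cell now mixes in its neighboring pairs
```

In a real model, a per-cell classifier would then assign relation labels to each entity pair, much as a segmentation network labels each pixel.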



# Requirements

To install requirements:

```bash
pip install -r requirements.txt
```

# Training

To train the DocuNet model from the paper on the [DocRED](https://github.com/thunlp/DocRED) dataset, run:

```bash
bash scripts/run_docred.sh  # use BERT/RoBERTa by setting --transformer-type
```

To train the DocuNet model from the paper on the CDR and GDA datasets, run the matching script:

```bash
bash scripts/run_cdr.sh  # for CDR
bash scripts/run_gda.sh  # for GDA
```

# Evaluation

To evaluate a trained model, set the `--load_path` argument in the training scripts. The program logs the evaluation results automatically. For DocRED, it also generates a test file, `result.json`, in the official evaluation format. You can compress this file and submit it to CodaLab for the official test score.
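
The compression step can be scripted. The following is a minimal sketch using only the Python standard library; the helper name `compress_result` and the archive name `result.zip` are assumptions for illustration, so check the submission instructions of the official leaderboard for the layout it expects.

```python
import zipfile
from pathlib import Path


def compress_result(result_path: str = "result.json",
                    archive_path: str = "result.zip") -> Path:
    """Zip the prediction file so that it sits at the top level of the archive."""
    src = Path(result_path)
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found; run evaluation with --load_path first")
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # arcname drops any directory prefix, leaving only the bare file name.
        zf.write(src, arcname=src.name)
    return Path(archive_path)
```

Calling `compress_result()` from the directory that contains `result.json` writes `result.zip` next to it.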

# Results

Our model achieves the following performance:

## Document-level Relation Extraction on [DocRED](https://github.com/thunlp/DocRED)

| Model | Ign F1 on Dev | F1 on Dev | Ign F1 on Test | F1 on Test |
| :----------------: | :----------------: | :----------------: | :----------------: | :----------------: |
| DocuNet-BERT (base) | 59.86±0.13 | 61.83±0.19 | 59.93 | 61.86 |
| DocuNet-RoBERTa (large) | 62.23±0.12 | 64.12±0.14 | 62.39 | 64.55 |
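
For readers unfamiliar with the two metrics, here is a minimal sketch of how F1 and Ign F1 relate. It assumes the usual DocRED convention, in which Ign F1 discounts correct predictions whose fact also occurs in the training set; the function name and the triple representation are hypothetical, and the official evaluation script remains the reference. The function returns fractions, whereas the table reports percentages.

```python
from typing import Set, Tuple

Fact = Tuple[str, str, str]  # (head entity, relation, tail entity)


def f1_scores(pred: Set[Fact], gold: Set[Fact], train_facts: Set[Fact]) -> Tuple[float, float]:
    """Return (F1, Ign F1) for one set of predicted facts."""
    correct = pred & gold
    seen = correct & train_facts  # correct predictions already present in training data
    precision = len(correct) / len(pred) if pred else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    # Ign precision removes the seen facts from both numerator and denominator.
    ign_den = len(pred) - len(seen)
    ign_precision = (len(correct) - len(seen)) / ign_den if ign_den else 0.0

    def harmonic(p: float, r: float) -> float:
        return 2 * p * r / (p + r) if p + r else 0.0

    return harmonic(precision, recall), harmonic(ign_precision, recall)
```

Because seen facts are discounted only from precision, Ign F1 is never higher than F1 under this convention, which matches the pattern in the table.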

## Document-level Relation Extraction on [CDR and GDA](https://github.com/fenchri/edge-oriented-graph)

| Model | CDR | GDA |
| :----------------: | :----------------: | :----------------: |
| DocuNet-SciBERT (base) | 76.3±0.40 | 85.3±0.50 |

# Acknowledgement

Part of our code is borrowed from [https://github.com/wzhouad/ATLOP](https://github.com/wzhouad/ATLOP), many thanks.

For the detailed preprocessing of the GDA and CDR datasets, which produces the files `train_filter.data`, `dev_filter.data`, and `test_filter.data`, refer to [https://github.com/fenchri/edge-oriented-graph](https://github.com/fenchri/edge-oriented-graph).

# Papers for the Project & How to Cite

If you use or extend our work, please cite the paper as follows:

```
@inproceedings{ijcai2021-551,
  title     = {Document-level Relation Extraction as Semantic Segmentation},
  author    = {Zhang, Ningyu and Chen, Xiang and Xie, Xin and Deng, Shumin and Tan, Chuanqi and Chen, Mosha and Huang, Fei and Si, Luo and Chen, Huajun},
  booktitle = {Proceedings of the Thirtieth International Joint Conference on
               Artificial Intelligence, {IJCAI-21}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Zhi-Hua Zhou},
  pages     = {3999--4006},
  year      = {2021},
  month     = {8},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2021/551},
  url       = {https://doi.org/10.24963/ijcai.2021/551},
}
```
