https://github.com/ianycxu/GCN-with-BERT
Graph Convolutional Networks (GCN) with BERT for Coreference Resolution Task [Pytorch][DGL]
- Host: GitHub
- URL: https://github.com/ianycxu/GCN-with-BERT
- Owner: ianycxu
- Created: 2019-04-28T00:18:25.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2021-05-18T01:01:57.000Z (over 3 years ago)
- Last Synced: 2024-08-11T16:09:16.857Z (6 months ago)
- Topics: bert, bert-model, coreference-resolution, gcn, gnn, graph-convolutional-networks, graph-neural-networks, nlp, pytorch
- Language: Jupyter Notebook
- Homepage:
- Size: 959 KB
- Stars: 141
- Watchers: 1
- Forks: 27
- Open Issues: 3
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-bert - ianycxu/RGCN-with-BERT - Relational Graph Convolutional Networks (RGCN) with BERT for Coreference Resolution Task (BERT Coreference Resolution)
- awesome-gcn - ianycxu/RGCN-with-BERT
README
# Look Again at the Syntax: Relational Graph Convolutional Network for Gendered Ambiguous Pronoun Resolution
## Original Paper
https://www.aclweb.org/anthology/W19-3814/

## Introduction
We propose an end-to-end resolver that combines pre-trained BERT with a Relational Graph Convolutional Network (R-GCN). The R-GCN digests structural syntactic information and learns better task-specific embeddings. Empirical results demonstrate that, under explicit syntactic supervision and without the need to fine-tune BERT, the R-GCN's embeddings outperform the original BERT embeddings on the coreference task. Our work achieves state-of-the-art results on the GAP dataset and significantly improves the snippet-context baseline F1 score from 66.9% to 80.3%. We participated in the 2019 GAP Coreference Shared Task, and our code is available online. The overall architecture is shown below.
![](https://i.imgur.com/aAK43SM.png)

## Dataset
The dataset is Gendered Ambiguous Pronouns (GAP), a gender-balanced dataset containing 8908 coreference-labeled pairs sampled from Wikipedia. Each sample contains a short paragraph that mentions the names of potential subjects, which are later referred to by a target pronoun. It also provides two candidate names for the resolver to choose from. The columns are:

| Header | Description |
| :------------- | :----------: |
| ID | ID for this sample |
| Text | Text containing pronoun and two names |
| Pronoun | Target pronoun in text |
| Pronoun-offset | Character offset of the pronoun in text |
| A | Name A in text |
| A-offset | Character offset of A in text |
| A-coref | Whether A corefers with the pronoun |
| B | Name B in text |
| B-offset | Character offset of B in text |
| B-coref | Whether B corefers with the pronoun |

## Data Preprocessing
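
Before any parsing, it can help to confirm that the offsets described in the table above actually point at the strings they claim to. The sketch below assumes the offsets are 0-based character positions into `Text`. The sample row is invented for illustration and is not taken from GAP, and the helper names `mention_at` and `check_row` are hypothetical.

```python
# Hypothetical GAP-style row: the sentence, names, and labels are invented
# for illustration and do not come from the real dataset.
row = {
    "ID": "example-1",
    "Text": "Alice met Beth after the show, and she thanked her for coming.",
    "Pronoun": "she",
    "Pronoun-offset": 35,
    "A": "Alice",
    "A-offset": 0,
    "A-coref": True,
    "B": "Beth",
    "B-offset": 10,
    "B-coref": False,
}


def mention_at(text, offset, mention):
    """Return True if `mention` starts at character `offset` of `text`."""
    return text[offset:offset + len(mention)] == mention


def check_row(row):
    """Check that the pronoun, A, and B offsets all match the text."""
    return all(
        mention_at(row["Text"], row[f"{key}-offset"], row[key])
        for key in ("Pronoun", "A", "B")
    )


print(check_row(row))
```

A row that fails this check usually signals an encoding or whitespace change introduced while loading the file, which would also misalign the mentions with the parser's tokens.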
We use spaCy as our syntactic dependency parser. DGL is used to convert each dependency tree into a graph object. This DGL graph object can then be used as input to the GCN model, which is also implemented with DGL. Several graphs are grouped together into a larger DGL batch-graph object for batch training.
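
The data flow described above can be sketched without either library. This is a plain-Python illustration, not the repository's code: the `Graph` class and both functions are hypothetical, the two parses are written by hand rather than produced by spaCy, and the choice of one head-to-dependent edge per token labeled with its dependency relation is an assumption. Batching is shown as a disjoint union with shifted node indices, which is the idea behind DGL's `dgl.batch`.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Minimal stand-in for a graph object: edge lists plus a node count."""
    num_nodes: int
    src: list = field(default_factory=list)
    dst: list = field(default_factory=list)
    rel: list = field(default_factory=list)  # dependency label per edge


def tree_to_graph(heads, labels):
    """Turn a dependency parse into a graph with one node per token.

    heads[i] is the index of token i's head. The root points to itself,
    which matches how spaCy reports the head of the root token.
    """
    g = Graph(num_nodes=len(heads))
    for child, (head, label) in enumerate(zip(heads, labels)):
        if head == child:  # root token: no incoming edge
            continue
        g.src.append(head)
        g.dst.append(child)
        g.rel.append(label)
    return g


def batch_graphs(graphs):
    """Merge graphs into one disjoint graph by shifting node indices."""
    batched = Graph(num_nodes=0)
    for g in graphs:
        offset = batched.num_nodes
        batched.src += [s + offset for s in g.src]
        batched.dst += [d + offset for d in g.dst]
        batched.rel += g.rel
        batched.num_nodes += g.num_nodes
    return batched


# Hand-written parses for "She sleeps" and "Dogs chase cats".
g1 = tree_to_graph(heads=[1, 1], labels=["nsubj", "ROOT"])
g2 = tree_to_graph(heads=[1, 1, 1], labels=["nsubj", "ROOT", "dobj"])
batch = batch_graphs([g1, g2])
print(batch.num_nodes, list(zip(batch.src, batch.dst, batch.rel)))
```

Because the merged graph has no edges between its components, message passing over the batch gives the same per-sentence result as processing each graph separately, while allowing one forward pass per mini-batch.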