https://github.com/ianycxu/GCN-with-BERT

Graph Convolutional Networks (GCN) with BERT for Coreference Resolution Task [Pytorch][DGL]

bert bert-model coreference-resolution gcn gnn graph-convolutional-networks graph-neural-networks nlp pytorch

# Look Again at the Syntax: Relational Graph Convolutional Network for Gendered Ambiguous Pronoun Resolution

## Original Paper
https://www.aclweb.org/anthology/W19-3814/

## Introduction
We propose an end-to-end resolver that combines pre-trained BERT with a Relational Graph Convolutional Network (R-GCN). The R-GCN digests structural syntactic information and learns better task-specific embeddings. Empirical results demonstrate that, under explicit syntactic supervision and without the need to fine-tune BERT, R-GCN's embeddings outperform the original BERT embeddings on the coreference task. Our work obtains state-of-the-art results on the GAP dataset and significantly improves the snippet-context baseline F1 score from 66.9% to 80.3%. We participated in the 2019 GAP Coreference Shared Task, and our code is available online. The overall architecture is shown below.
![](https://i.imgur.com/aAK43SM.png)

## Dataset
The dataset is Gendered Ambiguous Pronouns (GAP), a gender-balanced dataset containing 8908 coreference-labeled pairs sampled from Wikipedia. Each sample contains a short paragraph in which the names of potential antecedents are mentioned and later referred to by a target pronoun. Each sample also provides two candidate names for the resolver to choose from. The columns are:

| Header | Description |
| :------------- | :----------: |
| ID | ID for this sample |
| Text | Text containing pronoun and two names |
| Pronoun | Target pronoun in text |
| Pronoun-offset | Character offset of the pronoun in the text |
| A | Name A in text |
| A-offset | Character offset of A in the text |
| A-coref | Whether A corefers with the pronoun |
| B | Name B in text |
| B-offset | Character offset of B in the text |
| B-coref | Whether B corefers with the pronoun |
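
As a rough illustration of how these columns fit together, the sketch below parses one row and checks that each offset points at the mention it describes. It assumes the data is tab-separated with the headers listed above and that the coreference labels are stored as `TRUE`/`FALSE` strings. The sample row is made up for this example and is not taken from GAP, and `load_samples` and `span_at` are hypothetical helper names.

```python
import csv
import io

# Made-up row in the column layout described above (not taken from GAP).
SAMPLE_TSV = (
    "ID\tText\tPronoun\tPronoun-offset\tA\tA-offset\tA-coref\tB\tB-offset\tB-coref\n"
    "example-1\tAlice met Betty before she left.\tshe\t23\tAlice\t0\tTRUE\tBetty\t10\tFALSE\n"
)


def span_at(text, offset, mention):
    """Return the substring of `text` starting at `offset` with the length of `mention`."""
    return text[offset:offset + len(mention)]


def load_samples(handle):
    """Parse tab-separated rows and convert offsets and labels to Python types."""
    samples = []
    # QUOTE_NONE keeps quotation marks inside the text from being treated as field quoting.
    for row in csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        for key in ("Pronoun-offset", "A-offset", "B-offset"):
            row[key] = int(row[key])
        for key in ("A-coref", "B-coref"):
            # Assumption: labels are the strings TRUE / FALSE.
            row[key] = row[key].strip().upper() == "TRUE"
        samples.append(row)
    return samples


samples = load_samples(io.StringIO(SAMPLE_TSV))
sample = samples[0]

# Each character offset should point at the start of the mention it belongs to.
assert span_at(sample["Text"], sample["Pronoun-offset"], sample["Pronoun"]) == sample["Pronoun"]
assert span_at(sample["Text"], sample["A-offset"], sample["A"]) == sample["A"]
assert span_at(sample["Text"], sample["B-offset"], sample["B"]) == sample["B"]
```

A check like this is a cheap way to catch offset errors before the offsets are mapped to token positions.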

## Data Preprocessing

We use SpaCy as our syntactic dependency parser. DGL is used to convert each dependency tree into a graph object. This DGL graph object can then be used as the input to the GCN model, which is also implemented with DGL. For batch training, several graphs are grouped together into one larger DGL batch-graph object.
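
The conversion and batching steps can be sketched without any library, under stated assumptions. Here each sentence is given as a list of head indices, as a parser would report them (in spaCy the root token is its own head). The three relation types (head to dependent, dependent to head, self-loop) and the helper names `tree_to_edges` and `batch_edges` are choices made for this illustration, not necessarily what the repository does.

```python
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (source node, destination node, relation id)

# Hypothetical relation ids, chosen for this sketch only.
HEAD_TO_DEP, DEP_TO_HEAD, SELF_LOOP = 0, 1, 2


def tree_to_edges(heads: List[int]) -> List[Edge]:
    """Turn per-token head indices into typed, directed edges.

    `heads[i]` is the index of token i's syntactic head; the root points
    to itself, which is how spaCy reports the head of the root token.
    """
    edges = []
    for dep, head in enumerate(heads):
        edges.append((dep, dep, SELF_LOOP))
        if head != dep:  # the root has no real parent
            edges.append((head, dep, HEAD_TO_DEP))
            edges.append((dep, head, DEP_TO_HEAD))
    return edges


def batch_edges(trees: List[List[int]]) -> Tuple[List[Edge], List[int]]:
    """Merge several trees into one disjoint graph by shifting node ids."""
    merged, sizes, offset = [], [], 0
    for heads in trees:
        merged.extend((s + offset, d + offset, r) for s, d, r in tree_to_edges(heads))
        sizes.append(len(heads))
        offset += len(heads)
    return merged, sizes


# "She left early": "left" (index 1) is the root; the other tokens attach to it.
heads_a = [1, 1, 1]
# "Alice smiled": "smiled" (index 1) is the root.
heads_b = [1, 1]

edges, sizes = batch_edges([heads_a, heads_b])
src = [s for s, _, _ in edges]
dst = [d for _, d, _ in edges]
rel = [r for _, _, r in edges]
```

The `src` and `dst` lists are in the source/destination form that graph libraries typically accept, and `rel` would be stored as an edge feature for the relational layers. In recent DGL versions, `dgl.graph((src, dst))` and `dgl.batch` cover the construction and merging steps; the manual id shifting above only shows what batching into one disjoint graph amounts to.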