https://github.com/phoenix-ji/pytorch-tfpner
TFPNER: Exploration on the Named Entity Recognition of Token Fused with Part-of-Speech
attention-mechanism bert-fine-tuning bert-model cased-characters embedding-layer model-ensemble named-entity-recognition part-of-speech
- Host: GitHub
- URL: https://github.com/phoenix-ji/pytorch-tfpner
- Owner: Phoenix-JI
- Created: 2022-02-05T14:26:58.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2022-03-07T05:43:06.000Z (about 3 years ago)
- Last Synced: 2025-01-14T06:46:14.842Z (4 months ago)
- Topics: attention-mechanism, bert-fine-tuning, bert-model, cased-characters, embedding-layer, model-ensemble, named-entity-recognition, part-of-speech
- Language: Jupyter Notebook
- Homepage:
- Size: 7.86 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# TFPNER
## TFPNER: Exploration on the Named Entity Recognition of Token Fused with Part-of-Speech
Named entity recognition (NER), which aims at identifying real-world entity mentions from texts, is a fundamental task in natural language processing with a wide range of applications.
Previous approaches mainly focus on the original sentence alone, but part-of-speech (POS) tags contain rich semantic information and contribute to the success of natural language processing tasks.
The baseline is a BERT model that takes only the original tokens as input. To further improve NER performance, we propose five BERT-based methods that fuse POS tags with the original tokens: concatenating tokens and POS tags as one sentence, concatenating them as two sentences, adding a POS embedding as one of the embedding elements, model ensembling, and applying multi-attention between the token representations and the POS representations.
In this work, we use the CoNLL-2003 and Groningen Meaning Bank (GMB) datasets, both of which provide NER tags and POS tags. In our experiments on these two datasets, some of the proposed methods improve on the baseline.
## Model
These are the models we built to achieve higher performance than the baseline.
### Token + POS
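This README does not spell out the exact input layout for this variant. The following is a minimal sketch of one possible reading, assuming each word is immediately followed by its POS tag in a single sequence and that the inserted POS positions receive a placeholder label that is skipped when computing the loss. The names `fuse_token_pos` and `IGNORE_LABEL` are hypothetical and not taken from the repository.

```python
from typing import List, Tuple

# Hypothetical placeholder label for the inserted POS positions; a training
# loop would typically skip these positions when computing the loss.
IGNORE_LABEL = "X"


def fuse_token_pos(
    tokens: List[str], pos_tags: List[str], ner_tags: List[str]
) -> Tuple[List[str], List[str]]:
    """Interleave each word with its POS tag to form a single sentence."""
    if not (len(tokens) == len(pos_tags) == len(ner_tags)):
        raise ValueError("tokens, pos_tags and ner_tags must have equal length")
    fused: List[str] = []
    labels: List[str] = []
    for token, pos, ner in zip(tokens, pos_tags, ner_tags):
        fused.extend([token, pos])          # word followed by its POS tag
        labels.extend([ner, IGNORE_LABEL])  # only the word keeps its NER tag
    return fused, labels


# CoNLL-2003 style example sentence.
fused, labels = fuse_token_pos(
    ["EU", "rejects", "German", "call"],
    ["NNP", "VBZ", "JJ", "NN"],
    ["B-ORG", "O", "B-MISC", "O"],
)
print(" ".join(fused))
```

The fused word list can then be passed to a BERT tokenizer like any other pre-split sentence. Aligning labels to word pieces is a separate step that this sketch does not cover.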

### Token + [SEP] +POS
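A minimal sketch of the two-sentence layout, assuming the words form BERT's first segment and the POS tags form the second, separated by `[SEP]`. It works at word level for clarity, whereas a real tokenizer would split words into word pieces. The function name `build_pair_input` and the `label_mask` field are hypothetical.

```python
from typing import Dict, List

CLS, SEP = "[CLS]", "[SEP]"


def build_pair_input(tokens: List[str], pos_tags: List[str]) -> Dict[str, List]:
    """Build '[CLS] tokens [SEP] pos_tags [SEP]' with BERT-style segment ids."""
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have equal length")
    sequence = [CLS] + tokens + [SEP] + pos_tags + [SEP]
    # Segment 0 covers [CLS], the words and the first [SEP];
    # segment 1 covers the POS tags and the final [SEP].
    segment_ids = [0] * (len(tokens) + 2) + [1] * (len(pos_tags) + 1)
    # Assumption: only the word positions are scored for NER.
    label_mask = [0] + [1] * len(tokens) + [0] * (len(pos_tags) + 2)
    return {"sequence": sequence, "segment_ids": segment_ids, "label_mask": label_mask}


pair = build_pair_input(["EU", "rejects", "German"], ["NNP", "VBZ", "JJ"])
print(pair["sequence"])
```

The segment ids play the role of BERT's `token_type_ids`, so the model can tell the word segment from the POS segment.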

### POS Embedding Layer
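BERT builds its input representation by summing several embeddings, so one way to read "adding POS embedding as one of the embedding elements" is to add a further lookup table indexed by POS tag id. The sketch below is a standalone PyTorch module under that assumption. It does not wrap the actual BERT embedding layer and leaves out segment embeddings for brevity. The class name `TokenPosEmbedding` and all sizes are hypothetical.

```python
import torch
from torch import nn


class TokenPosEmbedding(nn.Module):
    """Sum word, position and POS-tag embeddings, then normalise."""

    def __init__(self, vocab_size: int, num_pos_tags: int, hidden_size: int,
                 max_len: int = 512, dropout: float = 0.1):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, hidden_size)
        self.position_emb = nn.Embedding(max_len, hidden_size)
        # The extra embedding element: one vector per POS tag.
        self.pos_tag_emb = nn.Embedding(num_pos_tags, hidden_size)
        self.norm = nn.LayerNorm(hidden_size)
        self.dropout = nn.Dropout(dropout)

    def forward(self, input_ids: torch.Tensor, pos_tag_ids: torch.Tensor) -> torch.Tensor:
        seq_len = input_ids.size(1)
        # Shape (1, seq_len); broadcasts over the batch dimension.
        positions = torch.arange(seq_len, device=input_ids.device).unsqueeze(0)
        summed = (
            self.word_emb(input_ids)
            + self.position_emb(positions)
            + self.pos_tag_emb(pos_tag_ids)
        )
        return self.dropout(self.norm(summed))


layer = TokenPosEmbedding(vocab_size=100, num_pos_tags=20, hidden_size=16)
input_ids = torch.randint(0, 100, (2, 5))    # batch of 2 sentences, 5 tokens each
pos_tag_ids = torch.randint(0, 20, (2, 5))   # one POS tag id per token
embeddings = layer(input_ids, pos_tag_ids)
print(embeddings.shape)
```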

### Token POS Attention
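A minimal sketch of attention between token and POS representations, assuming the token representations act as queries and the POS representations act as keys and values, followed by a residual connection and a per-token classifier. In practice the token representations would come from BERT, but random tensors stand in for them here. The class name `TokenPosAttention` and all sizes are hypothetical, and the fusion used in the repository may differ.

```python
import torch
from torch import nn


class TokenPosAttention(nn.Module):
    """Let each token representation attend over the POS representations."""

    def __init__(self, hidden_size: int, num_heads: int, num_labels: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden_size)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, token_repr: torch.Tensor, pos_repr: torch.Tensor) -> torch.Tensor:
        # Tokens are the queries; POS representations are keys and values.
        attended, _ = self.attn(query=token_repr, key=pos_repr, value=pos_repr)
        fused = self.norm(token_repr + attended)  # residual connection
        return self.classifier(fused)             # per-token NER logits


model = TokenPosAttention(hidden_size=16, num_heads=4, num_labels=9)
token_repr = torch.randn(2, 5, 16)  # stand-in for BERT token outputs
pos_repr = torch.randn(2, 5, 16)    # stand-in for POS tag representations
logits = model(token_repr, pos_repr)
print(logits.shape)
```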

### Model Ensemble
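This README does not state how the individual models are combined. The sketch below assumes one common choice: average the per-token label probabilities across models and pick the most probable label. The function name `ensemble_predict` and the toy probabilities are hypothetical.

```python
from typing import List, Sequence


def ensemble_predict(
    model_probs: Sequence[Sequence[Sequence[float]]], labels: Sequence[str]
) -> List[str]:
    """Average per-token label probabilities over models and take the argmax.

    model_probs[m][t][k] is model m's probability of label k for token t.
    """
    num_models = len(model_probs)
    num_tokens = len(model_probs[0])
    predictions: List[str] = []
    for t in range(num_tokens):
        averaged = [
            sum(model_probs[m][t][k] for m in range(num_models)) / num_models
            for k in range(len(labels))
        ]
        best = max(range(len(labels)), key=averaged.__getitem__)
        predictions.append(labels[best])
    return predictions


label_set = ["O", "B-ORG"]
model_a = [[0.6, 0.4], [0.9, 0.1]]  # two tokens, two labels
model_b = [[0.2, 0.8], [0.7, 0.3]]
ensemble_tags = ensemble_predict([model_a, model_b], label_set)
print(ensemble_tags)
```

Majority voting over each model's predicted tags is another option. Averaging probabilities keeps more information when the models disagree by a small margin.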

## Result
### The experimental result on CoNLL-2003 dataset
