https://github.com/xuanwang91/ChemNER
- Host: GitHub
- URL: https://github.com/xuanwang91/ChemNER
- Owner: xuanwang91
- Created: 2021-09-09T01:45:13.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-03-10T00:44:51.000Z (about 1 year ago)
- Last Synced: 2024-08-04T02:07:39.358Z (10 months ago)
- Language: Jupyter Notebook
- Size: 3.42 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
# ChemNER: Fine-Grained Chemistry Named Entity Recognition with Ontology-Guided Distant Supervision
## Code
The code that generates the distant supervision is in `corpus.ipynb`. The next step is to train a standard sequence labeling model (Bi-LSTM, RoBERTa, ChemBERTa, ...) on the distantly supervised data.

## Data
The data is in the folder `/data`. The training data is too large to upload to this repository and can be downloaded here: [CHEM_train.json](https://virginiatech-my.sharepoint.com/personal/xuanw_vt_edu/_layouts/15/download.aspx?UniqueId=eff0d607e51041b78813c8d3b06683a3&e=jNf2Hi). The human-annotated test data is in `/data/CHEM_test_annotations.jsonl`.

## Citation
```
@inproceedings{wang2021chemner,
title={ChemNER: Fine-grained chemistry named entity recognition with ontology-guided distant supervision},
author={Wang, Xuan and Hu, Vivian and Song, Xiangchen and Garg, Shweta and Xiao, Jinfeng and Han, Jiawei},
booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
year={2021}
}
```
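
Before training a sequence labeling model on the files described under Data, it can help to inspect how the records are structured. The following is a minimal sketch, not part of the repository: it assumes only that `data/CHEM_test_annotations.jsonl` follows the usual JSON Lines convention of one JSON value per line, and it makes no assumption about field names. The helpers `read_jsonl` and `summarize_keys` are hypothetical names introduced for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as error:
                raise ValueError(
                    f"{path}: invalid JSON on line {line_number}"
                ) from error


def summarize_keys(records):
    """Count how often each top-level key appears across dict records."""
    counts = Counter()
    for record in records:
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts


if __name__ == "__main__":
    # Path taken from the Data section, relative to the repository root.
    test_path = Path("data/CHEM_test_annotations.jsonl")
    if test_path.exists():
        records = list(read_jsonl(test_path))
        print(f"{len(records)} records")
        for key, count in summarize_keys(records).most_common():
            print(f"{key}: {count}")
    else:
        print(f"{test_path} not found; run this from the repository root")
```

The key counts show which fields are present in every record and which are optional, which is useful to know before mapping the annotations onto token-level labels.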