https://github.com/osainz59/biopmds
Distantly supervised corpus for pedagogically motivated relation extraction in the biology domain.
- Host: GitHub
- URL: https://github.com/osainz59/biopmds
- Owner: osainz59
- License: apache-2.0
- Created: 2020-03-05T10:27:03.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-09-18T12:06:11.000Z (about 5 years ago)
- Last Synced: 2025-04-13T01:23:58.630Z (6 months ago)
- Topics: automatic-annotation, biology, distant-supervision, pedagogy, relation-extraction, semi-supervised-learning
- Homepage:
- Size: 4.01 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## [BioPMDS] Pedagogically Motivated Distantly Supervised Relation Extraction dataset for the Biology domain
Download the corpus directly from the GitHub [repository](https://github.com/osainz59/BioPMDS). The corpus is stored in JSON format in the `corpus/` folder.
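After cloning the repository, the JSON files can be read with the Python standard library. The snippet below is a minimal sketch, not part of the repository: the helper name `load_corpus` is hypothetical, and it assumes that each file under `corpus/` has a `.json` extension and holds a single JSON document. The file names and record schema are not described here, so the code only reports the top-level type and size of each file.

```python
import json
from pathlib import Path


def load_corpus(corpus_dir="corpus"):
    """Parse every .json file under corpus_dir, keyed by its relative path.

    Hypothetical helper: assumes one JSON document per file and makes
    no assumption about the structure of the records inside.
    """
    root = Path(corpus_dir)
    documents = {}
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            documents[path.relative_to(root).as_posix()] = json.load(handle)
    return documents


if __name__ == "__main__":
    # Run from the root of the cloned repository.
    for name, content in load_corpus("corpus").items():
        # Containers report their length; scalar documents count as one item.
        size = len(content) if isinstance(content, (list, dict)) else 1
        print(f"{name}: {type(content).__name__} with {size} top-level items")
```

Inspecting the printed types first is a cheap way to learn the layout of each file before writing any code that depends on specific field names.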
### Citation
```bibtex
@inproceedings{sainz-etal-2020-domain,
title = "Domain Adapted Distant Supervision for Pedagogically Motivated Relation Extraction",
author = "Sainz, Oscar and
Lopez de Lacalle, Oier and
Aldabe, Itziar and
Maritxalar, Montse",
booktitle = "Proceedings of The 12th Language Resources and Evaluation Conference",
month = may,
year = "2020",
address = "Marseille, France",
publisher = "European Language Resources Association",
url = "https://www.aclweb.org/anthology/2020.lrec-1.270",
pages = "2213--2222",
abstract = "In this paper we present a relation extraction system that given a text extracts pedagogically motivated relation types, as a previous step to obtaining a semantic representation of the text which will make possible to automatically generate questions for reading comprehension. The system maps pedagogically motivated relations with relations from ConceptNet and deploys Distant Supervision for relation extraction. We run a study on a subset of those relationships in order to analyse the viability of our approach. For that, we build a domain-specific relation extraction system and explore two relation extraction models: a state-of-the-art model based on transfer learning and a discrete feature based machine learning model. Experiments show that the neural model obtains better results in terms of F-score and we yield promising results on the subset of relations suitable for pedagogical purposes. We thus consider that distant supervision for relation extraction is a valid approach in our target domain, i.e. biology.",
language = "English",
ISBN = "979-10-95546-34-4",
}
```