Clinical Notes Model for medaCy (BERT)
- Host: GitHub
- URL: https://github.com/plandes/medacy_bert_model_clinical_notes
- Owner: plandes
- License: gpl-3.0
- Created: 2021-03-31T14:39:16.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2021-03-31T14:40:00.000Z (over 4 years ago)
- Last Synced: 2025-01-02T04:14:38.572Z (10 months ago)
- Language: Python
- Size: 17.6 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# medaCy
:hospital: Clinical Notes Model for medaCy (BERT) :hospital:

This repository contains a versioned, medaCy-compatible model for information extraction from clinical notes.

# Description
This is the lightweight version (no MetaMap) of medaCy's model for extracting 9 unique entities from clinical notes: `Drug, Strength, Duration, Route, Form, ADE, Dosage, Reason, Frequency`

# Results
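The strict and lenient span-matching criteria used in this section can be illustrated with a minimal sketch. It is not medaCy's evaluation code: the `Span` type, the helper names, and the toy offsets are hypothetical, and treating "lenient" as "same label with overlapping character ranges" is an assumption, since the README only describes lenient matching as fuzzy.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str  # entity type, e.g. "Drug"


def strict_match(pred: Span, gold: Span) -> bool:
    # Strict: offsets and label must be identical.
    return pred == gold


def lenient_match(pred: Span, gold: Span) -> bool:
    # Lenient (assumed definition): same label and overlapping ranges.
    return (pred.label == gold.label
            and pred.start < gold.end
            and gold.start < pred.end)


def score(preds, golds, match):
    # Precision: share of predictions matching some gold span.
    # Recall: share of gold spans matched by some prediction.
    matched_preds = sum(any(match(p, g) for g in golds) for p in preds)
    matched_golds = sum(any(match(p, g) for p in preds) for g in golds)
    precision = matched_preds / len(preds) if preds else 0.0
    recall = matched_golds / len(golds) if golds else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy annotation of "The patient took aspirin 81 mg daily."
gold = [Span(17, 24, "Drug"), Span(25, 30, "Strength")]
# The predicted Strength span is one character short ("81 m").
pred = [Span(17, 24, "Drug"), Span(25, 29, "Strength")]

strict = score(pred, gold, strict_match)
lenient = score(pred, gold, lenient_match)
print("strict:", strict)
print("lenient:", lenient)
```

In this toy case the one-character boundary error counts as a miss under strict matching but as a hit under lenient matching, which is why lenient scores are never lower than strict ones.
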
Model generalization ability is evaluated over 202 patient clinical note files not seen during training. *Strict* indicates exact matches of spans; *Lenient* indicates fuzzy matching of spans (model predictions are off by single characters).

| Entity (Count) | Precision | Recall | F1 | F1_Min | F1_Max |
|-------------------|-------------|----------|-------|----------|----------|
| ADE (1584) | 0.562 | 0.301 | 0.381 | 0.216 | 0.457 |
| Dosage (6902) | 0.942 | 0.953 | 0.948 | 0.939 | 0.958 |
| Drug (26800) | 0.904 | 0.891 | 0.897 | 0.891 | 0.904 |
| Duration (970) | 0.833 | 0.821 | 0.825 | 0.779 | 0.862 |
| Form (11010) | 0.93 | 0.941 | 0.936 | 0.924 | 0.954 |
| Frequency (10293) | 0.873 | 0.966 | 0.917 | 0.908 | 0.926 |
| Reason (6400) | 0.663 | 0.528 | 0.586 | 0.563 | 0.612 |
| Route (8989) | 0.933 | 0.924 | 0.928 | 0.916 | 0.939 |
| Strength (10921) | 0.948 | 0.958 | 0.953 | 0.945 | 0.957 |
| system (83869) | 0.893 | 0.9 | 0.895 | 0.889 | 0.901 |

# Training Data
N2C2 2018 Shared Task
The data used to induce this model is protected by HIPAA privacy regulations and thus cannot be published.

Authors
=======
Andriy Mulyar and Bridget McInnes

Acknowledgments
===============
- [VCU Natural Language Processing Lab](https://nlp.cs.vcu.edu/) 
- [Nanoinformatics Vertically Integrated Projects](https://rampages.us/nanoinformatics/)