Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/emma-wilson/in-vitro-screening
Open data and code for published paper.
- Host: GitHub
- URL: https://github.com/emma-wilson/in-vitro-screening
- Owner: emma-wilson
- Created: 2022-07-14T11:59:52.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-02-07T18:06:55.000Z (almost 2 years ago)
- Last Synced: 2024-11-20T11:04:06.897Z (2 months ago)
- Topics: machine-learning, open-code, open-data, screening
- Language: R
- Homepage: https://doi.org/10.1042/CS20220594
- Size: 4.32 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# In Vitro Screening Comparison
Data and code to accompany the paper ["Screening for in vitro systematic reviews: a comparison of screening methods and training of a machine learning classifier"](https://doi.org/10.1042/CS20220594) published in *Clinical Science*.
## Code scripts
All code is written in R using R Markdown documents. R version number and package version numbers are included in each script. Details of each script are below. Run each script in order.
- **1-1_data cleaning.Rmd:** Prepare screening method comparison data for analysis; remove excluded data
- **1-2_calculate_performance.Rmd:** Calculate the performance (sensitivity and specificity) of screening methods at various thresholds
- **1-3_analyse_performance.Rmd:** Plot the performance in a ROC curve and determine the optimal threshold for regex screening methods
- **1-4_search_term_retrieval_comparison.Rmd:** Additional analysis comparing retrieval of studies using the planned vs actual search terms

## Data
Data are stored in the following folders:
- **data-raw:** raw data to be processed
- **data:** data which have undergone some processing
- **data-analysis:** final clean datasets
- **data-ml_input:** input data required to train ML
- **data-ml_output:** output data from ML

## Functions
Machine learning (ML) functions are in the `functions` folder. Please note that the information required to configure the ML API is **not** included, as we do not have permission to share it.
## Plots
Plot outputs (in PDF file format) are in the `figures` folder.
- **regex_histogram.pdf:** histograms showing the number of regex matches against (a) title and abstract (tiab) and (b) full text
- **screening_roc.pdf:** figure from screening comparison part of project
- **ml_roc.pdf:** figure from machine learning part of project
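
For readers unfamiliar with the quantities plotted in the ROC figures, the following is a minimal base R sketch of how sensitivity and specificity can be computed at several regex match thresholds. It is not the code from the repository scripts: the toy data, the `performance_at` helper and the variable names are all invented for illustration, and Youden's J is used here only as one common way to choose a threshold, not necessarily the criterion used in the paper.

```r
# Toy data (invented for illustration): number of regex matches per record
# and the human screening decision (TRUE = included).
matches  <- c(0, 1, 3, 5, 0, 2, 4, 0, 1, 6)
included <- c(FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE)

# Hypothetical helper: sensitivity and specificity when a record is
# predicted "included" if it has at least `threshold` regex matches.
performance_at <- function(threshold, score, truth) {
  predicted <- score >= threshold
  tp <- sum(predicted & truth)    # true positives
  fn <- sum(!predicted & truth)   # false negatives
  tn <- sum(!predicted & !truth)  # true negatives
  fp <- sum(predicted & !truth)   # false positives
  data.frame(
    threshold   = threshold,
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp)
  )
}

# Evaluate every candidate threshold and stack the results into one table.
thresholds  <- 1:6
performance <- do.call(
  rbind,
  lapply(thresholds, performance_at, score = matches, truth = included)
)

# Assumption: pick the threshold that maximises Youden's J
# (sensitivity + specificity - 1).
performance$youden_j <- performance$sensitivity + performance$specificity - 1
best <- performance[which.max(performance$youden_j), ]

print(performance)
print(best)
```

Plotting `1 - specificity` against `sensitivity` from the `performance` table gives the points of a ROC curve. On this toy data the selected threshold is 2 matches.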