https://github.com/arudzinska/awroc_calculation
[Master Thesis 2017] Scripts for calculating metrics to assess performance of a drug design software.
- Host: GitHub
- URL: https://github.com/arudzinska/awroc_calculation
- Owner: arudzinska
- Created: 2017-12-17T22:12:42.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2018-01-02T22:38:04.000Z (almost 7 years ago)
- Last Synced: 2023-10-20T02:48:33.911Z (about 1 year ago)
- Topics: drug-discovery, receiver-operating-characteristic, virtual-screening
- Language: Python
- Homepage:
- Size: 91.8 KB
- Stars: 2
- Watchers: 1
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
# awROC_calculation
Module to generate Receiver Operating Characteristic (**ROC**) curves and to calculate **ROCE** (ROC Enrichment) and **AUC** (Area Under the Curve) values, both in their standard form and in their average weighted modification (**awROC** curve, **awROCE**, **awAUC**), which takes the clustering of active compounds into account. These metrics are used in benchmarks of virtual screening tools to assess the performance of the software.
The module reads the virtual screening ranking produced by PharmScreen, Pharmacelera's tool for ligand-based virtual screening (see [the webpage of Pharmacelera](https://www.pharmacelera.com/)). The calculation outputs a CSV file with the enrichments (at 0.5%, 1%, 2% and 5% of the false positive fraction retrieved) and the AUC, and a PNG file with the ROC curve.
### ROC
A ROC curve shows how well the tool distinguishes between two populations: true active compounds and decoys (inactive molecules). The X and Y values of the ROC curve at a given point are calculated as follows:

![equation-awroc](https://image.ibb.co/m4zEcG/eqn_roce.png)

where: *X%* is the fraction of the decoys retrieved at the chosen position of the virtual screening ranking.
Dividing the Y value of a point by its X value gives the ROC Enrichment at that fraction of retrieved decoys. The AUC is the area under the whole ROC curve.
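
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from this repository: the function names (`roc_points`, `auc_trapezoid`, `roc_enrichment`) and the input format (a best-first list of booleans, `True` for an active and `False` for a decoy) are assumptions made for the example. It also assumes that the enrichment is read at the first ranking position where the requested decoy fraction is reached and that the AUC is integrated with the trapezoidal rule.

```python
def roc_points(ranked_is_active):
    """Return the ROC points (x, y) for a ranking given best-first.

    ranked_is_active: sequence of booleans, True = active, False = decoy
    (hypothetical input format chosen for this sketch).
    """
    n_actives = sum(1 for a in ranked_is_active if a)
    n_decoys = len(ranked_is_active) - n_actives
    if n_actives == 0 or n_decoys == 0:
        raise ValueError("need at least one active and one decoy")

    points = [(0.0, 0.0)]
    actives_found = 0
    decoys_found = 0
    for is_active in ranked_is_active:
        if is_active:
            actives_found += 1
        else:
            decoys_found += 1
        # X: fraction of decoys retrieved, Y: fraction of actives retrieved
        points.append((decoys_found / n_decoys, actives_found / n_actives))
    return points


def auc_trapezoid(points):
    """Area under a curve given as (x, y) points ordered by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def roc_enrichment(points, decoy_fraction):
    """Y / X at the first point whose X reaches decoy_fraction (> 0)."""
    if decoy_fraction <= 0:
        raise ValueError("decoy_fraction must be positive")
    for x, y in points:
        if x >= decoy_fraction:
            return y / x
    raise ValueError("decoy_fraction is larger than 1")


# Toy ranking: active, decoy, active, decoy
points = roc_points([True, False, True, False])
print(auc_trapezoid(points))        # 0.75
print(roc_enrichment(points, 0.5))  # 1.0
```

In the toy ranking one of the two actives is found before half of the decoys are retrieved, so the enrichment at 50% of decoys is 0.5 / 0.5 = 1.0, which is what a random ranking would give on average.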
### awROC
The average weighted modification includes information about the clustering of active compounds in order to evaluate the tool's ability to retrieve new scaffolds. The modified equation for awROC curve points and awROC Enrichments is as follows:

![equation-awroce](https://image.ibb.co/kkpEcG/eqn_awroc.png)

where: *wij* = 1/*Nj* is the weight of the *i*th structure from the *j*th cluster, *Nj* is the number of structures in that cluster, and *αX%ij* is 1 if the *i*th structure of the *j*th cluster has already appeared in the chosen fraction of the dataset and 0 otherwise.
Similarly to the standard ROC curve, the awROC Enrichment can be calculated by dividing the Y point value by the X point value of the curve and awAUC is simply the area under the obtained curve.
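
The weighting can be illustrated with a short Python sketch. It is not the implementation from this repository: the names (`awroc_points`, `auc_trapezoid`) and the input format (a best-first list holding a cluster label for each active and `None` for each decoy) are assumptions made for the example. The sketch also assumes that the Y value is normalised by the total weight of all actives, which equals the number of clusters because the weights within each cluster sum to 1.

```python
from collections import Counter


def awroc_points(ranked_clusters):
    """Return awROC points (x, y) for a ranking given best-first.

    ranked_clusters: sequence with a cluster label for each active and
    None for each decoy (hypothetical input format for this sketch).
    """
    cluster_sizes = Counter(c for c in ranked_clusters if c is not None)
    n_decoys = sum(1 for c in ranked_clusters if c is None)
    if not cluster_sizes or n_decoys == 0:
        raise ValueError("need at least one active and one decoy")

    # Sum of w_ij over all actives; each cluster contributes N_j * (1 / N_j) = 1
    total_weight = float(len(cluster_sizes))

    points = [(0.0, 0.0)]
    weight_found = 0.0
    decoys_found = 0
    for cluster in ranked_clusters:
        if cluster is None:
            decoys_found += 1
        else:
            # w_ij = 1 / N_j for a structure from cluster j
            weight_found += 1.0 / cluster_sizes[cluster]
        points.append((decoys_found / n_decoys, weight_found / total_weight))
    return points


def auc_trapezoid(points):
    """Area under a curve given as (x, y) points ordered by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy ranking: two actives from cluster "a", a decoy, one active from "b", a decoy
points = awroc_points(["a", "a", None, "b", None])
print(points[2])              # (0.0, 0.5)
print(auc_trapezoid(points))  # 0.75
```

In the toy ranking the two actives from cluster "a" together contribute as much as the single active from cluster "b", so retrieving both members of "a" first only raises the curve to 0.5, whereas the unweighted ROC curve would already be at 2/3.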