https://github.com/karchinlab/2020plus
Classifies genes as an oncogene, tumor suppressor gene, or as a non-driver gene by using Random Forests
- Host: GitHub
- URL: https://github.com/karchinlab/2020plus
- Owner: KarchinLab
- License: apache-2.0
- Created: 2016-04-29T19:26:49.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2024-08-03T23:39:13.000Z (about 1 year ago)
- Last Synced: 2024-08-04T00:23:52.700Z (about 1 year ago)
- Topics: bioinformatics, cancer, driver-genes, random-forest, somatic-variants
- Language: Python
- Homepage: http://2020plus.readthedocs.org
- Size: 29.7 MB
- Stars: 48
- Watchers: 13
- Forks: 17
- Open Issues: 12
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# 20/20+
## About
Next-generation DNA sequencing of the exome has detected hundreds of thousands of small somatic variants (SSVs) in cancer. However, distinguishing genes that contain driver mutations from genes that contain only passenger SSVs in a cohort of sequenced cancer samples requires sophisticated computational approaches.
20/20+ integrates many features indicative of positive selection to predict oncogenes and tumor suppressor genes from small somatic variants.
The features capture mutational clustering, conservation, mutation *in silico* pathogenicity scores, mutation consequence types, protein interaction network connectivity, and other covariates (e.g. replication timing).
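Count-based features such as mutation consequence types can be expressed per gene as fractions of that gene's total mutation count rather than as raw counts. The following is a minimal sketch of that idea only; the function name, the consequence categories, and the counts are hypothetical and are not taken from the 20/20+ code base.

```python
from typing import Dict


def ratiometric_features(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-gene mutation counts by consequence type into fractions.

    Hypothetical helper for illustration; not part of the 20/20+ API.
    """
    total = sum(counts.values())
    if total == 0:
        # A gene with no mutations has no defined fractions; return zeros.
        return {name: 0.0 for name in counts}
    return {name: value / total for name, value in counts.items()}


# Hypothetical counts for two genes with very different mutation totals.
gene_a = {"missense": 8, "nonsense": 1, "silent": 1}
gene_b = {"missense": 80, "nonsense": 10, "silent": 10}

features_a = ratiometric_features(gene_a)
features_b = ratiometric_features(gene_b)
```

In this toy case both genes yield the same fractions even though one carries ten times as many mutations, so a feature built this way does not simply track how heavily a gene is mutated.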
Contrary to methods based on mutation rate, 20/20+ uses ratiometric features of mutations by normalizing for the total number of mutations in a gene. This decouples the features from gene-level differences in background mutation rate. 20/20+ uses Monte Carlo simulations to evaluate the significance of random forest scores, estimating a p-value from an empirical null distribution.
## Documentation
Please see the [documentation](http://2020plus.readthedocs.io/) on readthedocs.
## Releases
You can download [releases](https://github.com/KarchinLab/2020plus/releases) on GitHub.
## Installation
20/20+ is designed to run on *Linux* operating systems.
We recommend that you install the dependencies for 20/20+ through [conda](https://conda.io/miniconda.html). Once conda is installed, setting up the environment is done as follows:
```bash
$ conda env create -f environment_python.yml # install dependencies for python
$ source activate 2020plus # activate the 20/20+ conda environment
$ conda install r r-randomForest rpy2 # install the R related dependencies
```
Every time you wish to run 20/20+, you will then need to activate the "2020plus" conda environment.
```bash
$ source activate 2020plus
```
The 20/20+ conda environment can also be deactivated.
```bash
$ source deactivate 2020plus
```