https://github.com/bhklab/cdrug2
Code for "Revisiting Inconsistency in Large Pharmacogenomic Studies"
https://github.com/bhklab/cdrug2
analysis pipeline reproducibility
Last synced: 3 months ago
JSON representation
Code for "Revisiting Inconsistency in Large Pharmacogenomic Studies"
- Host: GitHub
- URL: https://github.com/bhklab/cdrug2
- Owner: bhklab
- Created: 2015-08-11T12:15:15.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2017-07-12T15:38:23.000Z (almost 8 years ago)
- Last Synced: 2024-12-30T03:20:57.996Z (5 months ago)
- Topics: analysis, pipeline, reproducibility
- Language: R
- Homepage:
- Size: 48.8 KB
- Stars: 2
- Watchers: 6
- Forks: 2
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Code for "Revisiting inconsistency in large pharmacogenomic studies"
## Full reproducibility of the analysis results
We will describe how to fully reproduce the figures and tables reported in the main manuscript. We automated the analysis pipeline so that minimal manual interaction is required to reproduce our results. To do this, one must simply:* Set up the software environment
* Run the R scripts
* Generate the Supplementary InformationThe code and associated files are publicly available on GitHub: https://github.com/bhklab/cdrug2
## Set up the software environment
We developed and tested our analysis pipeline using R running on Linux and Mac OSX platforms.
To mimic our software environment the following R packages should be installed.
* R version 3.3.1 (2016-06-21), x86_64-apple-darwin13.4.0
* Base packages: base, datasets, graphics, grDevices, grid, methods, parallel, stats, utils
* Other packages: Biobase 2.28.0, BiocGenerics 0.14.0, biomaRt 2.24.1, devtools 1.12.0, Formula 1.2-1, futile.logger 1.4.3, genefu 1.18.0, ggplot2 2.1.0, Hmisc 3.17-4, lattice 0.20-33, mclust 5.2, PharmacoGx 1.3.2, pROC 1.8, prodlim 1.5.7, psych 1.6.6, RColorBrewer 1.1-2, rJava 0.9-8, survcomp 1.18.0, survival 2.39-5, VennDiagram 1.6.17, xlsx 0.5.7, xlsxjars 0.6.1, xtable 1.8-2
* Loaded via a namespace (and not attached): acepack 1.3-3.3, amap 0.8-14, AnnotationDbi 1.30.1, BiocInstaller 1.18.5, bitops 1.0-6, bootstrap 2015.2, caTools 1.17.1, celestial 1.3, chron 2.3-47, cluster 2.0.4, colorspace 1.2-6, curl 1.2, data.table 1.9.6, DBI 0.5, digest 0.6.10, downloader 0.4, foreign 0.8-66, futile.options 1.0.0, gdata 2.17.0, GenomeInfoDb 1.4.3, git2r 0.15.0, gplots 3.0.1, gridExtra 2.2.1, gtable 0.2.0, gtools 3.5.0, httr 1.2.1, igraph 1.0.1, IRanges 2.2.9, KernSmooth 2.23-15, lambda.r 1.1.9, latticeExtra 0.6-28, lava 1.4.4, limma 3.24.15, lsa 0.73.1, magicaxis 2.0.0, magrittr 1.5, mapproj 1.2-4, maps 3.1.1, marray 1.46.0, MASS 7.3-45, Matrix 1.2-7, memoise 1.0.0, mnormt 1.5-4, munsell 0.4.3, nnet 7.3-12, piano 1.8.2, plotrix 3.6-3, plyr 1.8.4, R6 2.1.3, RANN 2.5, Rcpp 0.12.6, RCurl 1.95-4.8, relations 0.6-6, rmeta 2.16, rpart 4.1-10, RSQLite 1.0.0, S4Vectors 0.6.6, scales 0.4.0, sets 1.0-16, slam 0.1-37, sm 2.2-5.4, SnowballC 0.5.1, splines 3.3.1, stats4 3.3.1, SuppDists 1.1-9.2, survivalROC 1.0.3, tools 3.3.1, withr 1.0.2, XML 3.98-1.4All these packages are available on CRAN (http://cran.r-project.org) or Bioconductor (http://www.bioconductor.org).
Run the following commands in a R session to install all the required packages:
```R
source("http://bioconductor.org/biocLite.R")
biocLite(c("VennDiagram", "Hmisc", "xtable", "RColorBrewer", "pROC", "Biobase", "genefu"
"PharmacoGx", "xlsx"))
```Note that *PharmacoGx* requires that several packages are installed. However, all dependencies are available from CRAN or Bioconductor.
Once the packages are installed, clone the *cdrug2* GitHub repository (https://github.com/bhklab/cdrug2) This should create a directory on the file system containing the following files:
* cdrug2_analysis.R Script generating all the figures and tables reported in the manuscript.
* cdrug2_foo.R Additional functions implemented specifically for the analysis and results visualization.All the files required to run the automated analysis pipeline are now in place. It is worth noting that 500 MB storage for downloading CGP and CCLE PSets which is done Automatically when user runs \texttt{cdrug2\_analysis.R} script.
Run the R scripts
Open a terminal window and go to the \texttt{cdrug} directory. You can easily run the analysis pipeline either in batch mode or in a R session.To run the full analysis pipeline in an R session, simply type the following command:
```R
nbcore <- 4
### to allocate four CPU cores for instance.
source("code/cdrug2_analysis_.R")
```Key messages will be displayed to monitor the progress of the analysis.
The analysis pipeline was developed so that all intermediate analysis results are saved in the directory output. Therefore, in case of interruption, the pipeline will restart where it stopped.
## Generate the Supplementary Information
After completion of the analysis pipeline a directory output will be created to contain all the intermediate results, tables and figures reported in the main manuscript and this Supplementary Information.