awesome-TCGA

Curated list of TCGA resources
https://github.com/IARCbioinfo/awesome-TCGA

  • TCGA project homepage
  • Infographic summary of the project
  • NCI TCGA Wiki - General help about the TCGA project. One page you may visit often is the [TCGA barcode](https://wiki.nci.nih.gov/display/TCGA/TCGA+barcode) description.
  • Data documentation - Describes how the data is generated, in particular the details of the bioinformatics pipeline used.
  • Genomic Data Commons (GDC) data portal - The old TCGA data portal (https://tcga-data.nci.nih.gov/docs/publications/tcga/) is no longer operational and all TCGA data now resides at the GDC. Note that the GDC also hosts datasets other than TCGA.
  • GDC homepage
  • GDC data documentation
  • GDC data release notes
  • GDC legacy archive - The legacy data is the original data that uses the old genome build (hg19) as produced by the original submitter. The legacy data is not actively being updated in any way. Users should migrate to the harmonized data.
  • List of cohorts with sample sizes - Shortcut to the GDC data portal with the list of all cancer sites with the number of cases and the number of available cases per data category.
  • GDC Data Transfer Tool - Official command-line tool; see [here](https://github.com/IARCbioinfo/GDC-tricks) for a nice tutorial.
  • GDC API - Official HTTP API. Note the [BAM Slicing](https://docs.gdc.cancer.gov/API/Users_Guide/BAM_Slicing/) feature, which can be quite useful.
  • Firehose - Refers to the computational infrastructure.
  • Python and UNIX wrappers
  • FirebrowseR - An R package to download the results of the analyses performed by Firehose directly into R.
  • Firebrowse - A web UI to visualise the results of the analyses performed by Firehose.
  • TCGABiolinks - An R/Bioconductor package to search, download and prepare relevant data for analysis in R. Very powerful and well documented.
  • GDC Spreadsheet Download Tool - Tool to download clinical and/or biospecimen metadata for a given set of files in a tab-delimited format.
  • GenomicDataCommons - An R/Bioconductor package for querying, accessing, and mining genomic datasets available from the GDC.
  • gdctools - Broad Institute Python and UNIX CLI utilities to simplify search and retrieval of open-access data from the GDC.
  • Cancer Genomics Cloud - Developed by [Seven Bridges Genomics](https://www.sevenbridges.com). They have a [blog](https://www.sevenbridges.com/blog/) with useful case studies.
  • ISB Cancer Genomics Cloud - Developed by the Institute for Systems Biology in Seattle.
  • FireCloud - Developed by the Broad Institute.
  • Firehose - See [above](https://github.com/IARCbioinfo/awesome-TCGA#broad-institute-gdac) for the associated tools to download the data. They run many software tools on all TCGA cohorts and make the results available.
  • Tumor Fusion Gene Data Portal - 9,966 tumor samples from 33 TCGA cancer types and 689 normal samples from 19 TCGA normal tissue types were analyzed with the PRADA pipeline, using the realigned BAM files of the RNA-seq data.
  • DriverDBv2 - WES and RNA-seq reanalysis to identify driver genes. Provides a nice graphical summary of mutation clustering in genes (e.g. for *[TP53](http://driverdb.tms.cmu.edu.tw/driverdbv2/gene_data_p.php?genename=TP53&geneproteinid=&submit=submit)*).
  • ASCAT Ploidy and Purity Estimates - [COSMIC](http://cancer.sanger.ac.uk/cosmic) hosts a tab-separated table listing the ploidy and aberrant cell fraction (purity estimate) for TCGA samples re-analysed using ASCAT.
  • BioXpress - RNA-seq-derived gene expression database, including TCGA among others.
  • Official publication list
  • Publication from Seven Bridges Genomics
  • TCGABiolinks - An R/Bioconductor package to search, download and prepare relevant data for analysis in R. Very powerful and well documented.
  • FirebrowseR - An R package to download the results of the analyses performed by Firehose directly into R.
  • GenomicDataCommons - Paper describing the R GenomicDataCommons package.
  • DriverDBv2
  • ChimerDB - A new paper is in press for v3.0 according to rcsb.ewha.ac.kr/fusiongene.
  • BioXpress
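
Since the TCGA barcode description linked above is something you will likely come back to often, here is a minimal Python sketch that splits a barcode into its commonly documented fields (project, tissue source site, participant, sample type and vial, portion and analyte, plate, center). The function names `parse_barcode` and `sample_class` are hypothetical helpers written for this illustration, not part of any official tool, and the sample-type ranges used (01-09 tumor, 10-19 normal, 20-29 control) are an assumption taken from the usual reading of the barcode page; check the wiki before relying on them.

```python
import re
from typing import Dict, Optional

# Each dash-separated level is optional after the participant, so shorter
# patient-level or sample-level barcodes are accepted as well.
_BARCODE_RE = re.compile(
    r"^(?P<project>TCGA)-(?P<tss>[A-Z0-9]{2})-(?P<participant>[A-Z0-9]{4})"
    r"(?:-(?P<sample_type>\d{2})(?P<vial>[A-Z])?"
    r"(?:-(?P<portion>\d{2})(?P<analyte>[A-Z])?"
    r"(?:-(?P<plate>[A-Z0-9]{4})"
    r"(?:-(?P<center>\d{2}))?)?)?)?$"
)


def parse_barcode(barcode: str) -> Dict[str, Optional[str]]:
    """Split a TCGA barcode into named fields; missing levels are None."""
    match = _BARCODE_RE.match(barcode.strip().upper())
    if match is None:
        raise ValueError(f"not a recognisable TCGA barcode: {barcode!r}")
    return match.groupdict()


def sample_class(fields: Dict[str, Optional[str]]) -> Optional[str]:
    """Map the two-digit sample type code to a broad class (assumed ranges)."""
    code = fields.get("sample_type")
    if code is None:
        return None  # patient-level barcode, no sample information
    number = int(code)
    if 1 <= number <= 9:
        return "tumor"
    if 10 <= number <= 19:
        return "normal"
    if 20 <= number <= 29:
        return "control"
    return "unknown"


if __name__ == "__main__":
    fields = parse_barcode("TCGA-02-0001-01C-01D-0182-01")
    print(fields)
    print(sample_class(fields))
```

Truncated barcodes such as `TCGA-02-0001` (patient level) or `TCGA-02-0001-11A` (sample level) parse too, which is handy when matching tumor and normal samples from the same participant.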