
https://github.com/bhklab/circrna-detection



## Detection of non-coding RNAs and assessment of their predictive value for mono therapies

Julia Nguyen1,\$, Anthony Mammoliti1,2,\$, Sisira Kadambat Nair1, Emily So1,2, Farnoosh Abbas Aghababazadeh1, Christopher Eeles1, Ian Smith1,2, Petr Smirnov1,2, Housheng Hansen He1,2, Ming-Sound Tsao1,2, Benjamin Haibe-Kains1,2,3,4,5,6,#

1Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada\
2Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada\
3Department of Computer Science, University of Toronto, Toronto, Ontario, Canada\
4Ontario Institute of Cancer Research, Toronto, Ontario, Canada\
5Vector Institute for Artificial Intelligence, Toronto, ON M5G 1L7, Canada\
6Biostatistics Division, Dalla Lana School of Public Health, Toronto, ON M5T 3M7, Canada

$ These authors contributed equally to the present work\
# Corresponding author: Benjamin Haibe-Kains, Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 2C4 Canada


### **Transcript Quantification**

*Description:* circRNA, mRNA, and corresponding gene expression were quantified across 48 cell line biological replicates from three pharmacogenomic datasets (gCSI, CCLE, GDSC2) with poly(A)-selected RNA-seq data. Consistency of transcript quantification was compared using UMAP. Transcript stability was assessed across the dataset pairs.


**1.** Reproducible Snakemake pipelines for running CIRI2 and CIRCexplorer2 on cell line biological replicates and lung cancer patient data can be found under: "/data/snakemake_pipelines"

**2.** Transcript expression quantification (run script: "/code/transcript_quantification.R")

**3.** Dimensionality reduction analysis of transcript quantification (run script: "/code/umap.R")

**4.** Computation of stability indices of transcripts across dataset pairs (run script: "/code/si_distribution.R")
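
This README does not define the stability index. As a minimal sketch, assuming it is the Spearman rank correlation of one transcript's expression across the same cell lines in two datasets (an assumption, not confirmed here), step 4 could look roughly like the Python below. The repository script is written in R; the function names and toy values are hypothetical and only illustrate the idea.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def stability_index(expr_a, expr_b):
    """Spearman correlation of matched expression vectors (assumed definition)."""
    if len(expr_a) != len(expr_b):
        raise ValueError("expression vectors must cover the same cell lines")
    return pearson(average_ranks(expr_a), average_ranks(expr_b))


# Hypothetical expression of one transcript in the same four cell lines.
gcsi = [1.0, 2.5, 3.1, 8.0]
ccle = [0.9, 2.0, 3.5, 7.2]
si_same_order = stability_index(gcsi, ccle)
si_reversed = stability_index(gcsi, ccle[::-1])
```

Under this assumed definition, a transcript whose cell lines rank identically in both datasets scores 1, and one whose ranking is fully reversed scores -1.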


### **circRNA**

*Description:* CIRI2 and CIRCexplorer2 were run on 48 cell line biological replicates (gCSI, CCLE, GDSC2) with poly(A)-selected RNA-seq data and on lung cancer patient data with both poly(A)-selected and ribo-depleted RNA-seq data. circRNA detection levels were compared between cell line and patient data. A survival analysis (overall survival) was performed on the ribo-depleted lung patient data because it yielded higher circRNA expression than the cell lines.


**1.** Cell line biological replicate analysis - data under "/data/processed_cellline" (run script: "/code/circRNA_celllines.R")

**2.** Lung adenocarcinoma patient analysis - data under "/data/processed_lung" (run script: "/code/circRNA_lung.R")

**3.** Quantification of unique circRNA transcripts and distribution across RNA-seq selection protocols (run script: "/code/unique_lung_transcripts.R")

**4.** Survival analysis on lung adenocarcinoma patient data (run script: "/code/survival_lung.R")
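
The README does not say which estimator the survival script uses. As an illustration only, assuming a Kaplan-Meier estimate of overall survival, the core calculation can be sketched in Python as follows. The repository script is in R; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    at_risk = len(times)
    survival = 1.0
    curve = []
    # Walk through the distinct follow-up times in increasing order.
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up (months) and event indicators for five patients.
times = [5, 8, 8, 12, 20]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Censored patients do not lower the curve themselves, but they shrink the at-risk set used at later death times.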


### **mRNA**

*Description:* Features that may predict mRNA stability were processed, and their influence was computed through supervised learning. The association between mRNA stability and drug response was investigated.
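
One common way to measure feature influence in a supervised setting is permutation: shuffle one feature at a time and record how much the model's error grows. The Python sketch below shows that idea under stated assumptions. The model, the data, and the function names are hypothetical, the error metric (mean squared error) is an assumption, and the repository's R script may differ in all of these. The number of permutations defaults to 100 here to keep the example fast.

```python
import random


def mean_squared_error(model, rows, targets):
    """Average squared difference between predictions and targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_influence(model, rows, targets, n_permutations=100, seed=0):
    """Average increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    influence = []
    for j in range(len(rows[0])):
        total_increase = 0.0
        for _ in range(n_permutations):
            # Shuffle only column j; all other columns stay in place.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            total_increase += mean_squared_error(model, permuted, targets) - baseline
        influence.append(total_increase / n_permutations)
    return influence


def toy_model(row):
    """Hypothetical fitted model that only uses the first feature."""
    return 2.0 * row[0]


# Hypothetical feature matrix (two features) and matching targets.
rows = [[float(i), float((i * 7) % 5)] for i in range(10)]
targets = [2.0 * r[0] for r in rows]
influence = permutation_influence(toy_model, rows, targets)
```

In this toy setup the second feature is ignored by the model, so shuffling it leaves the error unchanged and its influence is zero, while shuffling the first feature raises the error.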


**1.** Generation of a data.frame with transcript stability indices and genomic feature data - data under "/data/transcript_stability" (run script: "/code/transcript_stability.R")

**2.** Identification of quantile usage for stable/unstable transcripts using the Kuncheva Index (run script: "/code/overlap.R")

**3.** Feature influence after permuting each feature (n=20,000) (run script: "/code/feature_influence.R")

**4.** Assessing association between stability and predictive value for *known* gene biomarkers (run script: "/code/biomarker_forest.R")

**5.** Assessing association between stability and predictive value across *all* genes (run script: "/code/hist_pred_value.R")
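
The Kuncheva Index named in step 2 measures the overlap between two equal-size selections, corrected for the overlap expected by chance. For two subsets of size k drawn from n items and sharing r items, it is (r·n − k²) / (k·(n − k)). A minimal Python sketch of the formula follows. How the repository's R script applies it to quantiles is not stated in this README, and the transcript IDs below are hypothetical.

```python
def kuncheva_index(subset_a, subset_b, n_total):
    """Kuncheva consistency index for two equal-size subsets of n_total items."""
    a, b = set(subset_a), set(subset_b)
    k = len(a)
    if len(b) != k:
        raise ValueError("both subsets must have the same size")
    if not 0 < k < n_total:
        raise ValueError("subset size must satisfy 0 < k < n_total")
    r = len(a & b)  # number of shared items
    return (r * n_total - k * k) / (k * (n_total - k))


# Hypothetical: the 4 most stable transcripts (out of 20) in two dataset pairs.
stable_in_pair_1 = {"t1", "t2", "t3", "t4"}
stable_in_pair_2 = {"t1", "t2", "t3", "t9"}
ki = kuncheva_index(stable_in_pair_1, stable_in_pair_2, n_total=20)
ki_identical = kuncheva_index(stable_in_pair_1, stable_in_pair_1, n_total=20)
```

The index reaches 1 for identical selections and sits near 0 when the overlap is no larger than chance would produce.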