https://github.com/genentech/gneseqcoo
- Host: GitHub
- URL: https://github.com/genentech/gneseqcoo
- Owner: Genentech
- License: other
- Created: 2024-08-14T17:27:51.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-12-02T23:39:34.000Z (over 1 year ago)
- Last Synced: 2025-06-10T23:07:50.509Z (12 months ago)
- Language: R
- Homepage: https://genentech.github.io/gneSeqCOO/
- Size: 3.98 MB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# gneSeqCOO
This package applies a Cell of Origin (COO) classifier to RNA-seq data. The process consists of three steps:
1) normalizing the raw counts
2) calculating the Linear Predictor Score (LPS)
3) splitting samples into GCB/ABC based on fixed cut-points
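To make the three steps above concrete, here is a minimal base-R sketch on a toy matrix. Everything in it is an assumption for illustration: the gene names, the weights, the cut-points, the use of log2 counts-per-million as the normalization, and the direction of the score (high values labeled ABC). None of these are taken from gneSeqCOO itself.

```r
## Illustrative only: toy counts, weights, and cut-points are made up
## and are NOT the values or methods used by gneSeqCOO.
counts <- matrix(c(100,  10,  50,
                    10, 100,  50,
                   890, 890, 900),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB", "other"),
                                 c("s1", "s2", "s3")))

## 1) Normalize: log2 counts-per-million as a simple stand-in.
cpm  <- t(t(counts) / colSums(counts)) * 1e6
norm <- log2(cpm + 1)

## 2) LPS: weighted sum of normalized expression over signature genes.
weights <- c(geneA = 1, geneB = -1)   # hypothetical weights
lps <- colSums(norm[names(weights), , drop = FALSE] * weights)

## 3) Split on fixed cut-points (hypothetical values and direction).
cuts <- c(-1, 1)
coo <- cut(lps, breaks = c(-Inf, cuts, Inf),
           labels = c("GCB", "Unclassified", "ABC"))

data.frame(ID = names(lps), LPS = lps, COO = coo)
```

In this toy example the normalization depends on each sample's total count, which is one reason a classifier of this kind may need the unfiltered gene set.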
# Install
```{r,eval=FALSE}
## Requires the remotes package: install.packages("remotes")
remotes::install_github("Genentech/gneSeqCOO")
```
# Basic usage
The input for the gneSeqCOO method is a raw count matrix, formatted as a DESeqDataSet. For a standard count matrix, this can be generated via the code below:
```{r,eval=FALSE}
library(DESeq2)

## counts is a matrix of raw counts, with features in rows and samples in columns.
cds = DESeqDataSetFromMatrix(counts,
                             colData=data.frame(ID=colnames(counts)),
                             rowData=data.frame(ID=rownames(counts)),
                             design=~1)
```
Note that the classifier only works if the row names of the CDS are in the correct format. It currently accepts RefSeq “Gene IDs” (formatted as “GeneID:9294”) or Ensembl IDs (ENSG....). Do not filter genes out of the dataset before running the algorithm, because the complete genome is used to normalize individual samples.
The COO classifier can be applied to the cds object with a single command, `coo_rnaseq()`. An example is below:
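Before running the classifier, it can help to check the row names against the two accepted formats. The helper below is a hypothetical sketch and is not part of gneSeqCOO; the regular expressions are assumptions based only on the two examples given above, and whether versioned Ensembl IDs (such as a trailing `.12`) are accepted by the package is not stated here.

```r
## Hypothetical helper (not part of gneSeqCOO): return the IDs that match
## neither "GeneID:<digits>" nor an Ensembl gene ID starting with "ENSG".
check_gene_ids <- function(ids) {
  is_geneid  <- grepl("^GeneID:[0-9]+$", ids)
  is_ensembl <- grepl("^ENSG[0-9]+", ids)
  ids[!(is_geneid | is_ensembl)]
}

## Example with made-up input: only the gene symbol is flagged.
bad_ids <- check_gene_ids(c("GeneID:9294", "ENSG00000141510", "TP53"))
bad_ids
```

On a real object this would be called as `check_gene_ids(rownames(cds))`; an empty result means every row name matches one of the two patterns.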
```{r,eval=FALSE}
library(gneSeqCOO)

## Classify each sample in the DESeqDataSet built above.
pred = coo_rnaseq(cds)
```
The returned object has three columns:
- Sample IDs - drawn from the column names of the cds object
- LPS - the Linear Predictor Score, used to split samples into GCB and ABC
- COO - the Cell of Origin classification: GCB, ABC, or Unclassified
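As a rough illustration of working with a result of this shape, the sketch below builds a mock data frame and summarizes it. The column names `ID`, `LPS`, and `COO` and all values are assumptions made for the example; check the actual names with `colnames(pred)` before adapting it.

```r
## Mock result with assumed column names and made-up values;
## this is NOT output produced by coo_rnaseq().
pred_example <- data.frame(
  ID  = c("s1", "s2", "s3"),
  LPS = c(-2.1, 0.3, 1.8),
  COO = c("GCB", "Unclassified", "ABC")
)

## Count samples per class.
class_counts <- table(pred_example$COO)
class_counts

## Keep only the samples that received a definite call.
classified <- subset(pred_example, COO != "Unclassified")
classified
```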
For more details, please refer to the documentation in the package.