https://github.com/ltla/cruktools
Assorted scripts for running server jobs at CRUK CI
- Host: GitHub
- URL: https://github.com/ltla/cruktools
- Owner: LTLA
- Created: 2017-10-11T17:43:12.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2019-01-19T13:35:58.000Z (over 7 years ago)
- Last Synced: 2025-02-10T12:29:11.177Z (over 1 year ago)
- Language: Shell
- Size: 20.5 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# CRUK tools
This repository provides some tools for processing genomics data on the CRUK Cambridge Institute SLURM server.
**Alignment**
- `solo_align.sh` aligns a single library (single-end or paired-end).
- `multi_align.sh` is a convenience wrapper to submit alignment jobs for many libraries in a data set.
- `guess_encoding.py` guesses the Phred quality encoding of the input reads so that it can be passed to the aligner.
Alignment is performed using the [_subread_](http://subread.sourceforge.net/) aligner.
The alignment scripts also require [_samtools_](http://www.htslib.org/) and the [_MarkDuplicates_](https://broadinstitute.github.io/picard/) tool from Picard.
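
To illustrate what guessing the Phred encoding involves, here is a minimal sketch of one common heuristic. It is not the code in `guess_encoding.py`; the function names, the 10,000-read cap and the cutoff at ASCII 59 are assumptions made for this example. It also assumes unwrapped four-line FASTQ records.

```python
from itertools import islice
from typing import Iterable, Iterator


def quality_strings(fastq_lines: Iterable[str], max_reads: int = 10000) -> Iterator[str]:
    """Yield the quality line (4th line of each record) for up to max_reads reads."""
    quals = (line.rstrip("\n") for i, line in enumerate(fastq_lines) if i % 4 == 3)
    return islice(quals, max_reads)


def guess_phred_offset(fastq_lines: Iterable[str], max_reads: int = 10000) -> int:
    """Return 33 or 64 as the likely quality offset (hypothetical helper)."""
    lowest = None
    for qual in quality_strings(fastq_lines, max_reads):
        for ch in qual:
            code = ord(ch)
            if lowest is None or code < lowest:
                lowest = code
    if lowest is None:
        raise ValueError("no quality characters found")
    # Offset-64 encodings never use characters below ';' (ASCII 59),
    # so anything lower points to Phred+33. Otherwise we assume +64,
    # which can be wrong for Phred+33 data with uniformly high qualities.
    return 33 if lowest < 59 else 64


if __name__ == "__main__":
    example = ["@read1\n", "ACGT\n", "+\n", "II#5\n"]
    print(guess_phred_offset(example))  # prints 33, because '#' is ASCII 35
```

The result can then be translated into the aligner's own setting. For example, `subread-align` takes the encoding through its `-P` option (`3` for Phred+33, `6` for Phred+64).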
**Read counting**
`counter.R` provides a template for read counting that produces a gene-by-sample count matrix.
You must specify the BAM files to count and a set of GTF annotation files.
Counting is done with the `featureCounts` function in the [_Rsubread_](https://bioconductor.org/packages/Rsubread) package.
**Data mangling**
- `cram2fastq.sh` will convert a CRAM file into FASTQ for entry into the alignment pipelines above.
- `sanger_dump.sh` will convert an entire folder of CRAM files into FASTQs.
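
As a rough illustration of the folder-wide conversion, the sketch below builds one `samtools` command line per CRAM file without running anything. It is not the logic of `sanger_dump.sh`; the function names, the paired-end assumption and the `_1.fastq`/`_2.fastq` output naming are all hypothetical choices made for this example.

```python
import shlex
from pathlib import Path
from typing import List


def cram_to_fastq_command(cram_path: str, out_dir: str) -> str:
    """Build a shell pipeline converting one paired-end CRAM file to FASTQ."""
    cram = Path(cram_path)
    prefix = Path(out_dir) / cram.stem  # assumed naming: <out_dir>/<library>_1.fastq
    # Group mates together first so that paired reads come out in matching order.
    collate = ["samtools", "collate", "-u", "-O", str(cram)]
    fastq = [
        "samtools", "fastq",
        "-1", f"{prefix}_1.fastq",
        "-2", f"{prefix}_2.fastq",
        "-0", "/dev/null",  # discard reads flagged as neither or both mates
        "-s", "/dev/null",  # discard singletons
        "-n",               # do not append /1 and /2 to read names
        "-",                # read the collated stream from stdin
    ]
    return shlex.join(collate) + " | " + shlex.join(fastq)


def folder_commands(cram_dir: str, out_dir: str) -> List[str]:
    """Return one conversion command per *.cram file in cram_dir, sorted by name."""
    return [cram_to_fastq_command(str(p), out_dir)
            for p in sorted(Path(cram_dir).glob("*.cram"))]


if __name__ == "__main__":
    for command in folder_commands(".", "fastq"):
        print(command)
```

Printing the commands rather than executing them makes it easy to check them by eye before passing each one to the job scheduler.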
**Other**
`cell_ranger.sh` will call the [_CellRanger_](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger) pipeline to create a count matrix for single-cell transcriptomics data from the 10X Genomics platform.