https://github.com/jdblischak/midway-subread-pipeline
Process RNA-seq data with Subread on RCC Midway
bioconductor rna-seq
- Host: GitHub
- URL: https://github.com/jdblischak/midway-subread-pipeline
- Owner: jdblischak
- License: cc0-1.0
- Created: 2017-07-17T18:30:17.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2017-09-14T02:42:59.000Z (over 7 years ago)
- Last Synced: 2024-10-15T23:55:26.547Z (2 months ago)
- Topics: bioconductor, rna-seq
- Language: R
- Size: 10.7 KB
- Stars: 2
- Watchers: 3
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Midway Subread pipeline
These scripts process RNA-seq data using the [Subread][] pipeline. The
scripts are designed to be used with the [Slurm scheduler][slurm] on
the [RCC Midway2][midway] cluster at the University of Chicago, but
can be adapted to other computing infrastructure.

## Setup

1. Install the R/Bioconductor packages [biomaRt][] and [Rsubread][]:
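```r
# biocLite() was retired with Bioconductor 3.8; BiocManager replaces it.
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("biomaRt", "Rsubread"))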
```

2. Clone this repository with `git clone` or download and unzip this
[zip file][master].

## Pipeline

The scripts should be submitted in the following order. Always run
them from the head node in the root of the project directory.

1. Download and index the genome with `download-genome.R`:

```bash
sbatch --mem=12G --partition=broadwl download-genome.R
```

2. Download and format the exons with `download-exons.R`:

```bash
sbatch --mem=2G --partition=broadwl download-exons.R
```

3. Submit mapping jobs with `submit-subread.sh`, which calls
`run-subread.R` for each fastq file:

```bash
bash submit-subread.sh
```

4. Submit counting jobs with `submit-subread.sh`, which calls
`run-featurecounts.R` for each BAM file:

```bash
bash submit-subread.sh
```

## Directory structure
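
Because `fastq/` is the only directory that has to exist up front, a quick check before submitting anything can catch a misplaced input file early. The sketch below is not part of this repository: `check_layout` and `print_jobs` are hypothetical helper names, and passing the fastq path as an argument to `run-subread.R` is an assumption about how that script receives its input. `print_jobs` only prints the `sbatch` commands, so nothing is submitted.

```bash
# Hypothetical helpers for illustration; not part of the repository.

# check_layout ROOT
# Succeeds only if ROOT/fastq exists and holds at least one .fastq.gz file.
check_layout() {
  root="${1:-.}"
  if [ ! -d "$root/fastq" ]; then
    echo "missing directory: $root/fastq" >&2
    return 1
  fi
  for f in "$root"/fastq/*.fastq.gz; do
    # An unmatched glob stays a literal pattern, so test that the path exists.
    if [ -e "$f" ]; then
      return 0
    fi
  done
  echo "no .fastq.gz files in $root/fastq" >&2
  return 1
}

# print_jobs ROOT
# Prints (does not run) one sbatch command per fastq file.
# Assumption: run-subread.R takes the fastq path as its first argument.
print_jobs() {
  root="${1:-.}"
  for f in "$root"/fastq/*.fastq.gz; do
    [ -e "$f" ] || continue
    echo "sbatch --partition=broadwl run-subread.R $f"
  done
}
```

From the root of the project directory, `check_layout . && print_jobs .` would list one command per input file, which is a cheap way to confirm the inputs are where the scripts will look for them.
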
These scripts expect the following directory structure. Only `fastq/`
has to be created manually. The scripts will create `bam/`, `counts/`,
and `genome/`.

```
.
├── bam
├── counts
├── fastq
│   ├── YG-172S-S8-8_S125_L006_R1_001.fastq.gz
│   └── YG-172S-S8-9_S126_L006_R1_001.fastq.gz
└── genome
```

* `fastq/` - Contains raw data in `fastq.gz` files.
* `bam/` - Contains the mapped reads in BAM files.
* `counts/` - Contains the read counts in text files.
* `genome/` - Contains the indexed genome and the exon coordinates in
SAF format.

## License

The code is available under the [CC0 license][cc0], i.e. you are free
to do whatever you like with this code, but it comes with no
guarantees. See the file `LICENSE` for full details.

## More information

* [RCC User Guide][guide]
* [Gilad Lab Midway Guide][giladlab]

[biomaRt]: https://bioconductor.org/packages/release/bioc/html/biomaRt.html
[cc0]: https://creativecommons.org/share-your-work/public-domain/cc0/
[guide]: https://rcc.uchicago.edu/docs/
[giladlab]: https://github.com/jdblischak/giladlab-midway-guide
[master]: https://github.com/jdblischak/midway-subread-pipeline/archive/master.zip
[midway]: https://rcc.uchicago.edu/resources/high-performance-computing
[Rsubread]: https://bioconductor.org/packages/release/bioc/html/Rsubread.html
[slurm]: https://slurm.schedmd.com/
[Subread]: http://subread.sourceforge.net/