https://github.com/sk-sahu/snakeflows
Collective Snakemake workflows for easy and reproducible NGS data analysis.
rna-seq-analysis snakemake-workflows
- Host: GitHub
- URL: https://github.com/sk-sahu/snakeflows
- Owner: sk-sahu
- License: mit
- Created: 2019-02-18T13:53:14.000Z (over 7 years ago)
- Default Branch: main
- Last Pushed: 2023-08-05T13:54:53.000Z (almost 3 years ago)
- Last Synced: 2025-01-14T22:03:53.808Z (over 1 year ago)
- Topics: rna-seq-analysis, snakemake-workflows
- Language: Python
- Homepage: https://sksahu.net/Snakeflows
- Size: 305 KB
- Stars: 5
- Watchers: 2
- Forks: 1
- Open Issues: 12
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
# Snakeflows
[Snakemake](https://snakemake.readthedocs.io/en/stable/)
[Documentation](https://snakemake-rnaseq-workflows.readthedocs.io/en/latest/?badge=latest)
**Collective Snakemake workflows for easy and reproducible NGS data analysis.**
Workflows may contain modified parameters. Please review the `snakemake` files before use.
### RNA-Seq Analysis
Current workflows:
* [STAR-Cufflinks](./STAR-Cufflinks)
* [Salmon](./Salmon)
The workflows are a work in progress. I will add more downstream tools as I go along.
Have a workflow in mind to add? [Request it here](https://github.com/sk-sahu/Snakemake-RNASeq-Workflows/issues/new?assignees=&labels=&template=feature_request.md&title=).
### Quick start
#### Pre-requirements
You need [Python 3](https://www.python.org/downloads/release/python-356/) installed on your system, with `conda` available.
Install the required tools with the following commands:
```bash
conda env create -f environment.yml
conda activate snakeflows
```
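Download the sample data to test the workflow:
```bash
# Quote the URL so the shell does not interpret "?", and save under a clean file name
wget -O sample_data.tar.gz "https://www.dropbox.com/s/bnvjbhq4970pvg8/sample_data.tar.gz?dl=0"
```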
#### 1. Prepare **samples** directory properly
Before you run `write_sample_to_json.py`, the **samples** directory must be arranged and named consistently so that the script can read it and the `snakemake` files can refer to the samples later.
The layout should look something like this:
```
samples
├── SET1_dummy
│   ├── SET1_dummy_R1.fastq.gz
│   └── SET1_dummy_R2.fastq.gz
└── SET3_dummy
    ├── SET3_dummy_R1.fastq.gz
    └── SET3_dummy_R2.fastq.gz
```
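As a quick sanity check before generating `samples.json`, the layout above can be verified with a small script. This is a minimal sketch and not part of the repository: the function name `check_samples_dir` is hypothetical, and it assumes every sample is paired-end with files named `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz`, as in the tree shown.

```python
from pathlib import Path


def check_samples_dir(samples_dir):
    """Return a list of problems found in a samples directory.

    Assumed layout (taken from the example tree): one subdirectory per
    sample, holding <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
    An empty list means the layout matches that assumption.
    """
    root = Path(samples_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    problems = []
    sample_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    if not sample_dirs:
        problems.append(f"no sample subdirectories found in {root}")

    for sample_dir in sample_dirs:
        for read in ("R1", "R2"):
            # The file name must repeat the directory name as its prefix.
            expected = sample_dir / f"{sample_dir.name}_{read}.fastq.gz"
            if not expected.is_file():
                problems.append(f"missing {expected}")
    return problems
```

Calling `check_samples_dir("samples")` returns an empty list when every sample directory holds both read files, and otherwise one message per missing file.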
#### 2. Generate `samples.json` file
This file is used to detect sample names automatically and refer to them in the `snakemake` files.
```bash
python3 write_sample_to_json.py --fastq_dir full_path_to_samples_directory
```
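To give a rough idea of what this step produces, here is a minimal sketch of how a sample-to-FASTQ mapping could be built and written as JSON. It is an illustration only: the functions `collect_samples` and `write_samples_json` and the `{sample: {"R1": path, "R2": path}}` structure are assumptions, and the real `write_sample_to_json.py` may use a different layout. Check the actual `samples.json` before relying on its keys.

```python
import json
from pathlib import Path


def collect_samples(fastq_dir):
    """Map each sample name to the absolute paths of its R1/R2 files.

    Assumes one subdirectory per sample containing
    <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
    """
    samples = {}
    for fastq in sorted(Path(fastq_dir).glob("*/*_R[12].fastq.gz")):
        sample = fastq.parent.name  # directory name doubles as sample name
        read = "R1" if fastq.name.endswith("_R1.fastq.gz") else "R2"
        samples.setdefault(sample, {})[read] = str(fastq.resolve())
    return samples


def write_samples_json(fastq_dir, out_path="samples.json"):
    """Write the mapping to out_path (hypothetical format) and return it."""
    samples = collect_samples(fastq_dir)
    with open(out_path, "w") as handle:
        json.dump(samples, handle, indent=4, sort_keys=True)
    return samples
```

Storing absolute paths in this sketch mirrors the README's request for a full path to the samples directory, so the resulting file would not depend on the directory `snakemake` is started from.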
#### 3. Run Workflows
First, edit the `config.yml` file inside the workflow directory and fill in the required full paths.
Then call `snakemake` from the workflow directory, adding any further parameters you need:
```
snakemake --cores 12
```
#### Additional
To check the workflow and debug it with a dry run that prints the shell commands without executing them:
```bash
snakemake -np
```
To visualise the workflow (requires Graphviz `dot` and ImageMagick `display`):
```bash
snakemake --forceall --dag | dot -Tpng | display
```
Upcoming additions:
* Docker integration
* Streamlined HTML reports
* Log files with timestamps
* More modular workflows using `.smk` files