https://github.com/solida-core/diva.wgs

Last synced: 7 months ago
JSON representation

Host: GitHub
URL: https://github.com/solida-core/diva.wgs
Owner: solida-core
License: gpl-3.0
Created: 2019-02-21T15:16:08.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2021-01-29T08:01:33.000Z (over 5 years ago)
Last Synced: 2024-12-27T20:46:12.672Z (over 1 year ago)
Language: Python
Size: 76.2 KB
Stars: 0
Watchers: 1
Forks: 3
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

          [![depends](https://img.shields.io/badge/depends%20from-bioconda-brightgreen.svg)](http://bioconda.github.io/)

[![snakemake](https://img.shields.io/badge/snakemake-5.3-brightgreen.svg)](https://snakemake.readthedocs.io/en/stable/)

[![Build Status](https://travis-ci.com/solida-core/diva.wgs.svg?branch=master)](https://travis-ci.com/solida-core/diva.wgs)

# DiVA WGS

**DiVA WGS** is a pipeline for Next-Generation Sequencing **Whole Genome** data anlysis.

All **[solida-core](https://github.com/solida-core)** workflows follow GATK Best Practices for Germline Variant Discovery, with the incorporation of further improvements and refinements after their testing with real data in various [CRS4 Next Generation Sequencing Core Facility](http://next.crs4.it) research sequencing projects.

Pipelines are based on [Snakemake](https://snakemake.readthedocs.io/en/stable/), a workflow management system that provides all the features needed to create reproducible and scalable data analyses.

Software dependencies are specified into the `environment.yaml` file and directly managed by Snakemake using [Conda](https://docs.conda.io/en/latest/miniconda.html), ensuring the reproducibility of the workflow on a great number of different computing environments such as workstations, clusters and cloud environments.

### Pipeline Overview

The pipeline workflow is composed by two major analysis sections:

 * [_Mapping_](docs/diva_workflow.md#mapping): single and/or paired-end reads in fastq format are aligned against a reference genome to produce a deduplicated and recalibrated BAM file. This section is executed by DiMA pipeline.

 * [_Variant Calling_](docs/diva_workflow.md#variant-calling): a joint call is performed from all project's bam files

 

Parallely, statistics collected during these steps are used to generate reports for [Quality Control](docs/diva_workflow.md#quality-control).

A complete view of the analysis workflow is provided by the pipeline's [graph](images/diva-wgs.png).

### Pipeline Handbook

**DiVA WGS** pipeline documentation can be found in the `docs/` directory:

1. [Pipeline Structure:](https://github.com/solida-core/docs/blob/master/pages/handbook/pipeline_struct.md)

    * [Snakefile](https://github.com/solida-core/docs/blob/master/pages/handbook/pipeline_struct.md#snakefile)

    * [Configfile](https://github.com/solida-core/docs/blob/master/pages/handbook/pipeline_struct.md#configfile)

    * [Rules](https://github.com/solida-core/docs/blob/master/pages/handbook/pipeline_struct.md#rules)

    * [Envs](https://github.com/solida-core/docs/blob/master/pages/handbook/pipeline_struct.md#envs)

2. [Pipeline Workflow](docs/diva_workflow.md)

3. Required Files:

    * [Reference files](docs/reference_files.md)

    * [User files](docs/user_files.md)

4. Running the pipeline:

    * [Manual Snakemake Usage](docs/diva_snakemake.md)

    * SOLIDA:

        * [CLI - Command Line Interface](https://github.com/solida-core/docs/blob/master/pages/solida/solida_cli.md)

        * [GUI - Graphical User Interface](https://github.com/solida-core/docs/blob/master/pages/solida/solida_gui.md)

### Contact us

[support@solida-core](mailto:m.massidda@crs4.it)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/solida-core/diva.wgs

Awesome Lists containing this project

README