https://github.com/solida-core/diva.wgs
- Host: GitHub
- URL: https://github.com/solida-core/diva.wgs
- Owner: solida-core
- License: gpl-3.0
- Created: 2019-02-21T15:16:08.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2021-01-29T08:01:33.000Z (over 5 years ago)
- Last Synced: 2024-12-27T20:46:12.672Z (over 1 year ago)
- Language: Python
- Size: 76.2 KB
- Stars: 0
- Watchers: 1
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# DiVA WGS
**DiVA WGS** is a pipeline for Next-Generation Sequencing **Whole Genome** data analysis.
All **[solida-core](https://github.com/solida-core)** workflows follow GATK Best Practices for Germline Variant Discovery, with further improvements and refinements introduced after testing on real data from various [CRS4 Next Generation Sequencing Core Facility](http://next.crs4.it) research sequencing projects.
Pipelines are based on [Snakemake](https://snakemake.readthedocs.io/en/stable/), a workflow management system that provides all the features needed to create reproducible and scalable data analyses.
Software dependencies are specified in the `environment.yaml` file and managed directly by Snakemake through [Conda](https://docs.conda.io/en/latest/miniconda.html), which keeps the workflow reproducible across a wide range of computing environments such as workstations, clusters and cloud environments.
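As a rough illustration of what Conda-managed execution looks like, the sketch below assembles a Snakemake command line with the `--use-conda` flag. The file names `Snakefile` and `config.yaml`, the core count, and the helper `build_snakemake_command` are assumptions made for the example, not values taken from this repository; the supported invocation is described in `docs/diva_snakemake.md`.

```python
import shlex


def build_snakemake_command(snakefile="Snakefile", configfile="config.yaml",
                            cores=4, dry_run=True):
    """Assemble a Snakemake invocation that lets Snakemake build Conda envs.

    All default values here are hypothetical placeholders.
    """
    cmd = [
        "snakemake",
        "--snakefile", snakefile,    # workflow definition
        "--configfile", configfile,  # run-specific configuration
        "--use-conda",               # create rule environments with Conda
        "--cores", str(cores),       # upper bound on cores used locally
    ]
    if dry_run:
        # Only list the jobs that would run, without executing them.
        cmd.append("--dry-run")
    return cmd


command = build_snakemake_command()
print(shlex.join(command))
```

Starting with a dry run is a cheap way to check that the configuration resolves to the expected jobs before any environment is created or any data is processed.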
### Pipeline Overview
The pipeline workflow is composed of two major analysis sections:
* [_Mapping_](docs/diva_workflow.md#mapping): single and/or paired-end reads in FASTQ format are aligned against a reference genome to produce a deduplicated and recalibrated BAM file. This section is executed by the DiMA pipeline.
* [_Variant Calling_](docs/diva_workflow.md#variant-calling): joint variant calling is performed across all of the project's BAM files.
In parallel, statistics collected during these steps are used to generate reports for [Quality Control](docs/diva_workflow.md#quality-control).
A complete view of the analysis workflow is provided by the pipeline's [graph](images/diva-wgs.png).
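The shape of the two sections above (one mapping result per sample, then a single joint call over all of them) can be sketched in a few lines of Python. The directory layout, file suffixes and the function `plan_outputs` are hypothetical and chosen only to illustrate the fan-in; they do not reflect the pipeline's actual output names.

```python
from pathlib import Path


def plan_outputs(samples, outdir="results"):
    """Return a hypothetical output plan: one BAM per sample, one joint VCF."""
    # Mapping: each sample yields its own deduplicated, recalibrated BAM.
    bams = {
        sample: Path(outdir) / "mapping" / f"{sample}.dedup.recal.bam"
        for sample in samples
    }
    # Variant calling: a single joint call consumes every BAM in the project.
    joint_call = {
        "inputs": sorted(bams.values()),
        "output": Path(outdir) / "variant_calling" / "all_samples.vcf.gz",
    }
    return {"per_sample_bams": bams, "joint_call": joint_call}


plan = plan_outputs(["sample_B", "sample_A"])
for bam in plan["joint_call"]["inputs"]:
    print(bam.as_posix())
print(plan["joint_call"]["output"].as_posix())
```

The point of the sketch is the dependency structure: adding a sample adds one mapping output and one more input to the same joint call, rather than a second call.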
### Pipeline Handbook
**DiVA WGS** pipeline documentation can be found in the `docs/` directory:
1. [Pipeline Structure](https://github.com/solida-core/docs/blob/master/pages/handbook/pipeline_struct.md):
    * [Snakefile](https://github.com/solida-core/docs/blob/master/pages/handbook/pipeline_struct.md#snakefile)
    * [Configfile](https://github.com/solida-core/docs/blob/master/pages/handbook/pipeline_struct.md#configfile)
    * [Rules](https://github.com/solida-core/docs/blob/master/pages/handbook/pipeline_struct.md#rules)
    * [Envs](https://github.com/solida-core/docs/blob/master/pages/handbook/pipeline_struct.md#envs)
2. [Pipeline Workflow](docs/diva_workflow.md)
3. Required Files:
    * [Reference files](docs/reference_files.md)
    * [User files](docs/user_files.md)
4. Running the pipeline:
    * [Manual Snakemake Usage](docs/diva_snakemake.md)
    * SOLIDA:
        * [CLI - Command Line Interface](https://github.com/solida-core/docs/blob/master/pages/solida/solida_cli.md)
        * [GUI - Graphical User Interface](https://github.com/solida-core/docs/blob/master/pages/solida/solida_gui.md)
### Contact us
[support@solida-core](mailto:m.massidda@crs4.it)