{"id":18448846,"url":"https://github.com/sequana/denovo","last_synced_at":"2025-04-15T11:27:00.651Z","repository":{"id":90181454,"uuid":"227478685","full_name":"sequana/denovo","owner":"sequana","description":"Denovo Assembly from FASTQ files","archived":false,"fork":false,"pushed_at":"2024-03-25T14:27:16.000Z","size":420,"stargazers_count":0,"open_issues_count":2,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-02-16T13:31:06.376Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sequana.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-12-11T23:19:40.000Z","updated_at":"2022-11-18T09:29:20.000Z","dependencies_parsed_at":"2024-03-25T15:36:34.392Z","dependency_job_id":null,"html_url":"https://github.com/sequana/denovo","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Fdenovo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Fdenovo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Fdenovo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Fdenovo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sequana","download_url":"https://codeload.github.com/sequana/denovo/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://githu
b.com","kind":"github","repositories_count":249060102,"owners_count":21206277,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T07:17:26.163Z","updated_at":"2025-04-15T11:27:00.578Z","avatar_url":"https://github.com/sequana.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n.. image:: https://badge.fury.io/py/sequana-denovo.svg\n     :target: https://pypi.python.org/pypi/sequana_denovo\n\n.. image:: https://github.com/sequana/denovo/actions/workflows/main.yml/badge.svg\n   :target: https://github.com/sequana/denovo/actions/workflows/main.yml\n\n.. image:: https://coveralls.io/repos/github/sequana/denovo/badge.svg?branch=main\n    :target: https://coveralls.io/github/sequana/denovo?branch=main\n\n.. image:: https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C3.10-blue.svg\n    :target: https://pypi.python.org/pypi/sequana\n    :alt: Python 3.8 | 3.9 | 3.10\n\n.. 
image:: http://joss.theoj.org/papers/10.21105/joss.00352/status.svg\n   :target: http://joss.theoj.org/papers/10.21105/joss.00352\n   :alt: JOSS (journal of open source software) DOI\n\nThis is the **denovo** pipeline from the `Sequana \u003chttps://sequana.readthedocs.org\u003e`_ project.\n\n\n:Overview: a de-novo assembly pipeline for short-read sequencing data\n:Input: A set of FastQ files\n:Output: Fasta, VCF, HTML report\n:Status: production\n:Citation: Cokelaer et al, (2017), ‘Sequana’: a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352, JOSS DOI doi:10.21105/joss.00352\n\n\nInstallation\n~~~~~~~~~~~~\n\n**sequana_denovo** is based on Python 3. Install the package as follows::\n\n    pip install sequana --upgrade\n\nYou will need third-party software such as fastqc. Please see below for details.\n\nUsage\n~~~~~\n\nThe following command will scan all files ending in .fastq.gz found in the local\ndirectory and create a directory called denovo/ where a snakemake pipeline is\nstored. Depending on the number of files and their sizes, the\nprocess may be long::\n\n    sequana_denovo --help\n    sequana_denovo --input-directory DATAPATH\n\nThis creates a directory with the pipeline and configuration file. You will then need\nto execute the pipeline::\n\n    cd denovo\n    sh denovo.sh  # for a local run\n\nThis launches a snakemake pipeline. 
If you are familiar with snakemake, you can\nretrieve the pipeline itself and its configuration files and then execute the pipeline yourself with specific parameters::\n\n    snakemake -s denovo.smk --configfile config.yaml --cores 4 --stats stats.txt\n\nOr use the `sequanix \u003chttps://sequana.readthedocs.io/en/main/sequanix.html\u003e`_ interface.\n\nRequirements\n~~~~~~~~~~~~\n\nThis pipeline requires the following executables:\n\n- spades\n- busco\n- bwa\n- khmer: there is no executable called khmer but a set of executables (e.g. normalize-by-median.py)\n- freebayes\n- picard\n- prokka\n- quast\n- sambamba\n- samtools\n\n\n\n.. image:: https://raw.githubusercontent.com/sequana/sequana_denovo/main/sequana_pipelines/denovo/dag.png\n\n\nDetails\n~~~~~~~~~\n\n\nThis is a Snakemake *de-novo* assembly pipeline dedicated to small genomes such as bacteria.\nIt is based on `SPAdes \u003chttp://cab.spbu.ru/software/spades/\u003e`_.\nThe assembler corrects reads and then assembles them using different k-mer sizes.\nIf the correct option is set, SPAdes corrects mismatches and short INDELs in\nthe contigs using BWA.\n\nThe sequencing depth can be normalised with `khmer \u003chttps://github.com/dib-lab/khmer\u003e`_.\nDigital normalisation converts the existing high coverage regions into a Gaussian\ndistribution centered around a lower sequencing depth. For example,\ngenome regions covered at 200x will be covered at 20x after normalisation. Thus,\nsome reads from high coverage regions are discarded to reduce the quantity of data.\nAlthough the coverage is drastically reduced, the assembly will be as good as or better\nthan one built from the unnormalised data. Furthermore, SPAdes with normalised data\nis notably faster and uses less memory than without digital normalisation.\nAbove all, khmer does this with fixed, low memory and without needing any reference\nsequence.\n\nThe pipeline assesses the assembly with several tools and approaches. 
The first one\nis `Quast \u003chttp://quast.sourceforge.net/\u003e`_, a tool for genome assembly\nevaluation and comparison. It provides an HTML report with useful metrics such as\nN50, number of mismatches and so on. Furthermore, it creates a viewer of contigs\ncalled `Icarus \u003chttp://quast.sourceforge.net/icarus.html\u003e`_.\n\nThe second approach is to characterise coverage with sequana coverage and\nto detect mismatches and short INDELs with\n`Freebayes \u003chttps://github.com/ekg/freebayes\u003e`_.\n\nThe last, but not least, approach is `BUSCO \u003chttp://busco.ezlab.org/\u003e`_, which\nprovides quantitative measures for the assessment of genome assembly based on\nexpectations of gene content from near-universal single-copy orthologs selected\nfrom `OrthoDB \u003chttp://www.orthodb.org/\u003e`_.\n\n\n========= ====================================================================\nVersion   Description\n========= ====================================================================\n0.11.1    * Fix missing resources for quast/prokka/bwa_index\n0.11.0    * add checkm\n0.10.0    * use click / include multiqc apptainer\n0.9.0     * Major refactoring to include apptainers, use wrappers\n0.8.5     * add multiqc and use newest version of sequana\n0.8.4     * update pipeline to use new pipetools features\n0.8.3     * fix requirements (spades -\u003e spades.py)\n0.8.2     * fix readtag, update config to account for new coverage setup\n0.8.1\n0.8.0     **First release.**\n========= ====================================================================\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsequana%2Fdenovo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsequana%2Fdenovo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsequana%2Fdenovo/lists"}