{"id":18448827,"url":"https://github.com/sequana/mapper","last_synced_at":"2026-04-01T20:21:50.448Z","repository":{"id":40645809,"uuid":"238414913","full_name":"sequana/mapper","owner":"sequana","description":"Pipeline to map a set of FastQ files","archived":false,"fork":false,"pushed_at":"2024-10-19T19:00:44.000Z","size":857,"stargazers_count":5,"open_issues_count":2,"forks_count":2,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-10-19T23:22:18.915Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sequana.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-02-05T09:41:04.000Z","updated_at":"2024-10-19T19:00:48.000Z","dependencies_parsed_at":"2023-11-28T10:27:52.817Z","dependency_job_id":"449d2785-3cca-4525-b905-87450c80dc37","html_url":"https://github.com/sequana/mapper","commit_stats":{"total_commits":76,"total_committers":2,"mean_commits":38.0,"dds":0.07894736842105265,"last_synced_commit":"55c315b0ffbcb5251b7fb04c1e20fe3c42617eb0"},"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Fmapper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Fmapper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Fmapper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Fmapper/manifests","owner_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sequana","download_url":"https://codeload.github.com/sequana/mapper/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223297200,"owners_count":17122020,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T07:17:24.579Z","updated_at":"2026-04-01T20:21:50.440Z","avatar_url":"https://github.com/sequana.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n.. image:: https://badge.fury.io/py/sequana-mapper.svg\n     :target: https://pypi.python.org/pypi/sequana_mapper\n\n.. image:: https://github.com/sequana/mapper/actions/workflows/main.yml/badge.svg\n   :target: https://github.com/sequana/mapper/actions/\n\n.. image:: https://img.shields.io/badge/python-3.9%20%7C%203.10%20%7C%203.11%20%7C%20-blue.svg\n    :target: https://pypi.python.org/pypi/sequana\n    :alt: Python 3.9 | 3.10 | 3.11\n\n.. 
image:: http://joss.theoj.org/papers/10.21105/joss.00352/status.svg\n   :target: http://joss.theoj.org/papers/10.21105/joss.00352\n   :alt: JOSS (journal of open source software) DOI\n\nThis is the **mapper** pipeline from the `Sequana \u003chttps://sequana.readthedocs.org\u003e`_ project\n\n:Overview: This is a simple pipeline to map several FastQ files onto a reference using different mappers/aligners\n:Input: A set of FastQ files (illumina, pacbio, etc).\n:Output: A set of BAM files (and/or bigwig) and HTML report\n:Status: Production\n:Documentation: This README file, and https://sequana.readthedocs.io\n:Citation: Cokelaer et al, (2017), 'Sequana': a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352, JOSS DOI https://doi.org/10.21105/joss.00352\n\nInstallation\n~~~~~~~~~~~~\n\nIf you already have all requirements, you can install the packages using pip::\n\n    pip install sequana_mapper --upgrade\n\nYou will need third-party software such as fastqc. Please see below for details.\n\nUsage\n~~~~~\n\nScan FastQ files in a directory and set up the pipeline (replace ``DATAPATH`` and ``genome.fa`` with your inputs)::\n\n    sequana_mapper --input-directory DATAPATH --reference-file genome.fa --aligner-choice bwa\n    sequana_mapper --input-directory DATAPATH --reference-file genome.fa --aligner-choice bwa --do-coverage\n    sequana_mapper --input-directory DATAPATH --reference-file genome.fa --aligner-choice bwa --create-bigwig\n\nFor long-read data, use the dedicated presets::\n\n    sequana_mapper --input-directory DATAPATH --reference-file genome.fa --pacbio     # sets minimap2 -x map-pb\n    sequana_mapper --input-directory DATAPATH --reference-file genome.fa --nanopore   # sets minimap2 -x map-ont\n\nFor capture-seq projects (feature counting)::\n\n    sequana_mapper --input-directory DATAPATH --reference-file genome.fa --capture-annotation-file targets.saf\n\nThis creates a ``mapper/`` directory with the pipeline and configuration file. 
Execute the pipeline locally::\n\n    cd mapper\n    sh mapper.sh\n\nSee ``.sequana/profile/config.yaml`` to tune Snakemake behaviour (cores, cluster settings, etc.).\n\nUsage with apptainer\n~~~~~~~~~~~~~~~~~~~~~\n\nWith apptainer, initiate the working directory as follows::\n\n    sequana_mapper --input-directory DATAPATH --reference-file genome.fa --use-apptainer\n\nImages are downloaded in the working directory but you can store them in a shared location::\n\n    sequana_mapper --input-directory DATAPATH --reference-file genome.fa --use-apptainer --apptainer-prefix ~/.sequana/apptainers\n\nand then::\n\n    cd mapper\n    sh mapper.sh\n\n\nRequirements\n~~~~~~~~~~~~\n\nThis pipeline requires the following executables (install via bioconda/conda):\n\n- **bwa** — short-read aligner (default)\n- **minimap2** — long-read aligner (PacBio / Nanopore)\n- **bowtie2** — alternative short-read aligner\n- **samtools** / **sambamba** — BAM processing\n- **bamtools** — BAM statistics\n- **deeptools** — bigwig generation (``bamCoverage``)\n- **bedtools** — genome arithmetic\n- **subread** — feature counting (``featureCounts``, capture-seq only)\n- **mosdepth** — fast coverage depth\n- **seqkit** — FASTQ statistics\n- **multiqc** — aggregated HTML report\n- **sequana_coverage** — coverage analysis (prokaryotes)\n\nInstall all dependencies at once::\n\n    mamba env create -f environment.yml\n\n.. 
image:: https://raw.githubusercontent.com/sequana/mapper/main/sequana_pipelines/mapper/dag.png\n\n\nDetails\n~~~~~~~~~\n\nThis pipeline maps FastQ files (paired or single-end) in parallel onto a reference genome and produces\nfiltered BAM files, a MultiQC HTML report, and optionally coverage tracks and feature counts.\n\n**Aligner choice** (``--aligner-choice``):\n\n- ``bwa`` (default) — BWA-MEM; index algorithm is auto-selected (``is`` or ``bwtsw``) based on reference size\n- ``bwa_split`` — experimental; splits large FastQs into 1 M-read chunks for parallel BWA jobs, then merges\n- ``minimap2`` — long-read aligner; use ``--pacbio`` (sets ``-x map-pb``) or ``--nanopore`` (sets ``-x map-ont``)\n- ``bowtie2`` — standard short-read aligner\n\n**BAM filtering**: unmapped reads are removed to minimise file size. Statistics reported by MultiQC\n(in ``{sample}/bamtools_stats/``) still include both mapped and unmapped read counts.\n\n**Optional outputs**:\n\n- ``--do-coverage`` — runs ``sequana_coverage`` for depth-of-coverage analysis (prokaryotes)\n- ``--create-bigwig`` — generates bigwig files via ``bamCoverage`` (deeptools)\n- ``--capture-annotation-file`` — enables ``featureCounts`` for capture-seq efficiency metrics\n\n\n\nRules and configuration details\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nHere is the `latest documented configuration file \u003chttps://raw.githubusercontent.com/sequana/mapper/main/sequana_pipelines/mapper/config.yaml\u003e`_\nto be used with the pipeline. 
Each rule used in the pipeline may have a section in the configuration file.\n\n\nChangelog\n~~~~~~~~~\n\n========= ======================================================================\nVersion   Description\n========= ======================================================================\n1.4.1     * update to use wrappers.shells and wrappers.snippets\n            drop wrappers usage\n1.4.0     * update wrappers to v24.8.29\n          * update sequana_pipetools requirement to \u003e=1.5\n1.3.1     * remove temp on BWA BAM file (more practical to keep them)\n1.3.0     * uses new sequana_coverage wrapper\n1.2.1     * fix bwa_split bwa aggregate stage (bug fix)\n1.2.0     * Implement a bwa_split method to speed up mapping of very large\n            fastq files.\n1.1.0     * BAM files are now filtered to remove unmapped reads\n          * set wrappers branch in config file and update pipeline.\n          * refactorise to use click and new sequana-pipetools\n1.0.0     * Use latest sequana-wrappers and graphviz apptainer\n0.12.0    * Use latest pipetools and add singularity containers\n0.11.1    * Fix typo when setting coverage to True and allow untagged filenames\n0.11.0    * implement feature counts for capture-seq projects\n0.10.1    * remove getlogdir and getname\n0.10.0    * use new wrappers framework\n0.9.0     * fix issue with logger and increments requirements\n          * add new option --pacbio to automatically set the options for\n            pacbio data (-x map-pb and readtag set to None)\n0.8.13    * add the thread option in minimap2 case\n0.8.12    * factorise multiqc rule\n0.8.11    * Implement the --from-project option and new framework\n          * custom HTML report\n0.8.10    * change samtools_depth rule and switched to bam2cov to cope with null\n            coverage\n0.8.9     * fix requirements\n0.8.8     * fix pipeline rule for bigwig + renamed output_bigwig into\n            create_bigwig; fix the multiqc config file\n0.8.7     * fix config 
file creation (for bigwig)\n0.8.6     * added bowtie2 mapper + bigwig as output, make coverage optional\n0.8.5     * create a sym link to the HTML report. Better post cleaning.\n0.8.4     * Fixing multiqc (synchronized with sequana updates)\n0.8.3     * add sequana_coverage rule.\n0.8.2     * add minimap2 mapper\n0.8.1     * fix bamtools stats rule to have different output name for multiqc\n0.8.0     **First release.**\n========= ======================================================================\n\n\nContribute \u0026 Code of Conduct\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nTo contribute to this project, please take a look at the\n`Contributing Guidelines \u003chttps://github.com/sequana/sequana/blob/main/CONTRIBUTING.rst\u003e`_ first. Please note that this project is released with a\n`Code of Conduct \u003chttps://github.com/sequana/sequana/blob/main/CONDUCT.md\u003e`_. By contributing to this project, you agree to abide by its terms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsequana%2Fmapper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsequana%2Fmapper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsequana%2Fmapper/lists"}