{"id":18448873,"url":"https://github.com/sequana/pacbio_qc","last_synced_at":"2026-04-17T23:10:05.917Z","repository":{"id":115174834,"uuid":"230936410","full_name":"sequana/pacbio_qc","owner":"sequana","description":"QC on pacbio data ","archived":false,"fork":false,"pushed_at":"2023-07-07T13:12:18.000Z","size":9936,"stargazers_count":1,"open_issues_count":1,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-12-25T02:42:12.264Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sequana.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-12-30T15:15:15.000Z","updated_at":"2023-03-29T13:18:47.000Z","dependencies_parsed_at":null,"dependency_job_id":"0cf025b2-fd30-4a39-a2f4-58d9b528d9b1","html_url":"https://github.com/sequana/pacbio_qc","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Fpacbio_qc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Fpacbio_qc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Fpacbio_qc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Fpacbio_qc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sequana","download_url":"https://codeload.github.com/sequana/pacbio_qc/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":2391327
04,"owners_count":19587107,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T07:17:31.923Z","updated_at":"2026-04-17T23:10:05.911Z","avatar_url":"https://github.com/sequana.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\n.. image:: https://badge.fury.io/py/sequana-pacbio-qc.svg\n     :target: https://pypi.python.org/pypi/sequana_pacbio_qc\n\n.. image:: http://joss.theoj.org/papers/10.21105/joss.00352/status.svg\n    :target: http://joss.theoj.org/papers/10.21105/joss.00352\n    :alt: JOSS (journal of open source software) DOI\n\n.. image:: https://github.com/sequana/pacbio_qc/actions/workflows/main.yml/badge.svg\n   :target: https://github.com/sequana/pacbio_qc/actions/workflows    \n\n.. image:: https://img.shields.io/badge/python-3.11%20%7C%203.12-blue.svg\n    :target: https://pypi.python.org/pypi/sequana_pacbio_qc\n    :alt: Python 3.11 | 3.12\n\n\nThis is the **pacbio_qc** pipeline from the `Sequana \u003chttps://sequana.readthedocs.org\u003e`_ project\n\n:Overview: Quality control and analysis for PacBio long-read sequencing data (BAM files). 
Generates comprehensive statistics on read quality, length distribution, and GC content, with optional taxonomic classification.\n\n:Input: BAM files from PacBio sequencers (raw subreads, CCS, or processed reads)\n:Output: Per-sample HTML reports with interactive visualizations, quality metrics, and optional taxonomic classification; comprehensive summary report with all samples\n:Status: production\n:Documentation: This README file, the Wiki from the GitHub repository (link above) and https://sequana.readthedocs.io\n:Citation: Cokelaer et al. (2017), ‘Sequana’: a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352, doi:10.21105/joss.00352\n\n\nInstallation\n~~~~~~~~~~~~\n\nInstall via pip::\n\n    pip install sequana_pacbio_qc\n\n**Optional dependencies:**\n\n- **kraken2**: For taxonomic classification (optional, disabled by default)\n- **graphviz**: For DAG visualization\n- **apptainer**: For containerized execution of tools\n\n\nQuick Start\n~~~~~~~~~~~\n\n::\n\n    # Display help\n    sequana_pacbio_qc --help\n\n    # Create pipeline in current directory\n    sequana_pacbio_qc --input-directory /path/to/bam/files\n\n    # With optional Kraken taxonomy\n    sequana_pacbio_qc --input-directory /path/to/bam/files --do-kraken --kraken-databases /path/to/kraken/db\n\n    # Using apptainer containers\n    sequana_pacbio_qc --input-directory /path/to/bam/files --apptainer-prefix ~/containers\n\nThis creates a ``pacbio_qc`` directory containing the pipeline and configuration files.\n\n\nExecution\n~~~~~~~~~\n\nExecute the pipeline::\n\n    cd pacbio_qc\n    bash pacbio_qc.sh\n\nOr with custom Snakemake parameters::\n\n    snakemake -s pacbio_qc.rules --configfile config.yaml --cores 4 --stats stats.txt\n\nOr use the `sequanix \u003chttps://sequana.readthedocs.io/en/master/sequanix.html\u003e`_ graphical interface.\n\n\nConfiguration\n~~~~~~~~~~~~~\n\nThe pipeline uses ``config.yaml`` to control:\n\n- **Input data**: BAM file directory and pattern 
matching\n- **Kraken**: Optional taxonomic database paths (disabled by default)\n- **MultiQC**: QC report options\n- **Apptainer**: Container image URLs (optional)\n\nPipeline Overview\n~~~~~~~~~~~~~~~~~~\n\n.. image:: https://raw.githubusercontent.com/sequana/pacbio_qc/master/sequana_pipelines/pacbio_qc/dag.png\n\n\nWorkflow Details\n~~~~~~~~~~~~~~~~\n\nThe pipeline performs the following analyses on PacBio BAM files:\n\n1. **Quality Metrics**: Computes read length statistics, GC content distribution, and signal-to-noise ratios\n2. **Visualizations**: Generates histograms and scatter plots for quality assessment\n3. **Per-Sample Reports**: Creates individual HTML reports for each sample with:\n\n   - Read length distribution histograms\n   - GC content analysis\n   - SNR (signal-to-noise ratio) metrics\n   - Quality overview with sample statistics\n\n4. **Taxonomy (Optional)**: Performs taxonomic classification using Kraken2 when enabled\n5. **Summary Report**: Generates a comprehensive HTML summary with:\n\n   - Overview of pipeline and all samples\n   - Summary statistics table with links to per-sample reports\n   - MultiQC aggregated quality metrics\n\n**Note:** Kraken2 databases are not provided with the pipeline. 
This step is optional and disabled by default.\n\n\nChangelog\n~~~~~~~~~\n========= ====================================================================\nVersion   Description\n========= ====================================================================\n1.0.1     HTML reports with pipeline overview; race condition handling for\n          parallel execution with --apptainer-prefix; improved CI/CD workflows\n1.0.0     Uses latest wrappers and graphviz apptainers\n0.11.0    Release using the latest sequana_pipetools framework\n0.10.0    Update to use latest tools from sequana framework\n0.9.0     First release of sequana_pacbio_qc using latest sequana rules and\n          modules (0.9.5)\n========= ====================================================================\n\n\nContribute \u0026 Code of Conduct\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nTo contribute to this project, please take a look at the \n`Contributing Guidelines \u003chttps://github.com/sequana/sequana/blob/main/CONTRIBUTING.rst\u003e`_ first. Please note that this project is released with a \n`Code of Conduct \u003chttps://github.com/sequana/sequana/blob/main/CONDUCT.md\u003e`_. By contributing to this project, you agree to abide by its terms.\n\n\nRules and configuration details\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nHere is the `latest documented configuration file \u003chttps://raw.githubusercontent.com/sequana/sequana_pacbio_qc/main/sequana_pipelines/pacbio_qc/config.yaml\u003e`_\nto be used with the pipeline. Each rule used in the pipeline may have a section in the configuration file. \n\n\n\n.. 
|Codacy-Grade| image:: https://app.codacy.com/project/badge/Grade/9b8355ff642f4de9acd4b270f8d14d10\n   :target: https://www.codacy.com/gh/sequana/pacbio_qc/dashboard\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsequana%2Fpacbio_qc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsequana%2Fpacbio_qc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsequana%2Fpacbio_qc/lists"}