{"id":18448821,"url":"https://github.com/sequana/fastqc","last_synced_at":"2025-04-08T01:32:38.554Z","repository":{"id":43841435,"uuid":"223491618","full_name":"sequana/fastqc","owner":"sequana","description":"sequana pipeline to perform parallel fastqc and summarize results with multiqc plot","archived":false,"fork":false,"pushed_at":"2025-01-08T21:30:26.000Z","size":2164,"stargazers_count":4,"open_issues_count":2,"forks_count":3,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-21T15:19:04.717Z","etag":null,"topics":["fastqc","ngs","sequana","snakemake"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sequana.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-11-22T21:45:07.000Z","updated_at":"2025-01-16T03:00:46.000Z","dependencies_parsed_at":"2024-04-18T16:55:16.615Z","dependency_job_id":"adcc2e92-db08-4b48-9e43-e5c25d604ee5","html_url":"https://github.com/sequana/fastqc","commit_stats":{"total_commits":153,"total_committers":3,"mean_commits":51.0,"dds":0.1633986928104575,"last_synced_commit":"19e60bbe907387be500507e630b7f014d2a7b4de"},"previous_names":[],"tags_count":15,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Ffastqc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Ffastqc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sequana%2Ffastqc/releases","manifests_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/repositories/sequana%2Ffastqc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sequana","download_url":"https://codeload.github.com/sequana/fastqc/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247760628,"owners_count":20991517,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fastqc","ngs","sequana","snakemake"],"created_at":"2024-11-06T07:17:23.463Z","updated_at":"2025-04-08T01:32:38.209Z","avatar_url":"https://github.com/sequana.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n.. image:: https://badge.fury.io/py/sequana-fastqc.svg\n     :target: https://pypi.python.org/pypi/sequana_fastqc\n\n.. image:: http://joss.theoj.org/papers/10.21105/joss.00352/status.svg\n    :target: http://joss.theoj.org/papers/10.21105/joss.00352\n    :alt: JOSS (journal of open source software) DOI\n\n.. image:: https://github.com/sequana/fastqc/actions/workflows/main.yml/badge.svg\n   :target: https://github.com/sequana/fastqc/actions/workflows/main.yml\n\n\n.. 
image:: https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C3.10-blue.svg\n    :target: https://pypi.python.org/pypi/sequana\n    :alt: Python 3.8 | 3.9 | 3.10\n\nThis is the **fastqc** pipeline from the `Sequana \u003chttps://sequana.readthedocs.org\u003e`_ project.\n\n:Overview: Runs fastqc and multiqc on a set of sequencing data to produce quality control reports\n:Input: A set of FastQ files (paired or single-end), compressed or not\n:Output: An HTML file summary.html (individual fastqc reports, multi-sample report)\n:Status: Production\n:Wiki: https://github.com/sequana/fastqc/wiki\n:Documentation: This README file, the Wiki from the github repository (link above) and https://sequana.readthedocs.io\n:Citation: Cokelaer et al, (2017), 'Sequana': a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352, JOSS DOI https://doi.org/10.21105/joss.00352\n\n\nInstallation\n~~~~~~~~~~~~\n\nsequana_fastqc is based on Python3; install the package as follows::\n\n    pip install sequana_fastqc --upgrade\n\nYou will need third-party software such as fastqc. Please see below for details.\n\nUsage\n~~~~~\n\nIf you have a set of FastQ files in a data/ directory, type::\n\n    sequana_fastqc --input-directory data\n\nTo learn more about the options (e.g., add a different pattern to restrict the\nexecution to a subset of the input files, change the output/working directory,\netc)::\n\n    sequana_fastqc --help\n\nThe call to sequana_fastqc creates a directory **fastqc**. Then go to this \nworking directory and execute the pipeline as follows::\n\n    cd fastqc\n    sh fastqc.sh  # for a local run\n\nThis launches a snakemake pipeline. 
If you are familiar with snakemake, you can retrieve the fastqc.rules and config.yaml files and then execute the pipeline yourself with specific parameters::\n\n    snakemake -s fastqc.rules --cores 4 --stats stats.txt\n\nOr use the `sequanix \u003chttps://sequana.readthedocs.io/en/master/sequanix.html\u003e`_ interface.\n\nPlease see the `Wiki \u003chttps://github.com/sequana/fastqc/wiki\u003e`_ for more examples and features.\n\nTutorial\n~~~~~~~~\n\nYou can retrieve test data from sequana_fastqc (https://github.com/sequana/fastqc) or type::\n\n    wget https://raw.githubusercontent.com/sequana/fastqc/master/sequana_pipelines/fastqc/data/data_R1_001.fastq.gz\n    wget https://raw.githubusercontent.com/sequana/fastqc/master/sequana_pipelines/fastqc/data/data_R2_001.fastq.gz\n\nThen prepare and run the pipeline::\n\n    sequana_fastqc --input-directory .\n    cd fastqc\n    sh fastqc.sh\n\n    # once done, remove temporary files (snakemake and others)\n    make clean\n\nThen open the HTML entry point called summary.html. A multiqc report is also available. \nYou should get images such as the following one:\n\n.. image:: https://github.com/sequana/fastqc/blob/main/doc/summary.png?raw=true\n\nPlease see the `Wiki \u003chttps://github.com/sequana/fastqc/wiki\u003e`_ for more examples and features.\n\nRequirements\n~~~~~~~~~~~~\n\nThis pipeline requires the following executable(s):\n\n- fastqc\n- falco (optional)\n\n\nFor Linux users, we provide apptainer/singularity images available through the **damona** project (https://damona.readthedocs.io). \n\nTo make use of them, initialise the pipeline with the --use-apptainer option and everything should be downloaded\nautomatically for you, which also guarantees reproducibility::\n\n    sequana_fastqc --input-directory data --use-apptainer --apptainer-prefix ~/images\n\n\n.. 
image:: https://raw.githubusercontent.com/sequana/fastqc/main/sequana_pipelines/fastqc/dag.png\n\n\nDetails\n~~~~~~~\n\nThis pipeline runs fastqc in parallel on the input fastq files (paired or not)\nand then executes multiqc. A brief sequana summary report is also produced.\n\nYou may use falco instead of fastqc. This is experimental but seems to work for\nIllumina/FastQ files.\n\nThis pipeline has been tested on several hundred MiSeq, NextSeq, MiniSeq,\nISeq100 and Pacbio runs.\n\nIt produces an md5sum of your data, copes with empty samples, produces\nready-to-use HTML reports, etc.\n\n\nRules and configuration details\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nHere is the `latest documented configuration file \u003chttps://raw.githubusercontent.com/sequana/fastqc/main/sequana_pipelines/fastqc/config.yaml\u003e`_\nto be used with the pipeline. Each rule used in the pipeline may have a section in the configuration file. \n\nChangelog\n~~~~~~~~~\n========= ====================================================================\nVersion   Description\n========= ====================================================================\n1.8.2     * Fix the onerror typo in the pipeline, fix CI.\n1.8.1     * update __init__ (version)\n1.8.0     * uses pyproject instead of setuptools\n          * uses click instead of argparse and newest sequana_pipetools \n            (0.16.0)\n1.7.1     * Set wrapper version in the config based on new sequana_pipetools\n            feature\n1.7.0     * Use new rulegraph wrapper and new graphviz apptainer\n1.6.2     * slight refactorisation to use rulegraph wrapper\n1.6.1     * pin sequana version to 1.4.4 to force usage of new fastqc module\n            to fix falco. 
Updated config documentation.\n1.6.0     * Fixed falco output error and use singularity containers\n1.5.0     * removed modules completely.\n1.4.2     * simplified pipeline (suppress setup and use existing wrapper)\n1.4.1     * simplified pipeline with wrappers/rules\n1.4.0     * This version uses sequana 0.12.0 and new sequana-wrappers \n            mechanism. Functionality is unchanged. Also based on\n            sequana_pipetools 0.6.X\n1.3.0     * add option --skip-multiqc (in case of memory trouble)\n          * Fix typo in the link towards fastqc reports in the summary.html\n            table\n          * Fix number of samples in the paired case (divide by 2)\n1.2.0     * compatibility with Sequanix\n          * Fix pipeline to cope with new snakemake API\n1.1.0     * add new rule to allow users to choose falco software instead of\n            fastqc. Note that falco is 4 times faster but still a work in\n            progress (version 0.1 as of Nov 2020).\n          * allows the pipeline to process pacbio files (in fact any files\n            accepted by fastqc i.e. SAM and BAM files)\n          * More doc, test and info on the wiki\n1.0.1     * add md5sum of input files as md5.txt file\n1.0.0     * a stable version. 
Added a wiki on github as well as \n            singularity recipes\n0.9.15    * For the HTML reports, takes into account samples with zero reads\n0.9.14    * round up some statistics in the main table \n0.9.13    * improve the summary HTML report\n0.9.12    * implemented new --from-project option\n0.9.11    * now depends on sequana_pipetools instead of sequana.pipelines to \n            speed up --help calls\n          * new summary.html report created with pipeline summary\n          * new rule (plotting)\n0.9.10    * simplify the onsuccess section\n0.9.9     * add missing png and pipeline (regression bug)\n0.9.8     * add missing multi_config file\n0.9.7     * check existence of input directory in main.py\n          * add a logo \n          * fix schema\n          * add multiqc_config\n          * add sequana + sequana_fastqc version\n0.9.6     * add the readtag option\n========= ====================================================================\n\n\nContribute \u0026 Code of Conduct\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nTo contribute to this project, please take a look at the \n`Contributing Guidelines \u003chttps://github.com/sequana/sequana/blob/master/CONTRIBUTING.rst\u003e`_ first. Please note that this project is released with a \n`Code of Conduct \u003chttps://github.com/sequana/sequana/blob/master/CONDUCT.md\u003e`_. By contributing to this project, you agree to abide by its terms.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsequana%2Ffastqc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsequana%2Ffastqc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsequana%2Ffastqc/lists"}