{"id":13688681,"url":"https://github.com/ewels/clusterflow","last_synced_at":"2025-05-01T20:30:22.980Z","repository":{"id":17086279,"uuid":"19851423","full_name":"ewels/clusterflow","owner":"ewels","description":"A pipelining tool to automate and standardise bioinformatics analyses on cluster environments.","archived":true,"fork":false,"pushed_at":"2023-04-09T07:07:21.000Z","size":6831,"stargazers_count":97,"open_issues_count":11,"forks_count":27,"subscribers_count":14,"default_branch":"master","last_synced_at":"2024-08-03T15:11:47.572Z","etag":null,"topics":["bioinfomatics-pipeline","bioinformatics","clusterflow","perl","pipeline"],"latest_commit_sha":null,"homepage":"https://ewels.github.io/clusterflow/","language":"Perl","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ewels.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"license.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2014-05-16T09:33:05.000Z","updated_at":"2024-01-22T03:20:57.000Z","dependencies_parsed_at":"2022-07-13T13:51:41.313Z","dependency_job_id":"5e5ab62b-9ccd-4bc7-a90c-c877accffd8d","html_url":"https://github.com/ewels/clusterflow","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ewels%2Fclusterflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ewels%2Fclusterflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ewels%2Fclusterflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ewels%2Fclusterflow/manifests","owner_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/owners/ewels","download_url":"https://codeload.github.com/ewels/clusterflow/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224274647,"owners_count":17284620,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinfomatics-pipeline","bioinformatics","clusterflow","perl","pipeline"],"created_at":"2024-08-02T15:01:19.914Z","updated_at":"2025-05-01T20:30:22.973Z","avatar_url":"https://github.com/ewels.png","language":"Perl","funding_links":[],"categories":["Perl"],"sub_categories":[],"readme":"# \u003cimg src=\"docs/assets/Cluster_Flow_logo.png\" width=\"400\" title=\"Cluster Flow\"\u003e\n\n### A user-friendly bioinformatics workflow tool\n\n---\n\n# Cluster Flow is now archived\n\n_This project is no longer under active maintenance. You're welcome to use it, but no updates or bug fixes will be posted. 
We recommend using [Nextflow](https://nextflow.io/) together with [nf-core](https://nf-co.re/) instead._\n\n_Many thanks to everyone who used and supported Cluster Flow over the years._\n\n---\n\n\n[![Build Status](https://img.shields.io/travis/ewels/clusterflow.svg?style=flat-square)](https://travis-ci.org/ewels/clusterflow)\n[![Gitter](https://img.shields.io/badge/gitter-%20join%20chat%20%E2%86%92-4fb99a.svg?style=flat-square)](https://gitter.im/ewels/clusterflow)\n[![DOI](https://img.shields.io/badge/DOI-10.12688%2Ff1000research.10335.2-lightgrey.svg?style=flat-square)](http://dx.doi.org/10.12688/f1000research.10335.2)\n\n**Find Cluster Flow documentation with information and examples at\n[https://ewels.github.io/clusterflow/](https://ewels.github.io/clusterflow/)**\n\n---\n\nCluster Flow is a pipelining tool to automate and standardise\nbioinformatics analyses on high-performance cluster environments.\nIt is designed to be easy to use, quick to set up and flexible to configure.\n\nCluster Flow is written in Perl and works by launching jobs to a cluster\n(can also be run locally). Each job is a stand-alone Perl executable wrapper\naround a bioinformatics tool of interest.\n\nModules collect extensive logging information and Cluster Flow e-mails\nthe user with a summary of the pipeline commands and exit codes upon completion.\n\n## Installation\nYou can find stable versions to download on the\n[releases page](https://github.com/ewels/clusterflow/releases).\n\nYou can get the development version of the code by cloning this repository:\n```\ngit clone https://github.com/ewels/clusterflow.git\n```\n\nOnce downloaded and extracted, create a `clusterflow.config` file in the\nscript directory, based on `clusterflow.config.example`.\n\nNext, you need to add the main `cf` executable to your `PATH`. 
This can be done\nas an environment module, with a symlink to `bin` or by adding to your `~/.bashrc`\nfile.\n\nFinally, run the setup wizard (`cf --setup`) and genomes wizard (`cf --add_genome`) and\nyou're ready to go! See the [installation docs](docs/installation.md) for more\ninformation.\n\n## Usage\nPipelines are launched by naming a pipeline or module and the input files. A simple\nexample could look like this:\n```bash\ncf sra_trim *.fastq.gz\n```\n\nMost pipelines need reference genomes, and Cluster Flow has built in reference\ngenome management. Parameters can be passed to modify tool behaviour.\n\nFor example, to run the `fastq_bowtie` pipeline (FastQC, TrimGalore! and Bowtie)\nwith Human data, trimming the first 6bp of read 1, the command would be:\n\n```bash\ncf --genome GRCh37 --params \"clip_r1=6\" fastq_bowtie *.fastq.gz\n```\n\nAdditional common Cluster Flow commands are as follows:\n```bash\ncf --genomes     # List available reference genomes\ncf --pipelines   # List available pipelines\ncf --modules     # List available modules\ncf --qstat       # List running pipelines\ncf --qdel [id]   # Cancel jobs for a running pipeline\n```\n\n\n## Supported Tools\nCluster Flow comes with modules and pipelines for the following tools:\n\n| Read QC \u0026 pre-processing     | Aligners / quantifiers  | Post-alignment processing                               | Post-alignment QC                                                                                               |\n| ---------------------------- | ----------------------- | ------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- |\n| [FastQ Screen](fastqscreen)  | [Bismark](bismark)      | [bedtools](bedtools) (`bamToBed`, `intersectNeg`)       | [deepTools](deeptools) (`bamCoverage`, `bamFingerprint`)                                                        |\n| [FastQC](fastqc)             | 
[Bowtie 1](bowtie1)     | [subread featureCounts](featurecounts)                  | [MultiQC](multiqc)                                                                                              |\n| [TrimGalore!](trimgalore)    | [Bowtie 2](bowtie2)     | [HTSeq Count](htseq_count)                              | [phantompeaktools](phantompeaktools) (`runSpp`)                                                                 |\n| [SRA Toolkit](sratoolkit)    | [BWA](bwa)              | [Picard](picard) (`MarkDuplicates`)                     | [Preseq](preseq)                                                                                                |\n|                              | [HiCUP](hicup)          | [Samtools](samtools) (`bam2sam`, `dedup`, `sort_index`) | [RSeQC](rseqc) (`geneBody_coverage`, `inner_distance`, `junction_annotation`, `junction_saturation`, `read_GC`) |\n|                              | [HISAT2](hisat2)        |                                                         |                                                                                                                 |\n|                              | [Kallisto](kallisto)    |                                                         |                                                                                                                 |\n|                              | [STAR](star)            |                                                         |                                                                                                                 |\n|                              | [TopHat](tophat)        |                                                         |                                                                                                                 |\n\n## Citation\nPlease consider citing Cluster Flow if you use it in your analysis.\n\n\u003e **Cluster Flow: A user-friendly bioinformatics workflow tool [version 2; referees: 
3 approved].** \u003cbr/\u003e\n\u003e Philip Ewels, Felix Krueger, Max Käller, Simon Andrews \u003cbr/\u003e\n\u003e _F1000Research_ 2016, **5**:2824 \u003cbr/\u003e\n\u003e doi: [10.12688/f1000research.10335.2](http://dx.doi.org/10.12688/f1000research.10335.2)\n\n```\n@article{Ewels2016,\nauthor = {Ewels, Philip and Krueger, Felix and K{\\\"{a}}ller, Max and Andrews, Simon},\ntitle = {Cluster Flow: A user-friendly bioinformatics workflow tool [version 2; referees: 3 approved].},\njournal = {F1000Research},\nvolume = {5},\npages = {2824},\nyear = {2016},\ndoi = {10.12688/f1000research.10335.2},\nURL = {http://dx.doi.org/10.12688/f1000research.10335.2}\n}\n```\n\n## Contributions \u0026 Support\nContributions and suggestions for new features are welcome, as are bug reports!\nPlease create a new [issue](https://github.com/ewels/clusterflow/issues).\nCluster Flow has extensive\n[documentation](https://ewels.github.io/clusterflow/docs) describing how to write new modules\nand pipelines.\n\nThere is a chat room for the package hosted on Gitter where you can discuss\nthings with the package author and other developers:\nhttps://gitter.im/ewels/clusterflow\n\nIf in doubt, feel free to get in touch with the author directly:\n[@ewels](https://github.com/ewels) (phil.ewels@scilifelab.se)\n\n## Contributors\nProject lead and main author: [@ewels](https://github.com/ewels)\n\nCode contributions from:\n[@s-andrews](https://github.com/s-andrews),\n[@FelixKrueger](https://github.com/FelixKrueger),\n[@stu2](https://github.com/stu2),\n[@orzechoj](https://github.com/orzechoj),\n[@darogan](https://github.com/darogan)\nand others. Thanks for your support!\n\n## License\nCluster Flow is released under the GPL v3 licence. Cluster Flow is free software: you can\nredistribute it and/or modify it under the terms of the GNU General Public License as\npublished by the Free Software Foundation, either version 3 of the License, or (at your\noption) any later version. 
For more information, see the licence that comes bundled with\nCluster Flow.\n\n[bedtools]:          http://bedtools.readthedocs.io/en/latest/\n[bismark]:           http://www.bioinformatics.babraham.ac.uk/projects/bismark/\n[bowtie1]:           http://bowtie-bio.sourceforge.net/index.shtml\n[bowtie2]:           http://bowtie-bio.sourceforge.net/bowtie2/index.shtml\n[bwa]:               http://bio-bwa.sourceforge.net/\n[deeptools]:         https://deeptools.github.io/\n[fastqscreen]:       http://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/\n[fastqc]:            http://www.bioinformatics.babraham.ac.uk/projects/fastqc/\n[featurecounts]:     http://bioinf.wehi.edu.au/featureCounts/\n[hicup]:             http://www.bioinformatics.babraham.ac.uk/projects/hicup/\n[hisat2]:            http://ccb.jhu.edu/software/hisat2/index.shtml\n[htseq_count]:       http://www-huber.embl.de/HTSeq/doc/count.html\n[kallisto]:          https://pachterlab.github.io/kallisto/\n[multiqc]:           http://multiqc.info\n[phantompeaktools]:  https://code.google.com/archive/p/phantompeakqualtools/\n[picard]:            https://broadinstitute.github.io/picard/\n[preseq]:            http://smithlabresearch.org/software/preseq/\n[rseqc]:             http://rseqc.sourceforge.net/\n[samtools]:          http://www.htslib.org/\n[sratoolkit]:        https://github.com/ncbi/sra-tools\n[star]:              https://github.com/alexdobin/STAR\n[tophat]:            http://ccb.jhu.edu/software/tophat/index.shtml\n[trimgalore]:        http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fewels%2Fclusterflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fewels%2Fclusterflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fewels%2Fclusterflow/lists"}