https://github.com/juke34/biten
A Bioinformatics Nextflow Pipeline Template
- Host: GitHub
- URL: https://github.com/juke34/biten
- Owner: Juke34
- License: mit
- Created: 2024-03-15T09:46:10.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-14T13:00:32.000Z (about 1 year ago)
- Last Synced: 2025-10-09T04:11:27.421Z (8 months ago)
- Topics: bioinformatics, nextflow, pipelines, template, training
- Language: Nextflow
- Homepage:
- Size: 2.5 MB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE

# Bioinformatics Template in Nextflow
BiTeN is a short pipeline written in Nextflow that is intended to serve as a template for Nextflow pipeline development.
Nextflow is a free, open-source software project that facilitates the execution of computational workflows consisting of a series of interconnected steps/tasks. Nextflow can be used in many different ways. This repository offers one specific example of how a bioinformatician can organize their code so that it can be executed with Nextflow.
## Table of Contents
* [Foreword](#foreword)
  * [Project layout](#project-layout)
  * [Helping to develop](#helping-to-develop)
* [Installation](#installation)
  * [BiTeN](#biten)
  * [Nextflow](#nextflow)
  * [Container platform](#container-platform)
  * [Docker](#docker)
  * [Singularity](#singularity)
* [Usage](#usage)
* [Test the workflow](#test-the-workflow)
* [Parameters](#parameters)
* [Contributing](#contributing)
* [Report bugs and issues](#report-bugs-and-issues)
* [How to cite?](#how-to-cite)
* [Acknowledgement](#acknowledgement)
## Foreword
The pipeline and the whole repository (README, CONTRIBUTING, etc.) can be used as a template for Nextflow pipeline projects.
Comments in the pipeline code help the user better understand the different usages.
This pipeline template performs the following steps:
- handling parameters, file input and help (handles gzipped and uncompressed files, paired and unpaired input reads, etc.)
- QC
- Alignment
- file conversion (sam2bam)
- file sorting (samtools_sort)
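As a rough orientation, the steps above correspond to commands along the following lines for one single-end sample. This is a dry-run sketch that only prints the commands. The exact commands and options live in the module files and may differ, and the names `sample`, `index` and `qc` are hypothetical placeholders.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the kind of command each step wraps.
# Output names and options are illustrative assumptions, not copied from the modules.

print_steps() {
    reads=$1    # e.g. test/reads.fastq.gz
    genome=$2   # e.g. test/genome.fa
    sample=$3   # hypothetical output prefix

    # QC
    printf 'fastqc %s -o qc\n' "$reads"
    # Alignment: index the genome, then align the single-end reads
    printf 'bowtie2-build %s index\n' "$genome"
    printf 'bowtie2 -x index -U %s -S %s.sam\n' "$reads" "$sample"
    # File conversion (sam2bam)
    printf 'samtools view -b -o %s.bam %s.sam\n' "$sample" "$sample"
    # File sorting (samtools_sort)
    printf 'samtools sort -o %s.sorted.bam %s.bam\n' "$sample" "$sample"
}

print_steps test/reads.fastq.gz test/genome.fa sample
```

In the pipeline itself each of these commands would sit inside a Nextflow process, so that Nextflow can chain the steps and run each tool in its container.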
### Project layout
BiTeN/
├── README.md # Documentation that gives users a detailed description of the project and guidelines on how to use it.
├── LICENSE # License of your project. Licenses are important for open-source projects because they set the legal terms and conditions for using, distributing, and modifying the software.
├── CONTRIBUTING.md # Provides potential contributors with a short guide to how they can help with your project.
├── img # Folder containing images used by the README.
│
│ // FROM HERE ON, EVERYTHING RELATES TO THE NEXTFLOW PIPELINE
│
├── main.nf # The main Nextflow file used to run your pipeline. It contains the logic of your pipeline.
├── modules/ # Contains components that can be included in workflows. Think of them as functions in a programming language. Modules were introduced in DSL2 (see https://www.nextflow.io/docs/latest/module.html). Having one module file per tool is encouraged.
│ ├── bowtie2.nf # A module file containing processes (the basic processing primitive used to execute a user script, see https://www.nextflow.io/docs/latest/process.html#processes) related to the bowtie2 tool.
│ ├── fastqc.nf # A module file containing processes related to the fastqc tool.
│ ├── samtools.nf # A module file containing processes related to the samtools tool.
│ └── template.nf # A template module file.
├── subworkflows/ # Contains workflow components that can be included in other workflows, typically used by the main workflow in main.nf.
├── nextflow.config # Configuration file. Nextflow has multiple ways to handle configuration (see https://www.nextflow.io/docs/latest/config.html#configuration-file). Parameters, profiles, etc. can be defined in this file.
├── ressources/ # Contains configuration files that define the different resources, i.e. computing and tools.
│ ├── computing/ # Contains configuration files that define the computing resources loaded via profiles.
│ │ ├── hpc.config # An HPC configuration that defines computing resources on an HPC cluster (CPU, timeout, RAM per process/label, and other information such as parallelisation and scheduler).
│ │ └── local.config # A local configuration that defines computing resources on a local machine (CPU, timeout, RAM per process/label).
│ └── softwares.config # A software configuration that defines where Nextflow fetches the container for each tool.
└── test # Folder containing a test data set.
    ├── reads.fastq.gz
    └── genome.fa
### Helping to develop
**Documentation**
* [Nextflow Documentation](https://www.nextflow.io/docs/latest/index.html) The official Nextflow documentation, which is very well written. Do not hesitate to make extensive use of the search bar!
* [Nextflow Cheat Sheet](https://github.com/danrlu/nextflow_cheatsheet/blob/main/nextflow_cheatsheet.pdf) A nice Nextflow cheat sheet made by @danrlu.
* [Basic pipeline example from nextflow.io](https://www.nextflow.io/example1.html) Do not hesitate to look at the other examples.
**Training**
* [Nextflow Training from Fundamentals to Advanced](https://training.nextflow.io/latest/) The official training material from nextflow.io.
* [Nextflow workshop for beginners - by vagkaratzas](https://github.com/vagkaratzas/nextflow-beginners-workshop)
* [nf-core Training - by Zemzemfiras1](https://github.com/Zemzemfiras1/nf-core-pre-hackathon_training2025)
* [Software Carpentry Nextflow training](https://carpentries-incubator.github.io/workflows-nextflow/index.html) A high-quality course made by the [Software Carpentry](https://software-carpentry.org).
* [Bioinformatics Workshop on Tools for Reproducible Research - Nextflow](https://southgreenplatform.github.io/training_reproducible_research/pages/nextflow/nextflow-1-introduction/) A course based on [NBIS material](https://github.com/NBISweden/workshop-reproducible-research) to learn Nextflow basics.
**Community**
* [Nextflow Slack](https://www.nextflow.io/slack-invite.html)
* [nf-core](https://nf-co.re/) A community effort to collect a curated set of analysis pipelines built using Nextflow.
* [Seqera Community](https://community.seqera.io/)
**Monitoring Nextflow Pipeline**
* [nf-tower](https://github.com/seqeralabs/nf-tower)
* [nf-shard](https://github.com/gluonfield/nf-shard)
## Installation
The prerequisites to run the pipeline are:
* The BiTeN repository
* [Nextflow](https://www.nextflow.io/) >= 22.04.0
* [Docker](https://www.docker.com) or [Singularity](https://sylabs.io/singularity/)
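Before going through the installation steps, it can help to see which of these prerequisites are already available. The following is a minimal sketch, assuming the tools are expected on your `PATH`; the helper names `have_cmd` and `check_prerequisites` are made up for this example, and `git` and `java` are included because cloning the repository and running Nextflow rely on them.

```shell
#!/bin/sh
# Report which prerequisites are visible on the PATH.

have_cmd() {
    # Succeeds if the shell can find the given command.
    command -v "$1" >/dev/null 2>&1
}

check_prerequisites() {
    status=0
    for tool in git java nextflow; do
        if have_cmd "$tool"; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            status=1
        fi
    done
    # Only one container platform is needed.
    if have_cmd docker || have_cmd singularity; then
        echo "found:   container platform"
    else
        echo "missing: docker or singularity"
        status=1
    fi
    return "$status"
}

check_prerequisites || echo "Install the missing tools before running the pipeline."
```

The script only checks that the commands exist; it does not check their versions or whether the Docker daemon is running.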
### BiTeN
```bash
# clone the workflow repository
git clone https://github.com/Juke34/BiTeN.git
# Move in it
cd BiTeN
```
### Nextflow
* Via conda
See here
```
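# Create and activate a dedicated environment
conda create -n nextflow
conda activate nextflow
# Nextflow is distributed through the bioconda channel; add `-c bioconda` if that channel is not already configured
conda install nextflow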
```
* Manually
See here
Nextflow runs on most POSIX systems (Linux, macOS, etc.) and can typically be installed by running these commands:
```
# Make sure Java 11 or later is installed on your computer:
java -version
# Install Nextflow (this creates a file named nextflow in the current directory):
curl -s https://get.nextflow.io | bash
# Make the file executable:
chmod +x nextflow
# Add the Nextflow binary to your user's PATH (assumes ~/bin is on your PATH):
mkdir -p ~/bin
mv nextflow ~/bin/
# OR system-wide installation:
# sudo mv nextflow /usr/local/bin
```
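Once Nextflow is installed, you may want to confirm that it satisfies the minimum version listed in the prerequisites (22.04.0). The sketch below shows one way to compare dotted version numbers in plain shell. The `version_ge` helper is hypothetical, and the `installed` value is a placeholder that you would replace with the version reported by `nextflow -version`.

```shell
#!/bin/sh
# Compare two dotted version strings field by field.

version_ge() {
    # Succeeds when version $1 is greater than or equal to version $2.
    # Only the first three numeric fields are compared; missing fields count as 0.
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

installed="23.10.1"   # placeholder: use the version printed by `nextflow -version`
if version_ge "$installed" "22.04.0"; then
    echo "Nextflow $installed is recent enough"
else
    echo "Nextflow $installed is older than 22.04.0"
fi
```

Comparing the fields numerically avoids the pitfalls of plain string comparison, where for example `9.0.0` would wrongly sort after `22.04.0`.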
### Container platform
To run the workflow you will need a container platform: Docker or Singularity.
### Docker
Please follow the instructions at the [Docker website](https://docs.docker.com/desktop/)
### Singularity
Please follow the instructions at the [Singularity website](https://docs.sylabs.io/guides/latest/admin-guide/installation.html)
## Usage
You can first check the available options and parameters by running:
`nextflow run main.nf --help`
To run the workflow you must select a profile according to the container platform you want to use:
- `singularity`, a profile using Singularity to run the containers
- `docker`, a profile using Docker to run the containers
The command looks like this:
```
nextflow run main.nf -profile docker
```
Another profile is available (work in progress):
- `slurm`, to add if your system uses the Slurm scheduler (the executor is local by default)
Using the `slurm` profile gives a command like this one:
```
nextflow run main.nf -profile docker,slurm
```
## Test the workflow
Test data are included in the BiTeN repository in the `test` folder.
A typical command to run a test on single-end data looks like this:
```
nextflow run -profile local,docker main.nf --genome test/genome.fa --reads test --single_end true
```
On success you should get a message looking like this:
```
BiTeN Pipeline execution summary
--------------------------------------
Completed at : 2024-03-07T21:40:23.180547+01:00
UUID : e2a131e3-3652-4c90-b3ad-78f758c06070
Duration : 8.4s
Success : true
Exit Status : 0
Error report : -
```
## Parameters
| Parameter | Comment |
| --- | --- |
| --help | prints the help section |
| --reads | path to the directory containing the reads |
| --pattern_reads | pattern to match the read files. For single-end data it would look like `"*.fastq.gz"`. For paired-end data it would look like `"*_{R1,R2}_001.fastq.gz"` or `"*_{1,2}.fastq.gz"`. |
| --single_end | Boolean indicating whether the data are single-end rather than paired-end. |
| --stranded | Boolean indicating whether the data are stranded. |
| --genome | path to the genome file in fasta format. |
| --bowtie2_options | Parameter to tune the bowtie2 aligner behaviour. |
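
To make the paired-end form of `--pattern_reads` more concrete, the sketch below emulates in plain shell how a pattern such as `"*_{R1,R2}_001.fastq.gz"` groups files into pairs. Nextflow performs the real matching itself, so this is only an illustration; the `list_pairs` helper and the sample file names are hypothetical, and the files are empty placeholders created in a temporary directory.

```shell
#!/bin/sh
# Emulate how a paired-end pattern such as "*_{R1,R2}_001.fastq.gz" groups files.

list_pairs() {
    dir=$1
    for r1 in "$dir"/*_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue    # nothing matched: the glob is left as literal text
        # Derive the expected R2 file name and the sample prefix from the R1 file.
        r2="${r1%_R1_001.fastq.gz}_R2_001.fastq.gz"
        sample=$(basename "$r1" _R1_001.fastq.gz)
        if [ -e "$r2" ]; then
            echo "$sample: $(basename "$r1") + $(basename "$r2")"
        else
            echo "$sample: R2 mate missing"
        fi
    done
}

# Demo on empty placeholder files.
demo=$(mktemp -d)
touch "$demo/sampleA_R1_001.fastq.gz" "$demo/sampleA_R2_001.fastq.gz" \
      "$demo/sampleB_R1_001.fastq.gz"
list_pairs "$demo"
rm -rf "$demo"
```

The part of the file name matched by `*` acts as the sample identifier, which is why both mates of a pair must share the same prefix.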
## Contributing
We welcome contributions from the community! See our [Contributing guidelines](https://github.com/Juke34/BiTeN/blob/main/CONTRIBUTING.md)
## Report bugs and issues
Found a bug or have a question? Please open an [issue](https://github.com/Juke34/BiTeN/issues).
## How to cite?
If you use this template for your development, please cite or acknowledge it, e.g.:
Development based on the BiTeN template (https://github.com/Juke34/BiTeN), Dainat J.
## Acknowledgement
Jacques Dainat (@Juke34)
Juliette Hayer (@jhayer)
Mahesh Binzer-Panchal (@mahesh-panchal)