https://github.com/dienerlab/pipelines
A monorepo containing all of our Nextflow pipelines.
metagenomics metatranscriptomics microbiome nextflow
- Host: GitHub
- URL: https://github.com/dienerlab/pipelines
- Owner: dienerlab
- License: apache-2.0
- Created: 2024-03-03T12:55:25.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2026-06-01T08:37:00.000Z (11 days ago)
- Last Synced: 2026-06-01T10:21:54.262Z (11 days ago)
- Topics: metagenomics, metatranscriptomics, microbiome, nextflow
- Language: Nextflow
- Homepage: https://dienerlab.github.io/pipelines
- Size: 1.2 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
# :hammer: :triangular_ruler: Pipelines
This repo contains various analysis pipelines for the lab. Here are the basic
rules:
- each folder includes pipelines for a particular combination of analysis and data type
- pipelines are [nextflow](https://www.nextflow.io/) workflows
- each pipeline comes with a list of conda environment files that manage the required software
## Data layout
Pipelines will usually operate from a top-level project
directory structured in the following way:
```
[project root]
├─ [pipeline].nf
├─ data
│  ├─ raw
│  │  ├─ sample1_R1.fastq.gz
│  │  ├─ sample1_R2.fastq.gz
│  │  └─ ...
│  ├─ figures
│  │  ├─ fig1.png
│  │  └─ ...
│  └─ ...
└─ refs
   ├─ eggnog
   └─ kraken2
```
The initial raw data lives in `data/raw` and all analysis artifacts should
be written into `data/` as well. Figures go into `figures/`.
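As a minimal sketch, the empty skeleton of this layout could be created as follows. The project name `my_project` is a placeholder, and placing `figures` under `data` is an assumption read off the tree above; adjust it if your figures live elsewhere.

```bash
# Hypothetical project root; override with PROJECT_ROOT=/some/path if needed.
project_root="${PROJECT_ROOT:-my_project}"

# Raw reads and figures (assumed here to sit under data/, as in the tree).
mkdir -p "$project_root/data/raw" "$project_root/data/figures"

# Reference databases used by the pipelines.
mkdir -p "$project_root/refs/eggnog" "$project_root/refs/kraken2"

# List the directories that were created.
find "$project_root" -type d | sort
```

The raw FASTQ files would then be copied or symlinked into `data/raw`.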
## Setup
The first step is to copy or symlink the pipeline files into the top project
directory. After that you can set up a conda environment that includes all software
for the pipeline (please see individual pipelines for variations on that).
```bash
conda env create -f conda.yml
```
Either activate the environment (usually named after the pipeline):
```bash
conda activate metagenomics
```
or run the pipeline with the `-with-conda /my/envs/metagenomics` option (required for HPC).
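The symlink step from the start of this section could look roughly like the sketch below. The helper `link_pipeline` is hypothetical (it is not part of this repo), and the paths in the usage line are placeholders for wherever you cloned the repo and keep your project.

```bash
# Hypothetical helper: symlink every file of one pipeline folder
# into a project directory.
# Usage: link_pipeline <pipeline_dir> <project_dir>
link_pipeline() {
    local pipeline_dir project_dir f
    # Resolve to an absolute path so the links stay valid from anywhere.
    pipeline_dir="$(cd "$1" && pwd)" || return 1
    project_dir="$2"
    mkdir -p "$project_dir"
    for f in "$pipeline_dir"/*; do
        [ -e "$f" ] || continue   # skip if the folder is empty
        ln -sfn "$f" "$project_dir/$(basename "$f")"
    done
}
```

For example, `link_pipeline /path/to/pipelines/metagenomics /path/to/project` would link the workflow and conda files of that folder into the project root, assuming the folder is named `metagenomics`.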
## Nextflow Configuration
You may also create a [nextflow config](https://www.nextflow.io/docs/latest/config.html) either in the project
directory as `nextflow.config` or in your user HOME as `~/.nextflow/config`. A template config is
[included in this repo](nextflow.config). If you are a lab member, please use the [optimized
version from the wiki](https://github.com/dienerlab/internal/wiki/configs).
To install it as a global configuration, first create the config directory on the server:
```bash
# -p avoids an error if the directory already exists
mkdir -p ~/.nextflow
```
After that, edit the config and copy it into place:
```bash
cp /path/to/pipelines/nextflow.config ~/.nextflow/config
```
Add your access token if you want to use [Nextflow Tower](https://tower.nf) to track your pipeline.
For SLURM, replace the partition name `default` with the name of your SLURM partition.
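To illustrate the two edits mentioned above, the following sketch writes a small standalone config fragment. It is not the template shipped with this repo: the file name `slurm_sketch.config`, the partition name `my_partition`, and the token value are all placeholders, and the real template may organize these settings differently (for example inside profiles).

```bash
# Write a hypothetical config fragment; the quoted 'EOF' keeps the
# content literal so nothing is expanded by the shell.
cat > slurm_sketch.config <<'EOF'
// Placeholder values only; not the template from the repo.
process {
    executor = 'slurm'
    queue = 'my_partition'   // your SLURM partition instead of `default`
}

tower {
    enabled = true
    accessToken = 'YOUR_TOKEN_HERE'
}
EOF
```

The same `queue` and `accessToken` settings can be edited directly in `~/.nextflow/config` once the template has been copied there.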
## Run the pipeline
After setup, you can test the pipeline with:
```bash
nextflow run [WORKFLOW].nf -profile local -resume
```
By default this will use 12 CPUs and 128 GB of RAM unless specified otherwise in your personal [Nextflow config](https://www.nextflow.io/docs/latest/config.html#scope-executor).
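As a rough sketch of such an override, the `executor` scope accepts `cpus` and `memory` limits for the local executor. The file name `limits_sketch.config` and the values below are placeholders for a smaller machine, and the repo's own template may set these limits in a different place.

```bash
# Write a hypothetical fragment that caps local resource usage.
cat > limits_sketch.config <<'EOF'
// Placeholder limits; pick values that match your machine.
executor {
    cpus = 4
    memory = '16 GB'
}
EOF
```

Placing an equivalent `executor` block in `~/.nextflow/config` or in the project's `nextflow.config` would apply the limits to local runs.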