{"id":42189088,"url":"https://github.com/lucas-diedrich/snakemake-learning","last_synced_at":"2026-01-26T22:35:03.336Z","repository":{"id":295384402,"uuid":"989187087","full_name":"lucas-diedrich/snakemake-learning","owner":"lucas-diedrich","description":"GitHub Repository for the snakemake learn session at the @MannLabs Group Retreat 2025","archived":false,"fork":false,"pushed_at":"2025-07-11T09:25:06.000Z","size":1109,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-05T08:53:57.686Z","etag":null,"topics":["tutorial"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lucas-diedrich.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-05-23T17:40:55.000Z","updated_at":"2025-07-11T09:25:09.000Z","dependencies_parsed_at":"2025-05-25T09:26:35.923Z","dependency_job_id":"10220e89-599b-4ab8-b208-442448be0d89","html_url":"https://github.com/lucas-diedrich/snakemake-learning","commit_stats":null,"previous_names":["lucas-diedrich/snakemake-learning"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/lucas-diedrich/snakemake-learning","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucas-diedrich%2Fsnakemake-learning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucas-diedrich%2Fsnakemake-learning/tags","releases_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucas-diedrich%2Fsnakemake-learning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucas-diedrich%2Fsnakemake-learning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lucas-diedrich","download_url":"https://codeload.github.com/lucas-diedrich/snakemake-learning/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucas-diedrich%2Fsnakemake-learning/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28790065,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-26T21:49:50.245Z","status":"ssl_error","status_checked_at":"2026-01-26T21:48:29.455Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["tutorial"],"created_at":"2026-01-26T22:35:02.584Z","updated_at":"2026-01-26T22:35:03.305Z","avatar_url":"https://github.com/lucas-diedrich.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# snakemake-learning\nGitHub Repository for the hands-on snakemake learn session at the MannLabs Group Retreat 2025\n\nSnakemake is a python-based workflow manager that is supposed to make your life easier when analysing large datasets. It **enforces reproducibility** and **enables scalability**. 
\n\n### Tutorial overview\n\nIn this tutorial, we will \n1. read in a dataset (here: a small image)\n2. process it with a simple function (here: apply different image transformations to it)\n3. generate a plot as output (here: histograms of pixel intensities)\n4. generate a snakemake report.\n\n\n![Results](./docs/img/results.png)\n\n\n## Installation \n\n1. Using the command line, go into your favorite directory (`cd /path/to/my/favorite/directory`)\n\n2. Clone this repository \n\n```shell \ngit clone https://github.com/lucas-diedrich/snakemake-learning.git\n```\n\n(or download it via `Code \u003e Download ZIP`, and unzip it locally)\n\n3. Go into the directory\n\n```shell \ncd snakemake-learning\n```\n\n4. Create a `mamba`/`conda` environment with snakemake based on the `environment.yaml` file and activate it\n\n```shell \nmamba create -n snakemake-env --file environment.yaml \u0026\u0026 mamba activate snakemake-env\n\n# OR conda env create -n snakemake-env -f environment.yaml \u0026\u0026 conda activate snakemake-env\n```\n\n5. Check that the installation was successful\n\n```shell\nsnakemake --version\n\u003e 9.5.1\n```\n\n## Tutorial\n\n### 1. Snakemake - Introduction \n\nSee the slides in `./docs`\n\n### 2. Check out the workflow \n\nRun the following command in the root directory (`.`) to see the whole task graph. 
\n\n```shell\n# --dag: Directed acyclic graph\nsnakemake --dag \n```\n\nRun the following command to inspect how the rules depend on one another (simpler than the task graph, especially for large workflows)\n\n```shell\n# --rulegraph: Show dependencies between rules\nsnakemake --rulegraph\n```\n\n```mermaid \n---\ntitle: Rule Graph\n---\nflowchart TB\n        id0[all]\n        id1[plot_histogram]\n        id2[transform_image]\n        id3[save_image]\n        style id0 fill:#CD5C5C,stroke-width:2px,color:#333333\n        style id1 fill:#F08080,stroke-width:2px,color:#333333\n        style id2 fill:#FA8072,stroke-width:2px,color:#333333\n        style id3 fill:#E9967A,stroke-width:2px,color:#333333\n        id0 --\u003e id0\n        id1 --\u003e id0\n        id2 --\u003e id1\n        id3 --\u003e id2\n```\n\nYou can paste the output into this [Graphviz visualizer](https://dreampuf.github.io/GraphvizOnline/) to view the task graph\n\n\n### 3. Run the full workflow \n\nGo into the `./workflow` directory and run:\n\n```shell\nsnakemake --cores 2 --use-conda\n```\n\nThe output can be found in the `./results` directory\n\n### 4. Generate the report \n\nGo into the `./workflow` directory and run \n\n```shell\nsnakemake --report ../results/report.html\n```\n\nThe report is written to `./results/report.html`\n\n\n\n## Run on a slurm HPC cluster\nYou can run this workflow on a high-performance computing cluster (_here using the slurm workload manager_). In this case, one slurm job acts as a scheduler that submits individual rule executions as separate slurm jobs. The `snakemake-executor-plugin-slurm` automatically handles the scheduling and submission of dependent jobs. 
Please check out the script `./workflow/snakemake.sbatch` and the official [snakemake slurm plugin documentation](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/slurm.html#snakemake-executor-plugin-slurm) to learn more about the relevant flags and settings.\n\n\n### Execution\n\nCreate the environment \n\n```shell\nconda create -n snakemake-env -y\nconda env update -n snakemake-env --file environment.yaml\n```\n\nAdditionally, install the `snakemake-executor-plugin-slurm`:\n\n```shell\npip install snakemake-executor-plugin-slurm\n```\n\nThen submit the provided workflow script on the cluster\n\n```shell\ncd ./workflow/\nsbatch snakemake.sbatch\n```\n\n\n## Exercises \n\n*To further deepen your understanding after the workshop.*\n\n### 1. Scale the workflow to other images \n\nThe script `create-data.py` accepts the names of example images that ship with the `skimage` package as arguments. \n\n```shell\npython scripts/create-data.py --image-name \u003cimage name\u003e --output \u003coutput name\u003e\n```\nModify the workflow so that it additionally runs on other `skimage` example datasets, e.g. `colorwheel`, `cat`, `logo`\n\n### 2. Add a rule \n\nAdd a new rule that generates an aggregated plot in which the image and its modifications are shown in the top row and the associated histograms are shown in the bottom row. \n\n\n### 3. Prettify the report\n\nExplore ways to customize the report with the reStructuredText (RST) format. \n\n\n## References\n\n- **Snakemake homepage + Documentation** [snakemake.readthedocs.io](https://snakemake.readthedocs.io/en/stable/index.html)\n\n- **Publication** Mölder F, Jablonski KP, Letcher B et al. Sustainable data analysis with Snakemake [version 2; peer review: 2 approved]. 
F1000Research 2021, 10:33 (https://doi.org/10.12688/f1000research.29032.2)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucas-diedrich%2Fsnakemake-learning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flucas-diedrich%2Fsnakemake-learning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucas-diedrich%2Fsnakemake-learning/lists"}