Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/timtroendle/cookiecutter-reproducible-research
A cookiecutter template for reproducible research projects using Python, Snakemake, and Pandoc.
https://github.com/timtroendle/cookiecutter-reproducible-research
automation cluster-computing cookiecutter-template pandoc python reproducible-research snakemake
Last synced: about 1 month ago
JSON representation
A cookiecutter template for reproducible research projects using Python, Snakemake, and Pandoc.
- Host: GitHub
- URL: https://github.com/timtroendle/cookiecutter-reproducible-research
- Owner: timtroendle
- License: mit
- Created: 2017-10-19T15:57:01.000Z (about 7 years ago)
- Default Branch: main
- Last Pushed: 2024-05-01T08:47:17.000Z (8 months ago)
- Last Synced: 2024-08-03T22:18:42.242Z (5 months ago)
- Topics: automation, cluster-computing, cookiecutter-template, pandoc, python, reproducible-research, snakemake
- Language: CSS
- Homepage:
- Size: 87.9 KB
- Stars: 20
- Watchers: 2
- Forks: 5
- Open Issues: 9
-
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project
- jimsghstars - timtroendle/cookiecutter-reproducible-research - A cookiecutter template for reproducible research projects using Python, Snakemake, and Pandoc. (CSS)
README
![Reproduction](https://github.com/timtroendle/cookiecutter-reproducible-research/actions/workflows/reproduction.yaml/badge.svg)
# cookiecutter-reproducible-research
This repository provides [cookiecutter](http://cookiecutter.readthedocs.io) templates for reproducible research projects. The templates do not attempt to be generic, but have a clear and opinionated focus.
Projects build with these templates aim at full automation, and use `Python 3.11`, `mamba/conda`, `Git`, `Snakemake`, and `pandoc` to create a HTML report out of raw data, code, and `Markdown` text. Fork, clone, or download this repository on GitHub if you want to change any of these.
The template includes a few lines of code as a demo to allow you to create a HTML report out of made-up simulation results right away. Read the `README.md` in the generated repository to see how.
## Template types
> default
This generates the basic structure of a reproducible workflow.
> cluster
The cluster template extends the basic template by adding infrastructure to support running on a compute cluster.
## Getting Started
Make sure you have cookiecutter installed, otherwise install it with [conda](https://conda.io/docs/index.html):
conda install cookiecutter -c conda-forge
Then create a repository using:
cookiecutter gh:timtroendle/cookiecutter-reproducible-research --directory=[default/cluster]
You will be asked for the following parameters:
Parameter | Description
--- | ---
`project_name` | The name of your project, used in the documentation and report.
`project_short_name` | An abbreviation, used for environments and such. Avoid special characters and whitespace.
`author` | Your name.
`institute` | The name of your institute, used for report metadata.
`short_description` | A short description of the project, used for documentation and report.
`path_to_conda_envs` | The path to the directory hosting your conda envs (leave untouched for Snakemake default).The `cluster` template requires the following parameter values in addition:
Parameter | Description
--- | ---
`cluster_url` | The address of the cluster to allow syncing to and from the cluster.
`cluster_base_dir` | The base path for the project on the cluster (default: `~/`).
`cluster_type` | The type of job scheduler used on the cluster. Currently, only LSF is supported.## Project Structure
The generated repository will have the following structure:
```
├── config <- Configuration files, e.g., for your model if needed.
│ └── default.yaml <- Default set of configuration parameter values.
├── data <- Raw input data.
├── envs <- Execution environments.
│ ├── default.yaml <- Default execution environment.
│ ├── report.yaml <- Environment for compilation of the report.
│ └── test.yaml <- Environment for executing tests.
├── profiles <- Snakemake profiles.
│ └── default <- Default Snakemake profile folder.
│ └── config.yaml <- Default Snakemake profile.
├── report <- All files creating the final report, usually text and figures.
│ ├── apa.csl <- Citation style definition to be used in the report.
│ ├── literature.yaml <- Bibliography file for the report.
│ ├── report.md <- The report in Markdown.
│ └── pandoc-metadata.yaml<- Metadata for the report.
├── rules <- The place for all your Snakemake rules.
├── scripts <- Scripts go in here.
│ ├── model.py <- Demo file.
│ └── vis.py <- Demo file.
├── tests <- Automatic tests of the source code go in here.
│ └── test_model.py <- Demo file.
├── .editorconfig <- Editor agnostic configuration settings.
├── .ruff <- Linter and formatter settings for ruff.
├── .gitignore
├── environment.yaml <- A file to create an environment to execute your project in.
├── LICENSE.md <- MIT license description
├── Snakefile <- Description of all computational steps to create results.
└── README.md
````cluster` templates additionally contain the following files:
```
├── envs
│ └── shell.yaml <- An environment for shell rules.
├── profiles
│ └── cluster <- Cluster Snakemake profile folder.
│ └── config.yaml <- Cluster Snakemake profile.
├── rules
│ └── sync.yaml <- Snakemake rules to sync to and from the cluster.
├── .syncignore-receive <- Build files to ignore when receiving from the cluster.
└── .syncignore-send <- Local files to ignore when sending to the cluster.
```## License
Some ideas for this cookiecutter template are taken from [cookiecutter-data-science](http://drivendata.github.io/cookiecutter-data-science/) and [mkrapp/cookiecutter-reproducible-science](https://github.com/mkrapp/cookiecutter-reproducible-science). This template is MIT licensed itself.