https://github.com/kinto-b/makepipe_example

A minimal example of an automated data processing pipeline

# Pipeline automation

This repository contains a minimal example of a data processing pipeline that
has been automated in two ways: first with `GNU Make`, and then with
[`makepipe`](https://github.com/kinto-b/makepipe). With `GNU Make`, the
heavy lifting is done by the [`Makefile`](Makefile); with `makepipe`, it is
done by [`pipeline.R`](pipeline.R) (or alternatively [`pipeline_alt.R`](pipeline_alt.R)).

The pipeline, which I have lifted directly from Jenny Bryan's [STAT545 course](https://stat545.com/automating-pipeline.html), does four things. It:

1. Obtains a large file of English words.
2. Calculates a histogram of word lengths.
3. Generates a figure of this histogram.
4. Renders an R Markdown report in HTML.
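
To make step 2 concrete, here is a minimal Python sketch of a word-length histogram. It is an illustration only, not the code used in this repository (which is written in R); the function name `word_length_histogram` and the sample words are hypothetical.

```python
from collections import Counter


def word_length_histogram(words):
    """Count how many words there are of each length, ordered by length."""
    # Strip surrounding whitespace (e.g. trailing newlines) and skip empty lines.
    cleaned = (w.strip() for w in words)
    counts = Counter(len(w) for w in cleaned if w)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical sample; the real pipeline reads a large file of English words.
    sample = ["a", "an", "at", "the", "pipe"]
    print(word_length_histogram(sample))  # {1: 1, 2: 2, 3: 1, 4: 1}
```

In a real run the input would be the lines of the downloaded word file, and the resulting table would feed the plotting step.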

The `makepipe` version of the pipeline produces a dependency graph of these steps.

A plain text summary, saved to [`pipeline.md`](pipeline.md), is also produced.
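
The dependency graph matters because tools of this kind rerun a step only when its outputs are out of date. The following Python sketch shows that rule under simple assumptions: a target is stale if it is missing or older than any of its dependencies, judged by file modification times. It is not how `GNU Make` or `makepipe` is implemented, and the file names `words.txt` and `histogram.tsv` are hypothetical.

```python
import os
import tempfile


def is_out_of_date(targets, dependencies):
    """Return True if any target is missing or older than any dependency.

    Assumes every dependency exists; a missing dependency raises an OSError.
    """
    if not all(os.path.exists(t) for t in targets):
        return True
    oldest_target = min(os.path.getmtime(t) for t in targets)
    newest_dependency = max(
        (os.path.getmtime(d) for d in dependencies), default=float("-inf")
    )
    return newest_dependency > oldest_target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        words = os.path.join(tmp, "words.txt")          # hypothetical dependency
        histogram = os.path.join(tmp, "histogram.tsv")  # hypothetical target
        open(words, "w").close()
        # The target has not been built yet, so the step needs to run.
        print(is_out_of_date([histogram], [words]))  # True
```

Applying this check to each step in dependency order means that changing one input reruns only the steps downstream of it.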

## Presentation
The `presentation/` subdirectory contains slides for a brief presentation on
pipeline automation tools delivered at the Social Research Centre in November
2021.