https://github.com/simonrolph/minimal_targets_demo

A minimal targets demonstrator

Host: GitHub
URL: https://github.com/simonrolph/minimal_targets_demo
Owner: simonrolph
Created: 2024-02-15T16:27:47.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-02-22T17:06:51.000Z (over 1 year ago)
Last Synced: 2025-04-09T22:58:27.637Z (about 2 months ago)
Language: R
Size: 2.02 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# A very minimal {targets} demonstration and exercises

## Overview

This is an introduction to {targets}. The targets package is a Make-like pipeline tool for statistics and data science in R; its user manual is at https://books.ropensci.org/targets/.

The philosophy of this demonstration is to be **as simple as possible** and to focus on the key reasons why building pipelines is a good idea.

Unlike most pipeline tools, which are language agnostic or Python-focused, the targets package allows data scientists and researchers to work entirely within R. targets implicitly nudges users toward a clean, function-oriented programming style that fits the intent of the R language and helps practitioners maintain their data analysis projects.

There are some slides in `/slides` which I use for introducing the concepts.

## Demonstration - getting to grips with {targets}

To demonstrate {targets} functionality, I show how a basic R script can be rewritten as a {targets} pipeline. This pipeline takes a CSV file with two numeric variables, `x` and `y`, and produces a plot with a linear model prediction.

The first script, `0_generate_test_data.R`, sets up some example data to use in the demonstration. It creates `inputs` and `outputs` folders (which are ignored by git).
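As a rough idea of what such a set-up script could look like, here is a minimal sketch. It is not the contents of `0_generate_test_data.R`: the sample size, the simulated relationship and the seed are assumptions made for illustration. Only the `x` and `y` columns and the `inputs/data.csv` path come from this README.

```r
# Hypothetical data-generation sketch (not the repository's actual script)
set.seed(42)  # assumed seed, so the simulated data is reproducible

# Create the folders used by the demonstration
dir.create("inputs", showWarnings = FALSE)
dir.create("outputs", showWarnings = FALSE)

# Simulate two numeric variables with an assumed linear relationship plus noise
n <- 100
x <- runif(n, min = 0, max = 10)
y <- 2 * x + 1 + rnorm(n, sd = 2)

# Write the CSV that the rest of the demonstration reads
write.csv(data.frame(x = x, y = y), "inputs/data.csv", row.names = FALSE)
```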

I first walk through how you might write this script with a "business as usual" approach, without {targets}. This is demonstrated in `1_non_targets_version.R`, which does everything in a single script that runs from top to bottom, as you would expect if you weren't using {targets}.
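For comparison, a "business as usual" script might look something like the sketch below. It is an illustration under assumptions, not a copy of `1_non_targets_version.R`: the output file name `outputs/plot.png` and the use of base graphics are hypothetical choices, and the guard at the top only exists so the sketch runs on its own.

```r
# Business-as-usual sketch: every step reruns from top to bottom each time,
# whether or not its inputs have changed.

# Stand-in input so this sketch runs on its own (assumption; in the demo the
# file is created by 0_generate_test_data.R)
if (!file.exists("inputs/data.csv")) {
  dir.create("inputs", showWarnings = FALSE)
  x <- seq(0, 10, length.out = 50)
  write.csv(data.frame(x = x, y = 2 * x + 1 + sin(x)),
            "inputs/data.csv", row.names = FALSE)
}

# Step 1: read the data
data <- read.csv("inputs/data.csv")

# Step 2: fit a linear model of y on x
model <- lm(y ~ x, data = data)

# Step 3: plot the data with the fitted line and save it
dir.create("outputs", showWarnings = FALSE)
png("outputs/plot.png")        # hypothetical output file name
plot(data$x, data$y, xlab = "x", ylab = "y")
abline(model)                  # linear model prediction
invisible(dev.off())
```

Nothing in this script records which steps are up to date, which is the gap the pipeline version is meant to fill.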

I then use `2_targets_version.R` as a reference for myself but don't show it to participants. Encourage participants to create their own `_targets.R` file and build the pipeline step by step. These are roughly the steps I walk through to demonstrate the usage of {targets}:

 - Create a `_targets.R` file

 - Add `library(targets)`

 - Add `list()` to the end of the file

 - Add `tar_target(data_file, "inputs/data.csv", format = "file")` to make a one-node pipeline, and explain the importance of `format = "file"`

 - Run `tar_visnetwork()`, then `tar_make()`, then `tar_visnetwork()` again to show how the graph updates

 - Add `tar_target(data, read.csv(data_file))`

 - Define a function to model the data and add `tar_target(data_model, model_data(data))`. Do `tar_visnetwork()` to show how functions appear.

 - Explain how the tracking works, take a look in the `_targets` folder and show `tar_read()`

 - Add the rest of the pipeline, including the targets that write output files (note that the function returns a file path, not an R object), and highlight the use of `format = "file"` again.

 - Get participants to ask questions, and modify the pipeline in response to demonstrate your answers
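The steps above might end up as a `_targets.R` file along the lines of the sketch below. The target names `data_file`, `data` and `data_model` and the function name `model_data()` come from the steps above. The body of `model_data()`, the `plot_model()` helper, the `plot_file` target and the `outputs/plot.png` path are hypothetical, and the file in `2_targets_version.R` may differ.

```r
# _targets.R (sketch, not the repository's reference version)
library(targets)

# Assumed implementation: fit a linear model of y on x
model_data <- function(data) {
  lm(y ~ x, data = data)
}

# Hypothetical helper: save a plot with the fitted line and return the
# file path, so {targets} tracks the file rather than an R object
plot_model <- function(data, model, path = "outputs/plot.png") {
  dir.create(dirname(path), showWarnings = FALSE, recursive = TRUE)
  png(path)
  plot(data$x, data$y, xlab = "x", ylab = "y")
  abline(model)
  dev.off()
  path
}

list(
  # format = "file" makes {targets} watch the file's contents, not just the string
  tar_target(data_file, "inputs/data.csv", format = "file"),
  tar_target(data, read.csv(data_file)),
  tar_target(data_model, model_data(data)),
  # Output target: the command returns a path, so it is also declared as a file
  tar_target(plot_file, plot_model(data, data_model), format = "file")
)
```

With this file in the project root, running `tar_visnetwork()` in the console shows the dependency graph, `tar_make()` builds whatever is out of date, and `tar_read(data_model)` loads a stored result. Editing `model_data()` or `inputs/data.csv` and calling `tar_make()` again should rerun only the affected targets.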

## Extra bits

`3_static_branching.R` shows how you can do static branching by subsetting the iris dataset by species and producing a plot for each species.
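One common way to express this kind of static branching is `tar_map()` from the {tarchetypes} package. The sketch below assumes that approach, so `3_static_branching.R` may be written differently. The `plot_species()` helper, the target names and the output paths are hypothetical.

```r
# _targets.R (static branching sketch, assuming {tarchetypes} is installed)
library(targets)
library(tarchetypes)  # provides tar_map() for static branching

# Hypothetical helper: plot one species and return the output file path
plot_species <- function(data, species) {
  dir.create("outputs", showWarnings = FALSE)
  path <- file.path("outputs", paste0(species, ".png"))
  png(path)
  plot(data$Sepal.Length, data$Sepal.Width, main = species,
       xlab = "Sepal length", ylab = "Sepal width")
  dev.off()
  path
}

# One row per branch; tar_map() makes a copy of each target for every row
species_values <- data.frame(
  species = c("setosa", "versicolor", "virginica")
)

list(
  tar_map(
    values = species_values,
    # Becomes species_data_setosa, species_data_versicolor, ...
    tar_target(species_data, iris[iris$Species == species, ]),
    # Becomes species_plot_setosa, ... each depending on its own subset
    tar_target(species_plot, plot_species(species_data, species), format = "file")
  )
)
```

Because the branches are created when the pipeline is defined, each one appears as its own node in `tar_visnetwork()`.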

Other things you might want to cover:

 - How to put your R functions in separate files (e.g. `R/functions.R`) and source them from `_targets.R` to help you organise your code.

 - How to have multiple targets dependent on the same target
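Both points can be illustrated in one short `_targets.R` sketch. It assumes that `R/functions.R` defines `model_data()` and a second, hypothetical function `summarise_data()`. The `data_summary` target is also made up for illustration.

```r
# _targets.R (sketch)
library(targets)

# Keep function definitions out of the pipeline file.
# Assumption: R/functions.R defines model_data() and summarise_data()
source("R/functions.R")

list(
  tar_target(data_file, "inputs/data.csv", format = "file"),
  tar_target(data, read.csv(data_file)),
  # Both targets below depend on `data`: if `data` changes, both are rerun;
  # if only summarise_data() changes, only data_summary is rerun
  tar_target(data_model, model_data(data)),
  tar_target(data_summary, summarise_data(data))
)
```

{targets} also provides `tar_source()`, which sources every script in the `R/` folder by default and can replace the explicit `source()` call once there are several function files.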
