Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/leeper/make-example
An example of using make for a data analysis project
https://github.com/leeper/make-example
data-analysis make manuscript reproducible-research
Last synced: 16 days ago
JSON representation
An example of using make for a data analysis project
- Host: GitHub
- URL: https://github.com/leeper/make-example
- Owner: leeper
- License: mit
- Created: 2016-09-21T14:32:18.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2019-05-06T11:30:29.000Z (over 5 years ago)
- Last Synced: 2024-10-13T19:09:22.441Z (about 1 month ago)
- Topics: data-analysis, make, manuscript, reproducible-research
- Language: R
- Size: 27.3 KB
- Stars: 32
- Watchers: 3
- Forks: 3
- Open Issues: 0
-
Metadata Files:
- Readme: README.Rmd
- License: LICENSE.md
Awesome Lists containing this project
- jimsghstars - leeper/make-example - An example of using make for a data analysis project (R)
README
# Example of `make` for data analysis
This repository shows how to use `make` for a data analysis project.
The paper (PDF) is generated from a LaTeX source file, a table, and a figure.
`make` uses a set of instructions in `makefile` to generate the table and figure using R code and a datafile, and then generate the PDF from the manuscript.
`make` is clever because it deconstructs each part of the analysis so that only parts that have changed need to be rerun. If the data change, everything is rerun. If figure-generating code changes, only that code and the manuscript are rerun. If only the manuscript changes, only `pdflatex` is rerun. It's smart like that.
Basically it works on a directed acyclic graph (DAG) model, represented by this network graph:
```{r setup, include=FALSE}
library("svglite")
knitr::opts_chunk$set(
dev = "svglite",
fig.path="fig-",
fig.ext = "svg"
)
``````{r dag, echo = FALSE, fig.height=4, fig.width=6}
requireNamespace("igraph", quietly = TRUE)
requireNamespace("ggraph", quietly = TRUE)parse_makefile <-
function(
file = "makefile",
...
) {
con <- file(file, open = "rb")
on.exit(close(con))
m <- character()
# handle continuation characters
while (length(con)) {
this_line <- readLines(con, n = 1L)
if (!length(this_line)) {
break
}
m[length(m) + 1] <- this_line
while (grepl("\\\\$", m[length(m)]) && length(con)) {
m[length(m)] <- sub("\\\\$", " ", m[length(m)])
m[length(m)] <- paste0(m[length(m)], readLines(con, n = 1L))
}
}
return(m)
}makefile_to_network <-
function(
lines = parse_makefile(),
exclude = NULL,
...
) {
rules <- lines[grepl("[ A-Za-z0-9./]+ ?[:] ?[ A-Za-z0-9./]+", lines)]
rules_list <- strsplit(rules, " ?: ?")
edges <- lapply(rules_list, function(x) {
# cleanup target
target <- x[1L]
if (target %in% exclude) {
return(NULL)
}
## handle targets with multiple outputs
targets <- strsplit(target, " ")[[1L]]
# cleanup prereqs
prereqs <- strsplit(x[2L], "[ \t]+")[[1L]]
prereqs <- prereqs[!is.na(prereqs)]
# return
if (!length(prereqs) || all(prereqs == "")) {
return(NULL)
}
as.vector(t(as.matrix((expand.grid(prereqs, targets, stringsAsFactors = FALSE)))))
})
igraph::make_graph(unlist(edges))
}net <- makefile_to_network(parse_makefile("makefile"), exclude = "all")
ggraph::ggraph(net, layout = "kk") +
ggraph::geom_edge_link(ggplot2::aes(start_cap = ggraph::label_rect(node1.name),
end_cap = ggraph::label_rect(node2.name)),
arrow = grid::arrow(length = grid::unit(2, 'mm'))) +
ggraph::geom_node_text(ggplot2::aes(label = name), size = 4) +
ggplot2::theme_void()
```The R file `analysis.R` shows what is going on in `makefile` using possibly more familiar R syntax. The `README.Rmd` file contains the code to construct the above graph from an arbitrary makefile.
Zach Jones has [a good tutorial](http://zmjones.com/make/) about all of this.