https://github.com/Tazovsky/cachemeR

Convenient way of caching objects in R
https://github.com/Tazovsky/cachemeR

cache caching pipe r r-package r-programming

Last synced: 7 months ago
JSON representation

Convenient way of caching objects in R

Host: GitHub
URL: https://github.com/Tazovsky/cachemeR
Owner: Tazovsky
Created: 2018-05-27T21:00:04.000Z (about 7 years ago)
Default Branch: devel
Last Pushed: 2020-11-20T17:37:24.000Z (over 4 years ago)
Last Synced: 2024-08-13T07:15:05.352Z (11 months ago)
Topics: cache, caching, pipe, r, r-package, r-programming
Language: R
Homepage:
Size: 1.47 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 10
Metadata Files:
- Readme: README.Rmd

Awesome Lists containing this project

jimsghstars - Tazovsky/cachemeR - Convenient way of caching objects in R (R)

README

        ---

output: github_document

---

# cachemeR

[![Build Status](https://travis-ci.org/Tazovsky/cachemeR.svg?branch=devel)](https://travis-ci.org/Tazovsky/cachemeR)

[![codecov](https://codecov.io/gh/Tazovsky/cachemeR/branch/devel/graph/badge.svg)](https://codecov.io/gh/Tazovsky/cachemeR)

[![Coverage Status](https://coveralls.io/repos/github/Tazovsky/cachemeR/badge.svg?branch=devel)](https://coveralls.io/github/Tazovsky/cachemeR?branch=devel)

```{r setup, include = FALSE}

knitr::opts_chunk$set(

  collapse = TRUE,

  comment = ">",

  out.width = "100%"

)

```

## Overview {#overview}

`cachemeR` is a convenient way of caching functions in R. 

From the beginning the purpose of this package is to make caching as easy as possible 

and to put as less effort as possible to implement it in existsing projects :)

## Installation {#installation}

```{r eval=FALSE}

if (!require("devtools")) 

  install.packages("devtools")

devtools::install_github("Tazovsky/cachemeR@devel")

```

## Usage - `%c-%` operator {#usage}

Cache has to be initialized. It requires to run `cachemer$new` with path to `config.yaml`. 

Then all you need to cache is to use pipe **`%c-%`** instead of assignment operator `<-`. 

In following example function `doLm` fits linear model:

```{r}

library(cachemeR)

doLm <- function(rows, cols, verbose = TRUE) {

  if (verbose)

    print("Function is run")

  set.seed(1234)

  X <- matrix(rnorm(rows*cols), rows, cols)

  b <- sample(1:cols, cols)

  y <- runif(1) + X %*% b + rnorm(rows)

  model <- lm(y ~ X)

}

# create dir where cache will be stored

dir.create(tmp.dir <- tempfile())

# create path to config.yaml - yaml fild must be in that dir

config.file <- file.path(tmp.dir, "config.yaml")

# initialize cachemeR

cache <- cachemer$new(path = config.file)

cache$setLogger(TRUE)

# cache function

result1 %c-% doLm(5, 5)

result1

# function is cached now so if you re-run function then 

# output will be retrieved from cache instead of executing 'doLm' function again

result2 %c-% doLm(5, 5)

result2

```

```{r, include=FALSE}

on.exit(unlink(dirname(config.file), T, T))

```

Operator `%c-%` is sesitive to function name, function body, argument (whether argument is named or not, 

or is list or not, or is declared in parent environment, etc.): 

```{r}

library(cachemeR)

dir.create(tmp.dir <- tempfile())

config.file <- file.path(tmp.dir, "config.yaml")

cache <- cachemer$new(path = config.file)

testFun <- function(a, b) {

  (a+b) ^ (a*b)

}

cache$setLogger(TRUE)

result1 %c-% testFun(a = 2, b = 3)

testFun <- function(a, b) {

  (a+b) / (a*b)

}

# function name didn't change, but function body did so it will be cached:

result2 %c-% testFun(a = 2, b = 3)

result1

result2

```

```{r, include=FALSE}

on.exit(unlink(dirname(config.file), T, T))

```

## Share elements across all instances of a class

```{r}

library(cachemeR)

dir.create(tmp.dir <- tempfile())

config.file <- file.path(tmp.dir, "config.yaml")

on.exit(unlink(tmp.dir, TRUE, TRUE), add = TRUE)

cache <- cachemer$new(path = config.file)

cache$setLogger(TRUE)

lm_fit <- function(rows, cols) {

  print("Fitting linear model...")

  set.seed(1234)

  X <- matrix(rnorm(rows*cols), rows, cols)

  b <- sample(1:cols, cols)

  y <- runif(1) + X %*% b + rnorm(rows)

  model <- lm(y ~ X)

}

fun3 <- function() {

  print("> fun3")

  y %c-% lm_fit(123, 123)

  return(y)

}

fun2 <- function() {

  print("> fun2")

  fun3()

}

fun1 <- function() {

  print("> fun1")

  fun2()

}

x %c-% lm_fit(123, 123) # cache function on the toppest level

x2 %c-% fun1()

identical(x, x2)

```

But it also **has some [limitations](#limitations)**.

## Use cases {#usecases}

1. shiny app example

2. calculate Fibonacci

3. ?

## Limitations {#limitations}

Generally `cachemeR` is designed to cache R functions only. And it won't work if:

- Cached function's argument value contains `$` or `[[]]`: 

```{r, eval=FALSE}

args = list(a = 1, b = 2)

res %c-% fun(a = args$a, b = args[["b"]])

```

- You want to cache something else than function only:

```{r, eval=FALSE}

res %c-% fun(a = 1, b = 2) + 1

# or expression in parenthesis:

res %c-% (fun(a = 1, b = 2) + 1)

res %c-% {fun(a = 1, b =2) + 1}

# or simple value/variable:

arg <- 1

res %c-% arg

```

- You want to use it with `magrittr` pipes, for example with `%>%`:

```{r, eval=FALSE}

getDF <- function(nm) {

  switch(nm,

         iris = iris,

         mtcars = mtcars)

}

library(dplyr)

res %c-% getDF("iris") %>% summary()

```

## Microbenchmark {#microbenchmark}

```{r}

# microbenchmark

cache <- cachemer$new(path = config.file)

cache$setLogger(FALSE)

test_no_cache <- function(n) {

  result_no_cache <- doLm(n, n, verbose = FALSE)

}

test_cache <- function(n) {

  result_no_cache %c-%  doLm(n, n, verbose = FALSE)

}

res1 <- microbenchmark::microbenchmark(

  test_no_cache(400),

  test_cache(400)

)

res1

# but now just try it assuming the calculation has been already cached

res2 <- microbenchmark::microbenchmark(

  test_no_cache(400),

  test_cache(400)

)

res2

```

# Dev environment

Package is developed in RStudio run in container:

```bash

# R 3.6.1

docker build -f Dockerfile-R3.6.1 -t kfoltynski/cachemer:3.6.1 .

# or R 4.0.0

docker build -f Dockerfile-R4.0.0 -t kfoltynski/cachemer:4.0.0 .

# run container with RStudio listening on 8790

# R 3.6.1

docker run -d --name cachemer --rm -v $(PWD):/mnt/vol -w /mnt/vol -p 8790:8787 kfoltynski/cachemer:3.6.1

# R 4.0.0

docker run -d --name cachemer --rm -v $(PWD):/mnt/vol -w /mnt/vol -p 8790:8787 kfoltynski/cachemer:4.0.0

```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/Tazovsky/cachemeR

Awesome Lists containing this project

README