Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/henrikbengtsson/future.callr

:rocket: R package future.callr: A Future API for Parallel Processing using 'callr'
https://github.com/henrikbengtsson/future.callr

asynchronous futures parallel-computing parallel-processing parallelization programming promises r

Last synced: 7 days ago
JSON representation

:rocket: R package future.callr: A Future API for Parallel Processing using 'callr'

Awesome Lists containing this project

README

        


CRAN check status R CMD check status future.tests checks status Coverage Status

# future.callr: A Future API for Parallel Processing using 'callr'

## Introduction

The **[future]** package provides a generic API for using futures in
R. A future is a simple yet powerful mechanism to evaluate an R
expression and retrieve its value at some point in time. Futures can
be resolved in many different ways depending on which strategy is
used. There are various types of synchronous and asynchronous futures
to choose from in the **[future]** package.

This package, **[future.callr]**, provides a type of futures that
utilizes the **[callr]** package.

For example,

```r
> library("future.callr")
> plan(callr)
>
> x %<-% { Sys.sleep(5); 3.14 }
> y %<-% { Sys.sleep(5); 2.71 }
> x + y
[1] 5.85
```

This is obviously a toy example to illustrate what futures look like
and how to work with them. For further examples on how to use
futures, see the vignettes of the **[future]** package as well as
those of **[future.apply]** and **[doFuture]**.

## Using the callr backend

The **future.callr** package implements a **future** backend wrapper
for **callr**.

| Backend | Description | Alternative in future package
|:--------|:-----------------------------------------------------------------|:------------------------------
| `callr` | parallel evaluation in a separate R process (on current machine) | `plan(multisession)`

### Each callr future uses a fresh R session

When using `callr` futures, each future is resolved in a fresh
background R session which ends as soon as the value of the future has
been collected. In contrast, `multisession` futures are resolved in
background R worker sessions that serve multiple futures over their
life spans. The advantage of using a new R process for each future is
that the R environment is guaranteed not to be contaminated by
previous futures, e.g. memory allocations, finalizers, modified
options, and loaded and attached packages. The disadvantage, is an
added overhead of launching a new R process. (At the moment, I am
neither aware of formal benchmarking of this extra overhead nor of
performance comparisons of `callr` to alternative future backends.)

### More than 125 parallel callr futures

Another advantage with `callr` futures compared to `multisession`
futures is that they do not communicate via R (socket) connections.
This avoids the limitation in the number of parallel futures that can
be active at any time that `multisession` futures and `cluster`
futures in general have, which they inherit from `SOCKcluster`
clusters as defined by the **parallel** package. The number of
parallel futures these can serve is limited by the [maximum number of
open connections in
R](https://github.com/HenrikBengtsson/Wishlist-for-R/issues/28), which
currently is 125 (excluding the three reserved by R itself). Note
that these 125 slots have to be shared with file connections etc. To
increase this limit, R has to be rebuilt from source. However, since
`callr` futures rely on [the callr package which does not make use of
R-specific connections](https://github.com/r-lib/processx/issues/91),
there is no limit in the number of background R processes that can be
used simultaneously.

### No ports are used - no port clashes or firewall issues

A third advantage with `callr` futures, is that there is not risk for
port-clashing with other processes on the system when clusters are set
up (*), because **callr** does not rely on ports. Furthermore, on
Windows, the firewall triggers an alert that the user needs to approve
whenever a not-previously-approved port is requested by R - [which
happens also for local, non-public
ports](https://stackoverflow.com/questions/47353848/localhost-connection-without-firewall-popup/47542866)
that are used by `SOCKcluster`:s. When using `callr` futures, no
sockets and therefore no ports are involved.

(*) To lower the risk for such clashes `SOCKcluster`:s (of the
**parallel** package) request random ports, but clashes still occur at
times.

## Demos

The **[future]** package provides a demo using futures for calculating
a set of Mandelbrot planes. The demo does not assume anything about
what type of futures are used. _The user has full control of how
futures are evaluated_. For instance, to use `callr` futures, run the
demo as:

```r
library("future.callr")
plan(callr)
demo("mandelbrot", package = "future", ask = FALSE)
```

[callr]: https://cran.r-project.org/package=callr
[future]: https://cran.r-project.org/package=future
[future.callr]: https://cran.r-project.org/package=future.callr
[future.apply]: https://cran.r-project.org/package=future.apply
[doFuture]: https://cran.r-project.org/package=doFuture

## Installation
R package future.callr is available on [CRAN](https://cran.r-project.org/package=future.callr) and can be installed in R as:
```r
install.packages("future.callr")
```

### Pre-release version

To install the pre-release version that is available in Git branch `develop` on GitHub, use:
```r
remotes::install_github("HenrikBengtsson/future.callr", ref="develop")
```
This will install the package from source.

## Contributing

To contribute to this package, please see [CONTRIBUTING.md](CONTRIBUTING.md).