https://github.com/HenrikBengtsson/future.apply
:rocket: R package: future.apply - Apply Function to Elements in Parallel using Futures
- Host: GitHub
- URL: https://github.com/HenrikBengtsson/future.apply
- Owner: HenrikBengtsson
- Created: 2017-08-31T00:28:31.000Z (about 7 years ago)
- Default Branch: develop
- Last Pushed: 2024-01-13T21:47:15.000Z (8 months ago)
- Last Synced: 2024-01-30T11:14:43.462Z (8 months ago)
- Topics: asynchronous, distributed-computing, future, hpc, hpc-clusters, package, parallel, parallel-computing, parallel-processing, parallelization, programming, r
- Language: R
- Homepage: https://future.apply.futureverse.org
- Size: 949 KB
- Stars: 203
- Watchers: 11
- Forks: 16
- Open Issues: 28
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
Awesome Lists containing this project
- jimsghstars - HenrikBengtsson/future.apply - :rocket: R package: future.apply - Apply Function to Elements in Parallel using Futures (R)
README
# future.apply: Apply Function to Elements in Parallel using Futures
## Introduction
The purpose of this package is to provide worry-free parallel alternatives to base-R "apply" functions, e.g. `apply()`, `lapply()`, and `vapply()`. The goal is that one should be able to replace any of these with its futurized equivalent and things will just work. For example, instead of doing:
```r
library(datasets)
library(stats)
y <- lapply(mtcars, FUN = mean, trim = 0.10)
```
one can do:
```r
library(future.apply)
plan(multisession) ## Run in parallel on local computer

library(datasets)
library(stats)
y <- future_lapply(mtcars, FUN = mean, trim = 0.10)
```

Reproducibility is part of the core design, which means that perfect, parallel random number generation (RNG) is supported regardless of the amount of chunking, type of load balancing, and future backend being used. To enable parallel RNG, use argument `future.seed = TRUE`.
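As a minimal sketch of what this reproducibility claim means in practice, the example below draws random numbers with a fixed integer seed under different chunking and on different backends, and then compares the results. The helper `draw()` and the seed value `42L` are illustrative choices, not part of the package; passing an integer to `future.seed` (instead of `TRUE`) is used here only to pin the RNG state so that the runs can be compared.

```r
library(future.apply)

## Hypothetical helper: draw one standard-normal number per element
draw <- function(i) rnorm(1)

## Two local background R sessions
plan(multisession, workers = 2)

## Same seed, default chunking vs. one element per chunk
y_default <- future_lapply(1:6, FUN = draw, future.seed = 42L)
y_chunked <- future_lapply(1:6, FUN = draw, future.seed = 42L,
                           future.chunk.size = 1L)

## Same seed on a different backend
plan(sequential)
y_seq <- future_lapply(1:6, FUN = draw, future.seed = 42L)

## Both comparisons are expected to hold by design
print(identical(y_default, y_chunked))
print(identical(y_default, y_seq))
```

With `future.seed = TRUE` the initial seed is not fixed by the call itself, so an integer seed is the simpler choice when two runs need to be compared side by side.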
## Role
Where does the **[future.apply]** package fit in the software stack? You can think of it as a sibling to **[foreach]**, **[furrr]**, **[BiocParallel]**, **[plyr]**, etc. Just as **parallel** provides `parLapply()`, **foreach** provides `foreach()`, **BiocParallel** provides `bplapply()`, and **plyr** provides `llply()`, **future.apply** provides `future_lapply()`. Below is a table summarizing this idea:

| Package | Functions | Backends |
|---|---|---|
| **future.apply** | Future-versions of common goto `*apply()` functions available in base R (of the **base** package): `future_apply()`, `future_by()`, `future_eapply()`, `future_lapply()`, `future_Map()`, `future_mapply()`, `future_.mapply()`, `future_replicate()`, `future_sapply()`, `future_tapply()`, and `future_vapply()`. The following function is not implemented: `future_rapply()` | All **future** backends |
| **parallel** | `mclapply()`, `mcmapply()`, `clusterMap()`, `parApply()`, `parLapply()`, `parSapply()`, ... | Built-in and conditional on operating system |
| **foreach** | `foreach()`, `times()` | All **future** backends via **doFuture** |
| **furrr** | `future_imap()`, `future_map()`, `future_pmap()`, `future_map2()`, ... | All **future** backends |
| **BiocParallel** | Bioconductor's parallel mappers: `bpaggregate()`, `bpiterate()`, `bplapply()`, and `bpvec()` | All **future** backends via **doFuture** (because it supports **foreach**) or via **BiocParallel.FutureParam** (direct BiocParallelParam support; prototype) |
| **plyr** | `**ply(..., .parallel = TRUE)` functions: `aaply()`, `ddply()`, `dlply()`, `llply()`, ... | All **future** backends via **doFuture** (because it uses **foreach** internally) |

Note that, except for the built-in **parallel** package, none of these higher-level APIs implement their own parallel backends; rather, they enhance existing ones. The **foreach** framework leverages backends such as **[doParallel]**, **[doMC]** and **[doFuture]**, and the **future.apply** framework leverages the **[future]** ecosystem and therefore backends such as built-in **parallel**, **[future.callr]**, and **[future.batchtools]**.
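To illustrate this separation between the apply-style API and the backend, here is a small sketch in which the same `future_sapply()` call runs under two different plans. The function `slow_square()` and the worker count are made up for the example. The commented-out line assumes that the **future.callr** package is installed.

```r
library(future.apply)

x <- 1:8

## Hypothetical workload: pause briefly, then square the input
slow_square <- function(i) {
  Sys.sleep(0.01)
  i^2
}

## Backend 1: run sequentially in the current R session
plan(sequential)
y_seq <- future_sapply(x, slow_square)

## Backend 2: run in two background R sessions on the local machine
plan(multisession, workers = 2)
y_par <- future_sapply(x, slow_square)

## Other backends plug in the same way, e.g. (requires future.callr):
## plan(future.callr::callr)

## Reset to the default backend
plan(sequential)

print(identical(y_seq, y_par))
```

Only the `plan()` call changes between the two runs; the code that does the actual work stays the same.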
Separating `future_lapply()` and friends from the **[future]** package helps clarify the purpose of the **future** package, which is to define and provide the core Future API that higher-level parallel APIs can build on and into which any futurized parallel backend can be plugged.
The API and identity of the **future.apply** package will be kept close to the `*apply()` functions in base R. In other words, it will _neither_ keep growing nor be expanded with new, more powerful apply-like functions beyond those core ones in base R. Such extended functionality should be part of a separate package.
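As a rough illustration of how closely the futurized functions follow their base-R counterparts, the sketch below calls a few base `*apply()` functions and their `future_*()` equivalents with the same arguments. The input list `x` is made-up example data.

```r
library(future.apply)
plan(multisession, workers = 2)

## Made-up example data
x <- list(a = c(1, 2, 3), b = c(4, 5, 6))

## sapply() and its futurized equivalent take the same arguments
s_base <- sapply(x, FUN = sum)
s_fut  <- future_sapply(x, FUN = sum)

## vapply() requires a template for the return value; so does future_vapply()
v_base <- vapply(x, FUN = range, FUN.VALUE = numeric(2))
v_fut  <- future_vapply(x, FUN = range, FUN.VALUE = numeric(2))

## mapply() iterates over several inputs in parallel; so does future_mapply()
m_base <- mapply(function(a, b) a + b, 1:3, 4:6)
m_fut  <- future_mapply(function(a, b) a + b, 1:3, 4:6)

## Reset to the default backend
plan(sequential)

print(identical(s_base, s_fut))
print(all.equal(v_base, v_fut))
print(identical(m_base, m_fut))
```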
[batchtools]: https://cran.r-project.org/package=batchtools
[BiocParallel]: https://bioconductor.org/packages/BiocParallel/
[doFuture]: https://cran.r-project.org/package=doFuture
[doMC]: https://cran.r-project.org/package=doMC
[doParallel]: https://cran.r-project.org/package=doParallel
[foreach]: https://cran.r-project.org/package=foreach
[future]: https://cran.r-project.org/package=future
[future.apply]: https://cran.r-project.org/package=future.apply
[future.batchtools]: https://cran.r-project.org/package=future.batchtools
[future.callr]: https://cran.r-project.org/package=future.callr
[furrr]: https://cran.r-project.org/package=furrr
[plyr]: https://cran.r-project.org/package=plyr

## Installation
R package future.apply is available on [CRAN](https://cran.r-project.org/package=future.apply) and can be installed in R as:
```r
install.packages("future.apply")
```

### Pre-release version
To install the pre-release version that is available in Git branch `develop` on GitHub, use:
```r
remotes::install_github("HenrikBengtsson/future.apply", ref="develop")
```
This will install the package from source.

## Contributing
To contribute to this package, please see [CONTRIBUTING.md](CONTRIBUTING.md).