https://github.com/mlr-org/rush
Parallel and distributed computing in R.
- Host: GitHub
- URL: https://github.com/mlr-org/rush
- Owner: mlr-org
- Created: 2023-08-10T12:17:33.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-04-28T16:13:00.000Z (7 months ago)
- Last Synced: 2024-05-01T09:37:23.024Z (7 months ago)
- Topics: mlr3, parallel-computing, r
- Language: R
- Homepage: https://rush.mlr-org.com/
- Size: 8.75 MB
- Stars: 4
- Watchers: 5
- Forks: 0
- Open Issues: 8
Metadata Files:
- Readme: README.Rmd
---
output: github_document
---

# rush
*rush* is a package for parallel and distributed computing in R.
It evaluates an R expression asynchronously on a cluster of workers and provides shared storage for the workers.
The shared storage is a [Redis](https://redis.io) database.
Rush offers both a centralized and a decentralized network architecture.
The centralized network has a single controller (`Rush`) and multiple workers (`RushWorker`).
Tasks are created centrally and distributed to workers by the controller.
The decentralized network has no controller.
The workers sample tasks and communicate the results asynchronously with other workers.

# Features
* Parallelize arbitrary R expressions.
* Centralized and decentralized network architecture.
* Small overhead of a few milliseconds per task.
* Easy start of local workers with `processx`.
* Start workers on any platform with a batch script.
* Designed to work with [`data.table`](https://CRAN.R-project.org/package=data.table).
* Results are cached in the R session to minimize read and write operations.
* Detect and recover from worker failures.
* Start heartbeats to monitor workers on remote machines.
* Snapshot the in-memory database to disk.
* Store [`lgr`](https://CRAN.R-project.org/package=lgr) messages of the workers in the Redis database.
* Light on dependencies.

## Install
Install the development version from GitHub.
```{r eval = FALSE}
remotes::install_github("mlr-org/rush")
```

And install [Redis](https://redis.io/docs/latest/operate/oss_and_stack/install/install-stack/).
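
Before starting a network, it can help to confirm that R can actually reach the Redis server. The check below is a minimal sketch, not part of rush itself: it assumes the `redux` package (the client used for the connection in the examples below) is installed and that Redis listens on the default local address. The variable name `redis_ok` is only illustrative.

```{r eval = FALSE}
# Assumption: Redis listens on the default local address (127.0.0.1:6379).
# `redis_ok` is FALSE if redux is not installed or the server does not respond.
redis_ok = requireNamespace("redux", quietly = TRUE) && redux::redis_available()

if (redis_ok) {
  message("Redis is reachable.")
} else {
  message("Redis is not reachable; check that the server is running.")
}
```

If the server runs on another host or port, the connection parameters would have to be supplied explicitly.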
## Centralized Rush Network
![](man/figures/README-flow.png)
*Centralized network with a single controller and three workers.*
```{r, include=FALSE}
config = redux::redis_config()
r = redux::hiredis(config)
r$FLUSHDB()
```

The example below shows the evaluation of a simple function in a centralized network.
The `network_id` identifies the instance and workers in the network.
The `config` is a list of parameters for the connection to Redis.

```{r}
library(rush)

config = redux::redis_config()
rush = Rush$new(network_id = "test", config)

rush
```

Next, we define a function that we want to evaluate on the workers.
```{r}
fun = function(x1, x2, ...) {
  list(y = x1 + x2)
}
```

We start two workers.
```{r}
rush$start_local_workers(fun = fun, n_workers = 2)
```

Now we can push tasks to the workers.
```{r}
xss = list(list(x1 = 3, x2 = 5), list(x1 = 4, x2 = 6))
keys = rush$push_tasks(xss)
rush$wait_for_tasks(keys)
```

And retrieve the results.
```{r}
rush$fetch_finished_tasks()
```

## Decentralized Rush Network
![](man/figures/README-flow-2.png)
*Decentralized network with four workers.*
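
To make the idea of a network without a controller more concrete, the following is a conceptual simulation in base R. It does not use the rush API: an environment stands in for the shared Redis database, the hypothetical helper `worker_step()` plays the role of a worker, and the four "workers" take turns inside a single R process instead of running in parallel. Each worker looks at the results published so far, samples its own task, evaluates it, and writes the result back to the shared storage.

```{r}
# Conceptual simulation only: an environment stands in for the Redis database,
# and the workers take turns in one R process instead of running in parallel.
set.seed(1)

fun = function(x1, x2, ...) {
  list(y = x1 + x2)
}

# Shared storage that every worker can read from and write to.
store = new.env()
store$results = list()

# One step of a hypothetical worker: check what has been published so far,
# sample a task, evaluate it, and publish the result.
worker_step = function(worker_id, store) {
  n_seen = length(store$results)           # results visible to this worker
  xs = list(x1 = runif(1), x2 = runif(1))  # the worker samples its own task
  ys = do.call(fun, xs)                    # task elements are matched to the arguments of fun
  store$results[[n_seen + 1]] = c(list(worker_id = worker_id, n_seen = n_seen), xs, ys)
  invisible(NULL)
}

# Four workers, two steps each, interleaved round-robin.
for (step in 1:2) {
  for (worker_id in 1:4) {
    worker_step(worker_id, store)
  }
}

# Collect the shared results into a table with one row per evaluated task.
results = do.call(rbind, lapply(store$results, as.data.frame))
results
```

The `n_seen` column records how many results were already in the shared storage when a task was sampled, which is the information a real worker could use to decide what to evaluate next.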