https://github.com/kylebaron/mrgsim.parallel

Parallel simulation with mrgsolve and futures
https://github.com/kylebaron/mrgsim.parallel

future mrgsolve parallelization

Last synced: 4 months ago
JSON representation

Parallel simulation with mrgsolve and futures

Host: GitHub
URL: https://github.com/kylebaron/mrgsim.parallel
Owner: kylebaron
Created: 2019-09-27T16:04:37.000Z (over 6 years ago)
Default Branch: main
Last Pushed: 2025-09-10T17:50:51.000Z (6 months ago)
Last Synced: 2025-10-22T03:55:29.099Z (5 months ago)
Topics: future, mrgsolve, parallelization
Language: R
Homepage: https://kylebaron.github.io/mrgsim.parallel
Size: 658 KB
Stars: 5
Watchers: 4
Forks: 1
Open Issues: 4
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md

Awesome Lists containing this project

README

          ---

title: ""

output: github_document

---

```{r,setup,include=FALSE}

knitr::opts_chunk$set(comment = '.', message=FALSE, warning = FALSE, 

                      fig.path="man/figures/README-")

```

# mrgsim.parallel

## Overview

mrgsolve.parallel facilitates parallel simulation with 

mrgsolve in R.  The future and parallel packages provide the parallelization.  

There are 2 main workflows:

1. Split a `data_set` into chunks by ID, simulate the chunks in parallel, then 

   assemble the results back to a single data frame.

1. Split an `idata_set` (individual-level parameters) into chunks by row, 

   simulate the chunks in parallel, then assemble the results back to a single

   data frame.

The nature of the parallel backend requires some overhead to get the 

parallel simulation done.  So, it will take a reasonably-sized job to see 

a speed increase and small jobs will likely take *longer* with parallelization.

But jobs taking more than a handful of seconds could benefit from this type 

of parallelization.

```{r,include = FALSE}

options(mrgsolve.soloc = "build")

```

## Backend 

```{r}

library(dplyr)

library(future)

library(mrgsim.parallel)

options(future.fork.enable = TRUE, parallelly.fork.enable = TRUE, mc.cores = 4L)

```

## First workflow: split and simulate a data set

```{r}

mod <- modlib("pk2cmt", end = 168*8, delta = 1)

data <- expand.ev(amt = 100*seq(1,2000), ii = 24, addl = 27*2+2) 

data <- mutate(data, CL = runif(n(), 0.7, 1.3))

head(data)

dim(data)

```

We can simulate in parallel with the future package or the parallel package like this:

```{r}

plan(multisession, workers = 4L)

system.time(ans1 <- future_mrgsim_d(mod, data, nchunk = 4L))

plan(multicore, workers = 4L)

system.time(ans1b <- future_mrgsim_d(mod, data, nchunk = 4L))

system.time(ans2 <- mc_mrgsim_d(mod, data, nchunk = 4L))

```

To compare an identical simulation done without parallelization

```{r}

system.time(ans3 <- mrgsim_d(mod,data))

```

```{r}

identical(ans2,as.data.frame(ans3))

```

## Second workflow: split and simulate a batch of parameters

Backend and the model

```{r}

plan(multisession, workers = 6)

mod <- modlib("pk1cmt", end = 168*4, delta = 1)

```

For this workflow, we have a set of parameters (`idata`) along with an 

event object that gets applied to all of the parameters

```{r}

idata <- tibble(CL = runif(4000, 0.5, 1.5), ID = seq_along(CL))

head(idata)

```

```{r}

dose <- ev(amt = 100, ii = 24, addl = 27)

dose

```

Run it in parallel

```{r}

system.time(ans1 <- mc_mrgsim_ei(mod, dose, idata, nchunk = 6))

```

And without parallelization

```{r}

system.time(ans2 <- mrgsim_ei(mod, dose, idata, output = "df"))

identical(ans1,ans2)

```

## Utility functions 

You can access the chunking functions for your own parallel workflows

```{r}

dose <- ev_seq(ev(amt = 100), ev(amt = 50, ii = 12, addl = 2))

dose <- ev_rep(dose, 1:5)

dose

chunk_by_id(dose, nchunk = 2)

```

See also: `chunk_by_row`

## Do a dry run to check the overhead of parallelization

```{r}

plan(transparent)

system.time(x <- fu_mrgsim_d(mod, data, nchunk = 8, .dry = TRUE))

plan(multisession, workers = 8L)

system.time(x <- fu_mrgsim_d(mod, data, nchunk = 8, .dry = TRUE))

```

## Pass a function to post process on the worker

First check the range of times from the previous example

```{r}

summary(ans1$time)

```

The post-processing function has arguments the simulated data and the 

model object

```{r}

post <- function(sims, mod) {

  filter(sims, time > 600)  

}

dose <- ev(amt = 100, ii = 24, addl = 27)

ans3 <- mc_mrgsim_ei(mod, dose, idata, nchunk = 6, .p = post)

```

```{r}

summary(ans3$time)

```

The main use case here is to summarize or some how decrease the volume of data

before returning the combined simulations.  In case memory is able to handle

the simulation volume, this post-processing could be done on the combined

data as well.



## More info

See [inst/docs/stories.md (on GitHub only)](inst/docs/stories.md) for more details.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/kylebaron/mrgsim.parallel

Awesome Lists containing this project

README