https://github.com/elilillyco/rfacts
Call FACTS from R on Linux
clinical-trials facts r simulation
- Host: GitHub
- URL: https://github.com/elilillyco/rfacts
- Owner: EliLillyCo
- License: other
- Created: 2020-07-01T02:20:29.000Z (over 5 years ago)
- Default Branch: main
- Last Pushed: 2022-08-19T13:02:00.000Z (over 3 years ago)
- Last Synced: 2025-04-15T22:56:37.055Z (9 months ago)
- Topics: clinical-trials, facts, r, simulation
- Language: R
- Homepage: https://EliLillyCo.github.io/rfacts
- Size: 378 KB
- Stars: 7
- Watchers: 2
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.Rmd
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
---
output: github_document
---
```{r setup, include = FALSE}
suppressPackageStartupMessages(library(dplyr))
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# rfacts
[CRAN](https://cran.r-project.org/package=rfacts)
[Project status: active](https://www.repostatus.org/#active)
[check](https://github.com/EliLillyCo/rfacts/actions?query=workflow%3Acheck)
[lint](https://github.com/EliLillyCo/rfacts/actions?query=workflow%3Alint)
The rfacts package is an R interface to the [Fixed and Adaptive Clinical Trial Simulator (FACTS)](https://www.berryconsultants.com/software/) on Unix-like systems. It programmatically invokes [FACTS](https://www.berryconsultants.com/software/) to run clinical trial simulations, and it aggregates simulation output data into tidy data frames. These capabilities provide end-to-end automation for large-scale simulation workflows, and they enhance computational reproducibility. For more information, please visit the [documentation website](https://elilillyco.github.io/rfacts/).
## Disclaimer
`rfacts` is neither a product of nor supported by [Berry Consultants](https://www.berryconsultants.com/). The code base of `rfacts` is completely independent from that of [FACTS](https://www.berryconsultants.com/software/), and the former only invokes the latter through dynamic system calls.
## Limitations
* FACTS files prior to version 6.2.4 are unsupported.
* `rfacts` only works on Unix-like systems.
* `rfacts` requires paths to pre-compiled versions of Mono, FLFLL, and the FACTS Linux engines. See the installation instructions below and the [configuration guide](https://elilillyco.github.io/rfacts/articles/config.html).
## Installation
To install the latest release from CRAN, open R and run the following.
```{r, eval = FALSE}
install.packages("rfacts")
```
To install the latest development version:
```{r, eval = FALSE}
install.packages("remotes")
remotes::install_github("EliLillyCo/rfacts")
```
Next, set the `RFACTS_PATHS` environment variable appropriately. For instructions, please see the [configuration guide](https://elilillyco.github.io/rfacts/articles/config.html).
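As a minimal sketch, the variable can be set for the current R session with `Sys.setenv()`. The value below is a hypothetical placeholder, not a real location; the configuration guide describes what `RFACTS_PATHS` should actually contain.

```r
# Hypothetical placeholder value: replace it with the value described
# in the rfacts configuration guide.
rfacts_paths <- "/path/to/your/rfacts/paths"

# Set the variable for the current R session only.
Sys.setenv(RFACTS_PATHS = rfacts_paths)

# Confirm that R can see the variable.
Sys.getenv("RFACTS_PATHS")
```

To make the setting persist across sessions, a line of the form `RFACTS_PATHS=...` can go in `~/.Renviron`, which R reads at startup.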
## Run FACTS simulations
First, create a `*.facts` XML file using the [FACTS](https://www.berryconsultants.com/software/) GUI. The `rfacts` package has several built-in examples, included with permission from Berry Consultants LLC.
```{r}
library(rfacts)
# get_facts_file_example() returns the path to
# an example FACTS file bundled with rfacts.
# For FACTS files you create yourself in the FACTS GUI,
# you can skip get_facts_file_example().
facts_file <- get_facts_file_example("contin.facts")
basename(facts_file)
```
Then, run trial simulations with `run_facts()`. By default, the results are written to a temporary directory. Set the `output_path` argument to customize the path.
```{r}
out <- run_facts(
  facts_file,
  n_sims = 2,
  verbose = FALSE
)
out
head(get_csv_files(out))
```
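To keep the results in a directory of your choosing, a call along the following lines should work. This is a sketch: the directory name `my_output` is hypothetical, and the call to `run_facts()` is guarded so that the snippet does nothing on a machine where `rfacts` and `RFACTS_PATHS` are not set up.

```r
# Hypothetical output directory; any writable path would do.
output_path <- file.path(tempdir(), "my_output")

# Only attempt the run if rfacts is installed and configured.
can_run <- requireNamespace("rfacts", quietly = TRUE) &&
  nzchar(Sys.getenv("RFACTS_PATHS"))

if (can_run) {
  facts_file <- rfacts::get_facts_file_example("contin.facts")
  out <- rfacts::run_facts(
    facts_file,
    output_path = output_path,
    n_sims = 2,
    verbose = FALSE
  )
  # List a few of the CSV files produced by the run.
  head(rfacts::get_csv_files(out))
}
```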
Use `read_patients()` to read and aggregate all the `patients*.csv` files. `rfacts` has several such functions, including `read_weeks()` and `read_mcmc()`.
```{r}
read_patients(out)
```
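To make "read and aggregate" concrete, here is a rough base R sketch of the general idea: find every file matching `patients*.csv` under a directory, read each one, and stack the results with a column recording the source file. It is not the `rfacts` implementation. The helper `stack_patients()`, the `source_file` column, and the toy CSV contents are all made up for illustration, and real FACTS output files may need extra handling that this sketch ignores.

```r
# Hypothetical helper, for illustration only (not part of rfacts).
stack_patients <- function(dir) {
  files <- list.files(
    dir,
    pattern = "^patients.*\\.csv$",
    recursive = TRUE,
    full.names = TRUE
  )
  tables <- lapply(files, function(file) {
    data <- read.csv(file)
    data$source_file <- basename(file) # remember where each row came from
    data
  })
  do.call(rbind, tables)
}

# Toy files standing in for simulation output (made-up contents).
dir <- tempfile()
dir.create(dir)
write.csv(
  data.frame(subject = 1:2, dose = c(0, 1)),
  file.path(dir, "patients00001.csv"),
  row.names = FALSE
)
write.csv(
  data.frame(subject = 1:3, dose = c(1, 0, 1)),
  file.path(dir, "patients00002.csv"),
  row.names = FALSE
)

patients <- stack_patients(dir)
patients
```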
## The simulation process
`run_facts()` has two sequential stages:
1. `run_flfll()`: generate the `*.param` files and the folder structure for the FACTS Linux engines.
2. `run_engine()`: execute the instructions in the `*.param` files to conduct trial simulations and produce CSV output.
```{r}
out <- run_flfll(facts_file, verbose = FALSE)
run_engine(facts_file, param_files = out, n_sims = 4, verbose = FALSE)
read_patients(out)
```
`run_engine()` automatically detects the Linux engine required for your FACTS file. If you know the engine in advance, you can use a specific engine function such as `run_engine_contin()` or `run_engine_dichot()`.
```{r}
out <- run_flfll(facts_file, verbose = FALSE)
run_engine_contin(param_files = out, n_sims = 4, verbose = FALSE)
read_patients(out)
```
If you are unsure which engine function to use, call `get_facts_engine()`.
```{r}
get_facts_engine(facts_file)
```
## Run a single scenario
If we take control of the simulation process, we can pick and choose which FACTS simulation scenarios to run and read.
```{r}
# Example FACTS file built into rfacts.
facts_file <- get_facts_file_example("contin.facts")
# Set up the files for the scenarios.
param_files <- run_flfll(facts_file, verbose = FALSE)
# Each scenario has its own folder with internal parameter files.
scenarios <- get_param_dirs(param_files) # not in rfacts <= 1.0.0
scenarios
# Let's pick one of those scenarios and run the simulations.
scenario <- scenarios[1]
run_engine_contin(scenario, n_sims = 2, verbose = FALSE)
read_patients(scenario)
```
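The same pattern extends to a chosen subset of scenarios. The sketch below loops over scenario directories, runs each one, and reads its results. The helper `run_selected()` is hypothetical (not part of `rfacts`); it takes the run and read steps as function arguments so that it can be tried out with stand-ins on a machine without FACTS.

```r
# Hypothetical helper: run and read each scenario directory in turn.
run_selected <- function(scenarios, run_one, read_one) {
  results <- lapply(scenarios, function(scenario) {
    run_one(scenario)  # e.g. a call to run_engine_contin()
    read_one(scenario) # e.g. read_patients
  })
  names(results) <- basename(scenarios)
  results
}

# Stand-in functions and made-up directory names so the sketch runs anywhere.
fake_run <- function(scenario) invisible(scenario)
fake_read <- function(scenario) data.frame(scenario = basename(scenario))
results <- run_selected(
  c("dir/scenario_a", "dir/scenario_b"),
  run_one = fake_run,
  read_one = fake_read
)
names(results)
```

With `rfacts` configured, the real calls from the example above would slot in as `run_one = function(s) run_engine_contin(s, n_sims = 2, verbose = FALSE)` and `read_one = read_patients`.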
## Parallel computing
`rfacts` makes it straightforward to parallelize across simulations. First, use `run_flfll()` to create a directory of param files. The example below stores the param files (the `output_path`) in a temporary directory created with `tempfile()`. However, for distributed computing on traditional HPC clusters, `output_path` should be a directory path that all nodes can access.
```{r}
library(rfacts)
facts_file <- get_facts_file_example("contin.facts")
# On traditional HPC clusters, this should be a shared directory
# instead of a temp directory:
tmp <- fs::dir_create(tempfile())
param_files <- file.path(tmp, "param_files")
run_flfll(facts_file, param_files)
```
Next, write a custom function that accepts the param files, runs a single simulation for each param file, and returns the important data in memory. Be sure to set a unique seed for each simulation iteration.
```{r}
sim_once <- function(iter, param_files) {
  # Copy param files to a temp directory in order to
  # (1) Avoid race conditions in parallel processing, and
  # (2) Make things run faster: temp files are on local node storage.
  out <- tempfile()
  fs::dir_copy(path = param_files, new_path = out)
  # Run the engine once per param file.
  run_engine_contin(out, n_sims = 1L, seed = iter)
  # Return aggregated patients files.
  read_patients(out) # Reads fast because `out` is a tempfile().
}
```
At this point, we should test this function locally without parallel computing.
```{r}
library(dplyr)
# All the patients files were named patients00001.csv,
# so do not trust the facts_sim column.
# For data post-processing, use the facts_id column instead.
lapply(seq_len(4), sim_once, param_files = param_files) %>%
  bind_rows()
```
Parallel computing happens when we call `sim_once()` repeatedly over several parallel workers. A powerful and convenient parallel computing solution is [`clustermq`](https://mschubert.github.io/clustermq/). `mclapply()` from the `parallel` package is a quick and dirty alternative. Here is a sketch of how to use `clustermq` with `rfacts`.
```{r, eval = FALSE}
# Configure clustermq to use your grid and your template file.
# If you are using a scheduler like SGE, you need to write a template file
# like clustermq.tmpl. To learn how, visit
# https://mschubert.github.io/clustermq/articles/userguide.html#configuration-1
options(clustermq.scheduler = "sge", clustermq.template = "clustermq.tmpl")
# Run the computation.
library(clustermq)
patients <- Q(
  fun = sim_once,
  iter = seq_len(50),
  # The name must match the `param_files` argument of sim_once().
  const = list(param_files = param_files),
  pkgs = c("fs", "rfacts"),
  n_jobs = 4
) %>%
  bind_rows()
# Show aggregated patient data.
patients
```
Alternatives to `clustermq` include `parallel::mclapply()`, `furrr::future_map()`, and `future.apply::future_lapply()`.
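As a rough illustration of the `parallel::mclapply()` route, the sketch below uses a stand-in for `sim_once()` so that it runs without FACTS. The stand-in `sim_once_toy()` and its random draws are made up; the point is only that seeding each iteration with its own `iter` lets a parallel run be compared against a sequential one.

```r
library(parallel)

# Stand-in for sim_once(): hypothetical, draws toy numbers instead of
# calling a FACTS engine.
sim_once_toy <- function(iter) {
  set.seed(iter) # unique, reproducible seed per iteration
  data.frame(iter = iter, value = rnorm(1))
}

# Sequential reference run.
sequential <- do.call(rbind, lapply(seq_len(4), sim_once_toy))

# Same iterations across 2 forked workers (forking needs a Unix-like system).
forked <- do.call(rbind, mclapply(seq_len(4), sim_once_toy, mc.cores = 2))

# Compare the two runs.
all.equal(sequential, forked)
```

With the real function, the call would look like `mclapply(seq_len(50), sim_once, param_files = param_files, mc.cores = 4)`.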
## Helpers
Various `get_facts_*()` functions interrogate FACTS files.
```{r}
get_facts_scenarios(facts_file)
get_facts_version(facts_file)
get_facts_versions()
```