{"id":26774463,"url":"https://github.com/elilillyco/rfacts","last_synced_at":"2025-04-15T22:56:42.340Z","repository":{"id":55985388,"uuid":"276255946","full_name":"EliLillyCo/rfacts","owner":"EliLillyCo","description":"Call FACTS from R on Linux","archived":false,"fork":false,"pushed_at":"2022-08-19T13:02:00.000Z","size":387,"stargazers_count":7,"open_issues_count":1,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-15T22:56:37.055Z","etag":null,"topics":["clinical-trials","facts","r","simulation"],"latest_commit_sha":null,"homepage":"https://EliLillyCo.github.io/rfacts","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EliLillyCo.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-07-01T02:20:29.000Z","updated_at":"2025-01-10T14:08:03.000Z","dependencies_parsed_at":"2022-08-15T10:50:32.650Z","dependency_job_id":null,"html_url":"https://github.com/EliLillyCo/rfacts","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EliLillyCo%2Frfacts","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EliLillyCo%2Frfacts/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EliLillyCo%2Frfacts/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EliLillyCo%2Frfacts/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/EliLillyCo","download_url":"https://codeload.github.com/EliLillyCo/rfacts/tar.gz/refs/heads/main","host":{"name":"Gi
tHub","url":"https://github.com","kind":"github","repositories_count":249167439,"owners_count":21223505,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clinical-trials","facts","r","simulation"],"created_at":"2025-03-29T02:16:27.519Z","updated_at":"2025-04-15T22:56:42.314Z","avatar_url":"https://github.com/EliLillyCo.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n```{r setup, include = FALSE}\nsuppressPackageStartupMessages(library(dplyr))\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n\n# rfacts\n\n[![cran](https://www.r-pkg.org/badges/version/rfacts)](https://cran.r-project.org/package=rfacts)\n[![active](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)\n[![check](https://github.com/EliLillyCo/rfacts/workflows/check/badge.svg)](https://github.com/EliLillyCo/rfacts/actions?query=workflow%3Acheck)\n[![lint](https://github.com/EliLillyCo/rfacts/workflows/lint/badge.svg)](https://github.com/EliLillyCo/rfacts/actions?query=workflow%3Alint)\n\nThe rfacts package is an R interface to the [Fixed and Adaptive Clinical Trial Simulator (FACTS)](https://www.berryconsultants.com/software/) on Unix-like systems. It programmatically invokes [FACTS](https://www.berryconsultants.com/software/) to run clinical trial simulations, and it aggregates simulation output data into tidy data frames. 
These capabilities provide end-to-end automation for large-scale simulation workflows, and they enhance computational reproducibility. For more information, please visit the [documentation website](https://elilillyco.github.io/rfacts/).\n\n## Disclaimer\n\n`rfacts` is neither a product of nor supported by [Berry Consultants](https://www.berryconsultants.com/). The code base of `rfacts` is completely independent from that of [FACTS](https://www.berryconsultants.com/software/), and the former only invokes the latter through dynamic system calls.\n\n## Limitations\n\n* FACTS files prior to version 6.2.4 are unsupported.\n* `rfacts` only works on Unix-like systems.\n* `rfacts` requires paths to pre-compiled versions of Mono, FLFLL, and the FACTS Linux engines. See the installation instructions below and the [configuration guide](https://elilillyco.github.io/rfacts/articles/config.html).\n\n## Installation\n\nTo install the latest release from CRAN, open R and run the following.\n\n```{r, eval = FALSE}\ninstall.packages(\"rfacts\")\n```\n\nTo install the latest development version:\n\n```{r, eval = FALSE}\ninstall.packages(\"remotes\")\nremotes::install_github(\"EliLillyCo/rfacts\")\n```\n\nNext, set the `RFACTS_PATHS` environment variable appropriately. For instructions, please see the [configuration guide](https://elilillyco.github.io/rfacts/articles/config.html).\n\n## Run FACTS simulations\n\nFirst, create a `*.facts` XML file using the [FACTS](https://www.berryconsultants.com/software/) GUI. The `rfacts` package has several built-in examples, included with permission from Berry Consultants LLC.\n\n```{r}\nlibrary(rfacts)\n\n# get_facts_file_example() returns the path to\n# an example FACTS file bundled with rfacts.\n# For FACTS files you create yourself in the FACTS GUI,\n# you can skip get_facts_file_example().\nfacts_file \u003c- get_facts_file_example(\"contin.facts\")\n\nbasename(facts_file)\n```\n\nThen, run trial simulations with `run_facts()`. 
By default, the results are written to a temporary directory. Set the `output_path` argument to customize the path.\n\n```{r}\nout \u003c- run_facts(\n  facts_file,\n  n_sims = 2,\n  verbose = FALSE\n)\n\nout\n\nhead(get_csv_files(out))\n```\n\nUse `read_patients()` to read and aggregate all the `patients*.csv` files. `rfacts` has several such functions, including `read_weeks()` and `read_mcmc()`.\n\n```{r}\nread_patients(out)\n```\n\n## The simulation process\n\n`run_facts()` has two sequential stages:\n\n1. `run_flfll()`: generate the `*.param` files and the folder structure for the FACTS Linux engines.\n2. `run_engine()`: execute the instructions in the `*.param` files to conduct trial simulations and produce CSV output.\n\n```{r}\nout \u003c- run_flfll(facts_file, verbose = FALSE)\nrun_engine(facts_file, param_files = out, n_sims = 4, verbose = FALSE)\nread_patients(out)\n```\n\n`run_engine()` automatically detects the Linux engine required for your FACTS file. If you know the engine in advance, you can use a specific engine function such as `run_engine_contin()` or `run_engine_dichot()`.\n\n```{r}\nout \u003c- run_flfll(facts_file, verbose = FALSE)\nrun_engine_contin(param_files = out, n_sims = 4, verbose = FALSE)\nread_patients(out)\n```\n\nIf you are unsure which engine function to use, call `get_facts_engine()`.\n\n```{r}\nget_facts_engine(facts_file)\n```\n\n## Run a single scenario\n\nIf we take control of the simulation process, we can pick and choose which FACTS simulation scenarios to run and read.\n\n```{r}\n# Example FACTS file built into rfacts.\nfacts_file \u003c- get_facts_file_example(\"contin.facts\")\n\n# Set up the files for the scenarios.\nparam_files \u003c- run_flfll(facts_file, verbose = FALSE)\n\n# Each scenario has its own folder with internal parameter files.\nscenarios \u003c- get_param_dirs(param_files) # not in rfacts \u003c= 1.0.0\nscenarios\n\n# Let's pick one of those scenarios and run the simulations.\nscenario \u003c- 
scenarios[1]\nrun_engine_contin(scenario, n_sims = 2, verbose = FALSE)\nread_patients(scenario)\n```\n\n## Parallel computing\n\nrfacts makes it straightforward to parallelize across simulations. First, use `run_flfll()` to create a directory of param files. The example below uses a `tempfile()` to store the param files (i.e. `output_path`).  However, for distributed computing on traditional HPC clusters, `output_path` should be a directory path that all nodes can access.\n\n```{r}\nlibrary(rfacts)\nfacts_file \u003c- get_facts_file_example(\"contin.facts\")\n# On traditional HPC clusters, this should be a shared directory\n# instead of a temp directory:\ntmp \u003c- fs::dir_create(tempfile())\nparam_files \u003c- file.path(tmp, \"param_files\")\nrun_flfll(facts_file, param_files)\n```\n\nNext, write a custom function that accepts the param files, runs a single simulation for each param file, and returns the important data in memory. Be sure to set a unique seed for each simulation iteration.\n\n```{r}\nsim_once \u003c- function(iter, param_files) {\n  # Copy param files to a temp file in order to\n  # (1) Avoid race conditions in parallel processing, and\n  # (2) Make things run faster: temp files are on local node storage.\n  out \u003c- tempfile()\n  fs::dir_copy(path = param_files, new_path = out)\n  \n  # Run the engine once per param file.\n  run_engine_contin(out, n_sims = 1L, seed = iter)\n  \n  # Return aggregated patients files.\n  read_patients(out) # Reads fast because `out` is a tempfile().\n}\n```\n\nAt this point, we should test this function locally without parallel computing.\n\n```{r}\nlibrary(dplyr)\n\n# All the patients files were named patients00001.csv,\n# so do not trust the facts_sim column.\n# For data post-processing, use the facts_id column instead.\nlapply(seq_len(4), sim_once, param_files = param_files) %\u003e%\n  bind_rows()\n```\n\nParallel computing happens when we call `sim_once()` repeatedly over several parallel workers. 
A powerful and convenient parallel computing solution is [`clustermq`](https://mschubert.github.io/clustermq/). Here is a sketch of how to use it with `rfacts`. `mclapply()` from the `parallel` package is a quick and dirty alternative.\n\n```{r, eval = FALSE}\n# Configure clustermq to use your scheduler and your template file.\n# If you are using a scheduler like SGE, you need to write a template file\n# like clustermq.tmpl. To learn how, visit\n# https://mschubert.github.io/clustermq/articles/userguide.html#configuration-1\noptions(clustermq.scheduler = \"sge\", clustermq.template = \"clustermq.tmpl\")\n\n# Run the computation.\nlibrary(clustermq)\npatients \u003c- Q(\n  fun = sim_once,\n  iter = seq_len(50),\n  # The name must match the param_files argument of sim_once().\n  const = list(param_files = param_files),\n  pkgs = c(\"fs\", \"rfacts\"),\n  n_jobs = 4\n) %\u003e%\n  bind_rows()\n\n# Show aggregated patient data.\npatients\n```\n\nAlternatives to `clustermq` include `parallel::mclapply()`, `furrr::future_map()`, and `future.apply::future_lapply()`.\n\n## Helpers\n\nVarious `get_facts_*()` functions interrogate FACTS files.\n\n```{r}\nget_facts_scenarios(facts_file)\nget_facts_version(facts_file)\nget_facts_versions()\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felilillyco%2Frfacts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Felilillyco%2Frfacts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felilillyco%2Frfacts/lists"}