https://github.com/jackmwolf/tehtuner
An R Package to Fit and Tune Models to Detect Treatment Effect Heterogeneity
- Host: GitHub
- URL: https://github.com/jackmwolf/tehtuner
- Owner: jackmwolf
- License: gpl-3.0
- Created: 2021-11-06T21:50:51.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2023-06-27T03:31:10.000Z (over 1 year ago)
- Last Synced: 2023-11-20T10:44:59.597Z (about 1 year ago)
- Topics: clinical-trials, heterogeneity-of-treatment-effect, r, subgroup-identification
- Language: R
- Homepage:
- Size: 1010 KB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.Rmd
- License: LICENSE.md
---
output: github_document
editor_options:
  chunk_output_type: console
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

library(ggplot2)
library(ggtext)
library(magrittr)
library(kableExtra)
library(Cairo)
library(rpart.plot)
```

# tehtuner

[![CRAN status](https://www.r-pkg.org/badges/version/tehtuner)](https://CRAN.R-project.org/package=tehtuner)
[![DOI](https://joss.theoj.org/papers/10.21105/joss.05453/status.svg)](https://doi.org/10.21105/joss.05453)
[![R-CMD-check](https://github.com/jackmwolf/tehtuner/workflows/R-CMD-check/badge.svg)](https://github.com/jackmwolf/tehtuner/actions)

The goal of `tehtuner` is to fit models that detect and describe
treatment effect heterogeneity (TEH) while controlling the Type-I error of falsely
detecting a differential effect when the conditional average treatment effect is
uniform across the study population.

Currently `tehtuner` supports Virtual Twins models (Foster
et al., 2011) for detecting TEH using the permutation procedure proposed in Wolf et al. (2022).

Virtual Twins is a two-step approach to detecting differential treatment
effects. Subjects' conditional average treatment effects (CATEs) are first
estimated in Step 1 using a flexible model. Then, a simple and interpretable
model is fit in Step 2 to model these estimated CATEs as a function of the
covariates.

The Step 2 model depends on a tuning parameter. This parameter is
selected to control the Type-I error rate by permuting the data under the
null hypothesis of a constant treatment effect and identifying the minimal
null penalty parameter (MNPP), which is the smallest penalty parameter that
yields a Step 2 model with no covariate effects. The $1-\alpha$ quantile
of the null distribution of the MNPP is then used to fit the Step 2 model on the original
data.
In doing so, the Type-I error rate is controlled at $\alpha$.

## Installation

`tehtuner` is available on [CRAN](https://CRAN.R-project.org); you can download the release version with:
``` r
install.packages("tehtuner")
```

You can download the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("jackmwolf/tehtuner")
```
## Example

We consider simulated data from a small clinical trial with 1000 subjects.
Each subject has 10 measured covariates, 8 continuous and 2 binary.
We are interested in estimating and understanding the CATE through Virtual Twins.

```{r}
library(tehtuner)
data("tehtuner_example")
```

We will consider a Virtual Twins model that uses a random forest to estimate the CATEs in Step 1 and then fits a regression tree to the estimated CATEs in Step 2, with the Type-I error rate set at $\alpha = 0.2$.
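
To make the two steps concrete, here is a minimal hand-rolled sketch. It rests on several assumptions: a linear model with treatment-by-covariate interactions stands in for the random forest, the data are a hypothetical simulated trial (`dat`, `x1`, `x2`, `trt`, and `y` are illustrative names, not columns of `tehtuner_example`), and `cp = 0.05` is an arbitrary penalty rather than one chosen by permutation. It is not the package's implementation.

```r
library(rpart)

set.seed(1)
n <- 400

# Hypothetical simulated trial: treatment only helps when x1 > 0
dat <- data.frame(
  x1  = rnorm(n),
  x2  = rnorm(n),
  trt = rbinom(n, 1, 0.5)
)
dat$y <- 1 + 0.5 * dat$x2 + 2 * dat$trt * (dat$x1 > 0) + rnorm(n)

# Step 1: outcome model with treatment-by-covariate interactions
# (a linear stand-in for the random forest, assumed for illustration)
step1 <- lm(y ~ trt * (x1 + x2), data = dat)

# Predict every subject's outcome under both arms and take the difference
cate_hat <- predict(step1, newdata = transform(dat, trt = 1)) -
  predict(step1, newdata = transform(dat, trt = 0))

# Step 2: regression tree for the estimated CATEs as a function of covariates
dat$cate_hat <- cate_hat
step2 <- rpart(cate_hat ~ x1 + x2, data = dat, cp = 0.05)
print(step2)
```

In the package, `tunevt()` carries out both steps and selects the Step 2 penalty by permutation:
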
```{r cache=TRUE}
set.seed(100)
vt_cate <- tunevt(
  data = tehtuner_example, Y = "Y", Trt = "Trt", step1 = "randomforest",
  step2 = "rtree", alpha0 = 0.2, p_reps = 100, ntree = 50
)
vt_cate
```

The fitted Step 2 model can be accessed via `$vtmod`.
In this case, as we used a regression tree in Step 2, our final model is an `rpart` object (see `?rpart::rpart.object`).

```{r dev='CairoPNG', warning = FALSE}
vt_cate$vtmod

rpart.plot::rpart.plot(vt_cate$vtmod, digits = -2)
```

The fitted model for the CATE is a function of the covariates (`V1` and `V3`), so we would conclude that there is treatment effect heterogeneity at the 20% level.
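
Whether a fitted tree uses any covariates can also be checked programmatically. The sketch below assumes a hypothetical helper, `tree_split_vars()`, and uses a stand-in tree fit to the built-in `mtcars` data so that it runs without `tehtuner`; with the package, the same helper could be applied to `vt_cate$vtmod`.

```r
library(rpart)

# Hypothetical helper (not part of tehtuner): names of the covariates
# used in at least one split of an rpart tree
tree_split_vars <- function(tree) {
  vars <- as.character(tree$frame$var)
  unique(vars[vars != "<leaf>"])
}

# Stand-in tree on a built-in data set
toy_tree <- rpart(mpg ~ wt + hp + qsec, data = mtcars)
tree_split_vars(toy_tree)

# With a large enough penalty the tree has no splits, which in the
# Virtual Twins setting corresponds to no detected heterogeneity
stump <- rpart(mpg ~ wt + hp + qsec, data = mtcars, cp = 1)
length(tree_split_vars(stump)) == 0
```
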
We can also look at the null distribution of the MNPP through `vt_cate$theta_null`.
The 80th percentile of $\hat\theta$ under the null hypothesis is
```{r}
quantile(vt_cate$theta_null, 0.8)
```

while the MNPP of our observed data is
```{r}
vt_cate$mnpp
```

The procedure fit the Step 2 model using the 80th percentile of the null distribution as the penalty parameter, which resulted in a model that included covariates because the observed MNPP was above that percentile.
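
The comparison described above can be sketched in a few lines of base R. The values here are hypothetical stand-ins (`theta_null` is simulated and `mnpp_obs` is made up); with a fitted object they would be `vt_cate$theta_null` and `vt_cate$mnpp`. The permutation-style p-value at the end is an optional summary added for illustration, not something the package is claimed to report.

```r
set.seed(42)
alpha <- 0.2

# Hypothetical stand-ins for the null MNPPs and the observed MNPP
theta_null <- rbeta(100, 2, 8)
mnpp_obs <- 0.45

# Critical value: the (1 - alpha) quantile of the null MNPPs
crit <- quantile(theta_null, 1 - alpha)

# The Step 2 model retains covariates when the observed MNPP exceeds it
detect_teh <- unname(mnpp_obs > crit)

# Share of null MNPPs at least as large as the observed one
p_perm <- mean(theta_null >= mnpp_obs)

c(critical_value = unname(crit), detect_teh = detect_teh, p_perm = p_perm)
```
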
```{r mnpp_plot, dev='CairoPNG', echo = FALSE}
ggplot(mapping = aes(x = vt_cate$theta_null, y = after_stat(density))) +
  geom_histogram(color = "black", fill = "white", binwidth = 0.025) +
  theme_minimal() +
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  geom_vline(
    aes(xintercept = c(vt_cate$mnpp, quantile(vt_cate$theta_null, 0.8))),
    color = c("#0072B2", "#D55E00"),
    linewidth = 2,
    linetype = 2
  ) +
  labs(
    y = "Density",
    x = expression(hat(theta)),
    title = expression("Sampling distribution of" ~ hat(theta) ~ "under" ~ H[0]),
    subtitle = paste0(
      "",
      "80th quantile (critical value): ", formatC(quantile(vt_cate$theta_null, 0.8), digits = 2, format = "f"),
      "",
      "; ",
      "",
      "Observed MNPP: ", formatC(vt_cate$mnpp, 2, format = "f"),
      ""
    )
  ) +
  theme(
    plot.subtitle = element_markdown(size = 12)
  )
```

### Running in Parallel

Version `0.2.0` added the `parallel` option to `tunevt()`, which allows the user to perform the permutation procedure in parallel to reduce computation time.
Before doing so, you must register a parallel backend; see `?foreach::foreach` for more information.

For example, to carry out 100 permutations across 2 processors:
```{r eval = FALSE}
# Start a 2-worker cluster and register it as the foreach backend
cl <- parallel::makeCluster(2)
doParallel::registerDoParallel(cl)

vt_cate_parallel <- tunevt(
  data = tehtuner_example, Y = "Y", Trt = "Trt", step1 = "randomforest",
  step2 = "rtree", alpha0 = 0.2, p_reps = 100, ntree = 50, parallel = TRUE
)

# Shut the workers down when finished
parallel::stopCluster(cl)
```

## References

- Foster, J. C., Taylor, J. M., & Ruberg, S. J. (2011). Subgroup identification from randomized clinical trial data. _Statistics in Medicine, 30_(24), 2867–2880. https://doi.org/10.1002/sim.4322
- Wolf, J. M., Koopmeiners, J. S., & Vock, D. M. (2022). A permutation procedure to detect heterogeneous treatment effects in randomized clinical trials while controlling the type-I error rate. _Clinical Trials, 19_(5). https://doi.org/10.1177/17407745221095855
- Deng, C., Wolf, J. M., Vock, D. M., Carroll, D. M., Hatsukami, D. K., Leng, N., & Koopmeiners, J. S. (2023). Practical guidance on modeling choices for the virtual twins method. _Journal of Biopharmaceutical Statistics_. https://doi.org/10.1080/10543406.2023.2170404
- Wolf, J. M. (2023). tehtuner: An R package to fit and tune models for the conditional average treatment effect. _Journal of Open Source Software, 8_(86), 5453. https://doi.org/10.21105/joss.05453