https://github.com/s3alfisc/wildrwolf

Romano-Wolf p-value adjustments for multiple hypotheses testing via the wild bootstrap for objects of type fixest and fixest_multi from the fixest package
fixest multiple-comparisons r romano-wolf wild-bootstrap wild-cluster-bootstrap

Host: GitHub
URL: https://github.com/s3alfisc/wildrwolf
Owner: s3alfisc
License: gpl-3.0
Created: 2021-06-13T15:46:54.000Z (about 5 years ago)
Default Branch: main
Last Pushed: 2024-01-14T11:28:25.000Z (over 2 years ago)
Last Synced: 2024-03-14T22:10:21.913Z (over 2 years ago)
Topics: fixest, multiple-comparisons, r, romano-wolf, wild-bootstrap, wild-cluster-bootstrap
Language: R
Homepage: https://s3alfisc.github.io/wildrwolf/
Size: 4.56 MB
Stars: 6
Watchers: 3
Forks: 0
Open Issues: 3
Metadata Files:
- Readme: README.Rmd
- License: LICENSE.md

---

output: github_document

---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# wildrwolf `r emo::ji("wolf")`

[![R-CMD-check](https://github.com/s3alfisc/wildrwolf/workflows/R-CMD-check/badge.svg)](https://github.com/s3alfisc/wildrwolf/actions)

[![](http://cranlogs.r-pkg.org/badges/last-month/wildrwolf)](https://cran.r-project.org/package=wildrwolf)

[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html)

[![](https://www.r-pkg.org/badges/version/wildrwolf)](https://cran.r-project.org/package=wildrwolf)

![runiverse-package](https://s3alfisc.r-universe.dev/badges/wildrwolf)

[![Codecov test coverage](https://codecov.io/gh/s3alfisc/wildrwolf/branch/main/graph/badge.svg)](https://app.codecov.io/gh/s3alfisc/wildrwolf?branch=main)

The `wildrwolf` package implements Romano-Wolf multiple-hypothesis-adjusted p-values for objects of type `fixest` and `fixest_multi` from the `fixest` package via a wild (cluster) bootstrap. 
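To make the adjustment concrete, here is a minimal base-R sketch of a Romano-Wolf style step-down procedure. It is not the package's implementation: `rw_stepdown()` is a hypothetical helper, and the bootstrap t-statistics are simulated stand-ins that are assumed to have been generated under the null.

```r
# Hypothetical helper (not part of wildrwolf): step-down adjusted p-values
# from S observed t-statistics and a B x S matrix of bootstrap t-statistics
# that are assumed to be drawn under the null hypotheses.
rw_stepdown <- function(t_stat, t_boot) {
  stopifnot(ncol(t_boot) == length(t_stat))
  B <- nrow(t_boot)
  S <- length(t_stat)
  abs_t <- abs(t_stat)
  abs_boot <- abs(t_boot)

  # order hypotheses from most to least significant
  ord <- order(abs_t, decreasing = TRUE)

  p_init <- numeric(S)
  for (j in seq_len(S)) {
    # hypotheses not yet "stepped past"
    remaining <- ord[j:S]
    # bootstrap distribution of the maximum statistic over those hypotheses
    max_boot <- apply(abs_boot[, remaining, drop = FALSE], 1, max)
    p_init[j] <- (sum(max_boot >= abs_t[ord[j]]) + 1) / (B + 1)
  }

  # enforce monotonicity along the ordering, then map back to input order
  p_adj <- numeric(S)
  p_adj[ord] <- cummax(p_init)
  p_adj
}

set.seed(1)
t_boot <- matrix(rnorm(999 * 4), nrow = 999, ncol = 4)  # illustrative draws
t_stat <- c(5.2, 0.8, 3.1, -0.1)                        # illustrative values
p_adj <- rw_stepdown(t_stat, t_boot)
p_adj
```

The adjusted p-values are non-decreasing as the absolute t-statistics shrink, which is the defining feature of the step-down construction.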

Because the bootstrap resampling is based on the [fwildclusterboot](https://github.com/s3alfisc/fwildclusterboot) package, `wildrwolf` is usually fast (see the Performance section below).
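For readers unfamiliar with the wild bootstrap, the following base-R sketch shows the basic idea for a single hypothesis. It is an illustration under simplifying assumptions, not what `fwildclusterboot` does internally: it uses simulated data, no clustering, Rademacher weights, the null imposed on the bootstrap data generating process, and `lm()`'s default (non-robust) t-statistic to stay dependency-free.

```r
set.seed(42)
n <- 200
x <- rnorm(n)
# simulated data: the true slope is 0, errors are heteroskedastic
y <- 1 + 0 * x + rnorm(n) * (1 + abs(x))

# observed t-statistic for H0: slope = 0
t_obs <- coef(summary(lm(y ~ x)))["x", "t value"]

# restricted fit that imposes the null (intercept only)
fit0 <- lm(y ~ 1)
y_hat0 <- fitted(fit0)
u0 <- resid(fit0)

B <- 999
t_boot <- replicate(B, {
  v <- sample(c(-1, 1), n, replace = TRUE)  # Rademacher weights
  y_star <- y_hat0 + u0 * v                 # bootstrap outcome under H0
  coef(summary(lm(y_star ~ x)))["x", "t value"]
})

# two-sided bootstrap p-value
p_boot <- mean(abs(t_boot) >= abs(t_obs))
p_boot
```

Refitting the model `B` times with `lm()` is slow for large problems; avoiding that cost is the reason to rely on a dedicated implementation.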

The package is complementary to [wildwyoung](https://github.com/s3alfisc/wildwyoung) (still work in progress), which implements the multiple hypothesis adjustment method following Westfall and Young (1993).

Adding support for multi-way clustering is work in progress.

## Installation

You can install the released version from CRAN or the development version from [GitHub](https://github.com/); builds are also available from r-universe:

``` r
# released version from CRAN
install.packages("wildrwolf")

# development version from GitHub
# install.packages("devtools")
devtools::install_github("s3alfisc/wildrwolf")

# from r-universe (windows & mac, compiled R > 4.0 required)
install.packages("wildrwolf", repos = "https://s3alfisc.r-universe.dev")
```

## Example I

```{r, warning=FALSE, message = FALSE}
library(wildrwolf)
library(fixest)

set.seed(1412)

N <- 1000
X1 <- rnorm(N)
X2 <- rnorm(N)

# errors are correlated across the four outcomes
rho <- 0.5
sigma <- matrix(rho, 4, 4); diag(sigma) <- 1
u <- MASS::mvrnorm(n = N, mu = rep(0, 4), Sigma = sigma)

Y1 <- 1 + 1 * X1 + X2
Y2 <- 1 + 0.01 * X1 + X2
Y3 <- 1 + 0.4 * X1 + X2
Y4 <- 1 - 0.02 * X1 + X2

# add the correlated errors to each outcome
for (x in 1:4) {
  var_char <- paste0("Y", x)
  assign(var_char, get(var_char) + u[, x])
}

data <- data.frame(Y1 = Y1,
                   Y2 = Y2,
                   Y3 = Y3,
                   Y4 = Y4,
                   X1 = X1,
                   X2 = X2,
                   splitvar = sample(1:2, N, TRUE))

# 4 outcomes x 2 specifications = 8 models
fit <- feols(c(Y1, Y2, Y3, Y4) ~ csw(X1, X2),
             data = data,
             se = "hetero",
             ssc = ssc(cluster.adj = TRUE))

# clean workspace except for fit & data
rm(list = ls()[!(ls() %in% c('fit', 'data'))])

res_rwolf1 <- wildrwolf::rwolf(
  models = fit,
  param = "X1",
  B = 9999
)

# unadjusted p-values, for comparison
pvals <- lapply(fit, function(x) pvalue(x)["X1"]) |> unlist()

# Romano-Wolf corrected p-values
res_rwolf1
```

## Example II

```{r, warning = FALSE, message = FALSE}
fit1 <- feols(Y1 ~ X1, data = data)
fit2 <- feols(Y1 ~ X1 + X2, data = data)
fit3 <- feols(Y2 ~ X1, data = data)
fit4 <- feols(Y2 ~ X1 + X2, data = data)

res_rwolf2 <- rwolf(
  models = list(fit1, fit2, fit3, fit4),
  param = "X1",
  B = 9999
)

res_rwolf2
```

## Performance

The procedure from Example I, with `S = 8` hypotheses, `N = 1000` observations and `k %in% c(1, 2)` covariates per model, finishes in around 5 seconds.

```{r, warning = FALSE, message = FALSE}
if(requireNamespace("microbenchmark")){
  microbenchmark::microbenchmark(
    "Romano-Wolf" = wildrwolf::rwolf(
      models = fit,
      param = "X1",
      B = 9999
    ),
    times = 1
  )
}
```

## But does it work? Monte Carlo Experiments

We test $S=6$ hypotheses and generate data as 

$$Y_{i,s,g} = \beta_{0} + \beta_{1,s} D_{i} + u_{i,g} + \epsilon_{i,s} $$

where $D_i = 1(U_i > 0.5)$ and $U_i$ is drawn from a uniform distribution, $u_{i,g}$ is a cluster-level shock with intra-cluster correlation $0.5$, and the idiosyncratic error term $\epsilon_{i,s}$ is drawn from a multivariate normal distribution with mean $0_S$ and covariance matrix

```{r}
S <- 6
rho <- 0.5
Sigma <- matrix(rho, S, S)
diag(Sigma) <- 1
Sigma
```

with $\rho \geq 0$. We assume that $\beta_{1,s}= 0$ for all $s$. 

This experiment imposes a data generating process as in equation (9) in [Clarke, Romano and Wolf](https://docs.iza.org/dp12845.pdf), with an additional error term $u_{i,g}$ for $G=20$ clusters with intra-cluster correlation 0.5, and $N=1000$ observations.
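One way to simulate data of this shape is sketched below. This is not the code behind `run_fwer_sim()`: the value `beta0 = 1`, the random assignment of observations to clusters, and the random-effects construction used to obtain an intra-cluster correlation of 0.5 are all assumptions made for illustration.

```r
library(MASS)  # for mvrnorm()

set.seed(123)
N <- 1000    # observations
S <- 6       # hypotheses / outcomes
G <- 20      # clusters
rho <- 0.5   # correlation between outcomes
icc <- 0.5   # intra-cluster correlation of the cluster shock

# idiosyncratic errors, correlated across the S outcomes
Sigma <- matrix(rho, S, S)
diag(Sigma) <- 1
eps <- MASS::mvrnorm(n = N, mu = rep(0, S), Sigma = Sigma)

# treatment indicator D_i = 1(U_i > 0.5)
D <- as.integer(runif(N) > 0.5)

# assumption: clusters assigned at random; shock built as a common
# cluster component plus noise, so that cor(u_i, u_j) = icc within a cluster
cluster <- sample(seq_len(G), N, replace = TRUE)
u <- sqrt(icc) * rnorm(G)[cluster] + sqrt(1 - icc) * rnorm(N)

beta0 <- 1           # assumption: intercept value is not stated in the text
beta1 <- rep(0, S)   # all null hypotheses are true

# N x S outcome matrix; the length-N vector u is added to every column
Y <- beta0 + outer(D, beta1) + u + eps
colnames(Y) <- paste0("Y", seq_len(S))

sim_data <- data.frame(Y, D = D, cluster = cluster)
head(sim_data)
```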

You can run the simulations via the `run_fwer_sim()` function included in the package.

```{r, message = FALSE, results = "hide"}
# note that this will take some time
res <- run_fwer_sim(
  seed = 76,
  n_sims = 1000,
  B = 499,
  N = 1000,
  s = 6,
  rho = 0.5 # correlation between hypotheses, not intra-cluster!
)
```

Both Holm's method and `wildrwolf` control the family-wise error rate at both the 5% and 10% significance levels.

```{r}
res
```
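Holm's method, the baseline in the comparison above, is available in base R through `p.adjust()`. The short sketch below applies it to a vector of made-up unadjusted p-values (illustrative numbers, not output from the simulation) and checks whether any hypothesis would be rejected at the 5% level, which is the event the family-wise error rate counts when all nulls are true.

```r
# illustrative unadjusted p-values (hypothetical, not simulation output)
p_raw <- c(0.001, 0.39, 0.0004, 0.96)

# Holm step-down adjustment from base R's stats package
p_holm <- p.adjust(p_raw, method = "holm")
p_holm
#> [1] 0.0030 0.7800 0.0016 0.9600

# at least one rejection at the 5% level?
any_rejection <- any(p_holm <= 0.05)
any_rejection
```

Unlike the Romano-Wolf procedure, Holm's adjustment uses only the marginal p-values and ignores the dependence between the test statistics.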

## Comparison with Stata's rwolf package

```{r, eval = FALSE}
library(RStata)

# initiate RStata
options("RStata.StataPath" = "\"C:\\Program Files\\Stata17\\StataBE-64\"")
options("RStata.StataVersion" = 17)

# save the data set so it can be loaded into Stata
write.csv(data, "c:/Users/alexa/Dropbox/rwolf/inst/extdata/readme.csv")

# estimate with Stata via RStata
# (the import path must match the file written above)
stata_program <- "
clear
set more off
import delimited c:/Users/alexa/Dropbox/rwolf/inst/extdata/readme.csv
set seed 1
rwolf y1 y2 y3 y4, indepvar(x1) controls(x2) reps(9999)
"

RStata::stata(stata_program, data.out = TRUE)

# Romano-Wolf step-down adjusted p-values
#
# Independent variable:  x1
# Outcome variables:   y1 y2 y3 y4
# Number of resamples: 9999
#
# ------------------------------------------------------------------------------
#    Outcome Variable | Model p-value    Resample p-value    Romano-Wolf p-value
# --------------------+---------------------------------------------------------
#                  y1 |    0.0000             0.0001              0.0001
#                  y2 |    0.3904             0.3755              0.6070
#                  y3 |    0.0000             0.0001              0.0001
#                  y4 |    0.9586             0.9596              0.9596
# ------------------------------------------------------------------------------
```

For comparison, `wildrwolf` produces the following output:

```{r, warning = FALSE, message = FALSE, eval = FALSE}
models <- feols(c(Y1, Y2, Y3, Y4) ~ X1 + X2,
                data = data, se = "hetero")
```

```{r, include = FALSE}
models <- feols(c(Y1, Y2, Y3, Y4) ~ X1 + X2,
                data = data, se = "hetero")
```

```{r}
rwolf(models, param = "X1", B = 9999)
```
