https://github.com/langcog/tidyboot

tidyverse-compatible bootstrapping

Host: GitHub
URL: https://github.com/langcog/tidyboot
Owner: langcog
Created: 2017-08-25T16:52:13.000Z (about 8 years ago)
Default Branch: master
Last Pushed: 2020-11-12T23:03:34.000Z (almost 5 years ago)
Last Synced: 2025-09-04T15:38:37.044Z (2 months ago)
Language: R
Homepage:
Size: 37.1 KB
Stars: 20
Watchers: 5
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.Rmd

---
output: github_document
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-",
  message = FALSE,
  warning = FALSE
)
```

# tidyboot

`tidyboot` lets you compute arbitrary non-parametric bootstrap statistics on data in tidy data frames.

## Installation

You can install tidyboot from CRAN with:

```{r, eval = FALSE}
install.packages("tidyboot")
```

Alternatively, you can install tidyboot from GitHub with:

```{r gh-installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("langcog/tidyboot")
```

## Examples

For the simplest use case, bootstrapping the mean and getting its estimate and confidence interval, use the convenience function `tidyboot_mean()`, specifying the column that holds the values to average:

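```{r}
library(dplyr)
library(tidyboot)

# Two simulated conditions with different means and spreads.
# tibble() replaces the deprecated data_frame().
gauss1 <- tibble(value = rnorm(500, mean = 0, sd = 1), condition = 1)
gauss2 <- tibble(value = rnorm(500, mean = 2, sd = 3), condition = 2)
df <- bind_rows(gauss1, gauss2)

# Bootstrap the mean of `value` separately within each condition.
df %>%
  group_by(condition) %>%
  tidyboot_mean(column = value)
```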
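To make the idea behind the example above concrete, here is a minimal base-R sketch of a non-parametric bootstrap of the mean for a single group. It is an illustration only, not tidyboot's internal code: the seed, the number of resamples (`nboot`), the 95% percentile interval, and the names in `result` are all assumptions chosen for the example.

```r
# Hand-rolled non-parametric bootstrap of the mean (illustration only;
# this is not how tidyboot is implemented internally).
set.seed(1)                        # assumed seed, for reproducibility
x <- rnorm(500, mean = 2, sd = 3)  # one simulated group
nboot <- 1000                      # assumed number of resamples

# Each replicate resamples x with replacement and takes the mean.
boot_means <- replicate(nboot, mean(sample(x, replace = TRUE)))

# Summarise the bootstrap distribution with a 95% percentile interval.
result <- c(
  empirical = mean(x),
  ci_lower  = unname(quantile(boot_means, 0.025)),
  mean      = mean(boot_means),
  ci_upper  = unname(quantile(boot_means, 0.975))
)
result
```

The percentile interval is one common choice for a bootstrap confidence interval; other interval types exist and may give slightly different bounds.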

For bootstrapping any statistic and any properties of its sampling distribution, use `tidyboot()`.

You can provide the statistic to be estimated either as a function and a column to compute it over, or as a function that takes the whole data frame and computes the relevant value.

Similarly, you can provide the properties of the sampling distribution to be computed either as a named list of functions and a column to compute them over, or as a function that takes the whole data frame and returns the relevant values.

```{r}
df %>%
  group_by(condition) %>%
  tidyboot(column = value, summary_function = median,
           statistics_functions = list("mean" = mean, "sd" = sd))
```

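```{r}
# Same analysis, with both steps given as functions of the whole data frame.
# summarise() replaces summarise_at() with the deprecated funs().
df %>%
  group_by(condition) %>%
  tidyboot(summary_function = function(x) x %>% summarise(median = median(value)),
           statistics_functions = function(x) x %>% summarise(mean = mean(median), sd = sd(median)))
```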
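For comparison, the two-step structure used above (a summary statistic per resample, then statistics over the resulting sampling distribution) can be sketched in base R. This is a hedged illustration rather than tidyboot's implementation: `boot_median_summary()`, `sim`, the seed, and `nboot` are hypothetical names and values introduced only for this example.

```r
# Base-R sketch: bootstrap the median per condition, then describe its
# sampling distribution by mean and sd (illustration only).
set.seed(2)    # assumed seed, for reproducibility
sim <- data.frame(
  value     = c(rnorm(500, mean = 0, sd = 1), rnorm(500, mean = 2, sd = 3)),
  condition = rep(1:2, each = 500)
)
nboot <- 1000  # assumed number of resamples

# Hypothetical helper: step 1 computes the median of each resample,
# step 2 summarises the distribution of those medians.
boot_median_summary <- function(x, nboot) {
  medians <- replicate(nboot, median(sample(x, replace = TRUE)))
  c(empirical_median = median(x), mean = mean(medians), sd = sd(medians))
}

# Apply the helper within each condition; the result is a matrix with
# one column per condition and one row per statistic.
by_condition <- sapply(split(sim$value, sim$condition),
                       boot_median_summary, nboot = nboot)
by_condition
```

Splitting by group before resampling mirrors what `group_by()` expresses in the tidyboot calls: each condition is resampled independently of the other.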
