{"id":17801405,"url":"https://github.com/evamaerey/tidybernoulli","last_synced_at":"2025-06-22T11:07:56.719Z","repository":{"id":164537298,"uuid":"638663107","full_name":"EvaMaeRey/tidybernoulli","owner":"EvaMaeRey","description":"probability branching in data frames","archived":false,"fork":false,"pushed_at":"2023-07-18T19:26:38.000Z","size":673,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-22T11:07:39.399Z","etag":null,"topics":["matrix","probability","tidy-data"],"latest_commit_sha":null,"homepage":"https://evamaerey.github.io/tidybernoulli/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EvaMaeRey.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-05-09T20:46:11.000Z","updated_at":"2023-07-18T19:30:02.000Z","dependencies_parsed_at":"2023-11-17T02:00:21.373Z","dependency_job_id":"d3b059b8-24f8-430e-9128-7776a77f8241","html_url":"https://github.com/EvaMaeRey/tidybernoulli","commit_stats":{"total_commits":8,"total_committers":1,"mean_commits":8.0,"dds":0.0,"last_synced_commit":"bda42c81f7a85ae919c15aefe0b199fcff640d45"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/EvaMaeRey/tidybernoulli","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvaMaeRey%2Ftidybernoulli","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvaMaeRey%2Ftidybernoulli/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvaMaeRey%2Ftidybernoulli/releases","manifests_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/repositories/EvaMaeRey%2Ftidybernoulli/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/EvaMaeRey","download_url":"https://codeload.github.com/EvaMaeRey/tidybernoulli/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvaMaeRey%2Ftidybernoulli/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261282321,"owners_count":23134940,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["matrix","probability","tidy-data"],"created_at":"2024-10-27T12:38:09.228Z","updated_at":"2025-06-22T11:07:51.708Z","avatar_url":"https://github.com/EvaMaeRey.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n\n# tidybernoulli\n\n\u003c!-- badges: start --\u003e\n\u003c!-- badges: end --\u003e\n\nThe goal of tidybernoulli is to create a framework for working with independent, repeated trials in an intuitive, fluid, and computation-friendly way.\n\nA Bernoulli trial is a trial with two outcomes (usually a success and a failure), where the probabilities associated with each new trial are independent of previous trials. 
\n\nInstead of looking at the realization of single trials and trial histories, we look at probability distributions that are generated by adding Bernoulli trials. The data frames that are generated contain a row for each outcome history and two columns for each trial index -- one column with the trial outcome and one column with the associated probability.  \n\nOnce all the outcome-probability pathways have been built up, summary functions allow us to ask questions about global outcomes; e.g., how likely are we to observe at least one success in 5 fair coin flips? Students will be able to see distributions like the binomial distribution emerge from first principles.  \n\ntidybernoulli was inspired by and is complementary to [ma206distributions](https://evamaerey.github.io/ma206distributions/), which treats common discrete probability distributions consistent with data requirements for use with ggplot2 and the rest of the tidyverse.\n\n## Installation\n\nYou can install the development version of tidybernoulli from [GitHub](https://github.com/) with:\n\n``` r\n# install.packages(\"devtools\")\ndevtools::install_github(\"EvaMaeRey/tidybernoulli\")\n```\n\nThen load the package.\n\n```{r}\nlibrary(tidybernoulli)\n```\n\n# Single trials\n\nWe provide some single Bernoulli trial functions:\n\n```{r}\nbernoulli_trial()\nweighted_coin()\nfair_coin()\n```\n\nAs well as a few non-Bernoulli events and probabilities:\n\n```{r}\nprize_wheel()\n```\n\n# Multiple trials\n\nThis is a basic example which shows you how to solve a common problem:\n\n```{r example}\n## basic example code\n```\n\n\n```{r cars}\nbernoulli_trial()\n\ntrial_init() |\u003e\n  add_trials()\n\ntrial_init() |\u003e\n  add_trials() |\u003e\n  add_trials() \n```\n\n\n# Summarizing possible outcome histories\n\n```{r}\nlibrary(magrittr)\ntrial_init(prob = .3) %\u003e%\n  add_trials() %\u003e%\n  add_trials() %\u003e%\n  .$out %\u003e%\n  sum_across() %\u003e%\n  
prod_across()\n```\n\n\n\n```{r}\nlibrary(magrittr)\nbernoulli_trial(prob = .5) %\u003e%\n  trial_init() %\u003e% \n  add_trials() %\u003e%\n  add_trials() %\u003e%\n  add_trials(5) %\u003e%\n  .$out %\u003e%\n  sum_across() %\u003e%\n  prod_across()\n```\n\n# Further summary based on outcome of interest...\n\n```{r}\nlibrary(dplyr)\nbernoulli_trial(prob = .5) %\u003e%\n  add_trials() %\u003e% \n  add_trials() %\u003e%\n  add_trials() %\u003e%\n  add_trials(3) %\u003e%\n  .$out %\u003e%\n  sum_across() %\u003e%\n  prod_across() %\u003e%\n  group_by(global_outcome) %\u003e%\n  summarize(probs = sum(global_probs))\n\n```\n\n# Cross-validate work\n\n```{r}\ndbinom(x =  0:7, size = 7, prob = .5)\n\n```\n\n---\n\nor...\n\n```{r}\nbernoulli_trial(prob = .5) |\u003e\n  add_trials() |\u003e\n  add_trials() |\u003e\n  to_tsibble()  |\u003e\n  group_by(history)  |\u003e\n  summarize(hist_prob = prod(prob),\n            count_successes = sum(outcome),\n            paths = paste(outcome, collapse = \",\")) |\u003e\n  arrange(count_successes) |\u003e\n  group_by(count_successes) |\u003e\n  summarize(count_prob = sum(hist_prob))\n\n```\n\n---\n\n# drob quick job on veridical paradox\n\n\u003e A #tidyverse simulation to demonstrate that if you wait for two heads in a row, it takes 6 flips on average, while you wait for a heads then a tails, it takes 4 flips on average\n\n\n```{r}\nlibrary(tidyverse)\n\n# drob\ncrossing(trial = 1:1000,\n         flip = 1:100) %\u003e% \n  mutate(heads = rbinom(n(), 1, .5)) %\u003e% \n  group_by(trial) %\u003e% \n  mutate(next_flip = lead(heads),\n         hh = heads \u0026 next_flip,\n         ht = heads \u0026 !next_flip) %\u003e% \n  summarise(first_hh = which(hh)[1] + 1, \n            first_ht = which(ht)[1] + 1) %\u003e% \n  summarise(first_hh = mean(first_hh),\n            first_ht = mean(first_ht))\n\n```\n\nIt's about the second chances... 
\n\n```{r}\noptions(pillar.print_max = Inf)\nfair_coin(outcome_set = c(\"T\", \"H\")) %\u003e% \n  select(-prob) %\u003e% \n  trial_init() %\u003e% \n  add_trials() %\u003e% \n  add_trials() %\u003e% \n  add_trials() %\u003e%\n  add_trials() %\u003e%\n  add_trials() %\u003e%\n  to_tsibble() %\u003e% \n  group_by(history) %\u003e% \n  ggplot() + \n  aes(y = history, x = trial) + \n  geom_tile(color = \"white\") + \n  aes(fill = outcome) -\u003e\nbaseplot; baseplot\n\nbaseplot + \n  geom_point(data = . %\u003e% filter( outcome == \"H\" \u0026 lag(outcome) == \"H\"), color = \"darkred\") \n\nbaseplot + \n  geom_point(data = . %\u003e% filter( outcome == \"T\" \u0026 lag(outcome) == \"H\"), color = \"darkred\")\n```\n\n## 16 dolphin trials\n\n```{r}\nbernoulli_trial(prob = .5) |\u003e\n  add_trials() |\u003e\n  add_trials() |\u003e\n  to_tsibble()  |\u003e\n  group_by(history)  |\u003e\n  summarize(hist_prob = prod(prob),\n            count_successes = sum(outcome),\n            paths = paste(outcome, collapse = \",\")) |\u003e\n  arrange(count_successes) |\u003e\n  group_by(count_successes) |\u003e\n  summarize(prob = sum(hist_prob))\n\noptions(scipen = 10)\nbernoulli_trial(prob = .5) |\u003e\n  add_trials(15) |\u003e\n  to_tsibble()  |\u003e\n  group_by(history)  |\u003e\n  summarize(hist_prob = prod(prob),\n            count_successes = sum(outcome),\n            paths = paste(outcome, collapse = \",\")) |\u003e\n  arrange(count_successes) |\u003e\n  group_by(count_successes) |\u003e\n  summarize(prob = sum(hist_prob))\n\n\ncollapse \u003c- function(x, collapse = \", \"){\n  paste(x, collapse = collapse)\n}\n\nbernoulli_trial(prob = .5, outcome_set = c(\"nope\", \"fish\")) |\u003e\n  add_trials(15) |\u003e\n  to_tsibble() %\u003e% \n  group_by(history) %\u003e% \n  summarise(history = collapse(outcome),\n            sum_successes = sum(outcome == \"fish\"),\n            prob = prod(prob)) %\u003e% \n  group_by(sum_successes) %\u003e% \n  summarise(prob = 
sum(prob))\n```\n\n## Generalizing and simplifying to the binomial equation...\n\n```{r}\nma206equations::typeset_eq_binomial()\nma206equations::typeset_eq_choose()\n```\n\n\n${{_N}C{_k}} \\cdot p^k q^{N-k}$\n\nwhere $q = 1 - p$ and\n\n${{_N}C{_k}} = \\frac{N!}{k!(N-k)!}$\n\n\n## Quick viz...\n\n\n```{r}\nma206distributions::tidy_dbinom(num_trials = 16, single_trial_prob = .5) %\u003e% \n  ggplot() + \n  aes(x = num_successes,\n      y = probability) + \n  ma206distributions::geom_lollipop(annotate = TRUE) + \n  labs(title = \"prob distributions of successes if random\") + \n  ma206equations::stamp_eq_binomial(x = 3, y = .1, size = 8)\n```\n\n\n\n\n# Peek into internals of tidybernoulli\n\n```{r}\nreadLines(\"R/bernoulli-trial.R\")[150:200]\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevamaerey%2Ftidybernoulli","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fevamaerey%2Ftidybernoulli","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevamaerey%2Ftidybernoulli/lists"}