https://github.com/stufield/monty-hall-paradox

Attempt to explain the Monty Hall paradox with a simple simulation
https://github.com/stufield/monty-hall-paradox

Last synced: 11 months ago
JSON representation

Attempt to explain the Monty Hall paradox with a simple simulation

Host: GitHub
URL: https://github.com/stufield/monty-hall-paradox
Owner: stufield
Created: 2020-08-22T19:37:00.000Z (almost 6 years ago)
Default Branch: master
Last Pushed: 2024-11-15T15:21:26.000Z (over 1 year ago)
Last Synced: 2025-01-25T06:13:09.709Z (over 1 year ago)
Language: Makefile
Size: 325 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.Rmd

Awesome Lists containing this project

README

          ---

output: github_document

---

```{r setup, include = FALSE}

knitr::opts_chunk$set(

  collapse = TRUE,

  comment = "#>",

  fig.path = "figs/README-"

)

library(tibble)

library(tidyr)

library(ggplot2)

library(patchwork)

options(width = 100)

# Set ggplot theme

thm <- ggplot2::theme_bw(base_size = 11, base_family = "") +

  ggplot2::theme(

    panel.background  = element_rect(fill = "transparent", colour = NA),

    plot.background   = element_rect(fill = "transparent", colour = NA),

    legend.position   = "top",

    legend.background = element_rect(fill = "transparent", colour = NA),

    legend.key        = element_rect(fill = "transparent", colour = NA),

    plot.title        = element_text(hjust = 0.5, size = ggplot2::rel(1.5)),

    plot.subtitle     = element_text(hjust = 0.5)

  )

theme_set(thm)

```

# [The Monty Hall Paradox](https://stufield.github.io/monty-hall-paradox)

Suppose you're on a game show, and you're given the choice of three doors:

Behind one door is a car; behind the others are goats. You pick a door, say

`A`, and the host, who knows where the care is located, opens another door, say

`C`, which has a goat. He then asks you, "Do you want to pick door `B`?" Is it

to your advantage to switch your choice?

### Key Assumptions

For the paradox to work, there are some key assumptions (rules) about the

host's behavior:

1. The host must always open a door that was not picked by the contestant

1. The host must always open a door to reveal a goat and never the car.

1. The host must always offer the chance to switch between the originally

   chosen door and the remaining closed door.

### Common Mistake

Most people come to the conclusion that switching does not matter because there

are two unopened doors and one car and that it is a 50/50 choice.  This would

be true if the host opens a door randomly, but that is not the case; the door

opened depends on the player's initial choice, so the assumption of

independence does not hold. As we see below, breaking the independence

assumption drastically alters the probabilities of the remaining unrevealed

doors.

--------------

### Solution

The key insight is that the host does *not* reveal the remaining (non-chosen)

doors randomely. He *always* reveals a goat, and thus has knowledge of where

the car actually is. Incorporating this information into the probability

calculation adjusts the probability, shifting it away from the newly revealed

door to the remaining unrevealed and unchosen door. In a way this represents a

Bayesian update to the probability of door `B` with the knowldege that the car

is *not* behind door `C`. The posterior probability of door `B` is updated from

0.33 -> 0.66 following the reveal that door `C` is not an option. 

The problem can be reduced to a binary problem (switch or stay):

1. The player chooses correctly and loses by switching (1/3)

1. The player chooses incorrectly and wins by switching (2/3)

| Door A | Door B | Door C | Stay Strategy | Switch Strategy |

|:------:|:------:|:------:|:-------------:|:---------------:|

| Goat   | Goat   | Car    | Wins goat     | Wins car        |

| Goat   | Car    | Goat   | Wins goat     | Wins car        |

| Car    | Goat   | Goat   | Wins car      | Wins goat       |

|        |        |        | P(car) = 1/3  | P(car) = 2/3    |

#### Visual: probability tree

The bifrucated tree below assumes the player has chosen Door 1:

![](monty-hall-tree.png)

----------

## Simple Simulation

Perhaps the easiest way to visualize the solution is through simulation. 

### Code

```{r code, eval = FALSE}

# Single Monty-Hall trial; win by switching?

mh_switch_win <- function() {

  true   <- sample(1:3, 1L)    # true correct door; 3 possible doors

  choose <- sample(1:3, 1L)    # player's door choice: 1/3

  # if player choses incorrect door,

  # player wins by switching (TRUE/FALSE)

  true != choose

}

```

|> Once you convince yourself that the probability of winning by

|>   switching is the same as the probability of choosing incorrectly,

|>   i.e. 1 - 1/3, the function can be simplified further.

```{r code2, eval = FALSE}

mh_switch_win <- function() {

  runif(1) > 1 / 3

}

```

Run the simulation with the `runif()` function directly rather 

than `mh_switch_win()`:

```{r simulate}

trials  <- 1000

sim_tbl <- tibble::tibble(

  n_sim           = seq_len(trials),

  switch_win      = withr::with_seed(833, runif(trials) > 1 / 3),

  stay_win        = !switch_win,

  sum_switch_wins = cumsum(switch_win),

  sum_stay_wins   = cumsum(stay_win),

  prob_switch_win = sum_switch_wins / (sum_switch_wins + sum_stay_wins),

  prob_stay_win   = 1 - prob_switch_win

)

# simulation results

sim_tbl

```

### Plot Simulations

```{r plot-sim, fig.width = 10.5, fig.height = 5}

# Cumulative wins

plot_df <- sim_tbl |>

  tidyr::pivot_longer(

  cols     = c(sum_switch_wins, sum_stay_wins),

  names_to = "strategy", values_to = "Wins"

)

p1 <- plot_df |>

  ggplot(aes(x = n_sim, y = Wins, color = strategy)) +

  geom_line(size = 1) +

  labs(y = "Cumulative Wins", x = "Trial") +

  ggtitle("Cumulative Wins by Strategy")

# Prob winning

plot_df <- sim_tbl |>

  tidyr::pivot_longer(

  cols     = c(prob_switch_win, prob_stay_win),

  names_to = "strategy", values_to = "prob"

)

p2 <- plot_df |>

  ggplot(aes(x = n_sim, y = prob, color = strategy)) +

  geom_line(size = 1) +

  ylim(c(0, 1)) +

  geom_hline(yintercept = sim_tbl$prob_switch_win[trials], linetype = "dashed") +

  labs(y = "P(win)", x = "Trial",

       subtitle = sprintf("P(switch win) = %0.2f", sim_tbl$prob_switch_win[trials])) +

  ggtitle("Probability of Winning by Strategy")

p1 + p2

```

-------------

### Links

[https://en.wikipedia.org/wiki/Monty_Hall_problem](https://en.wikipedia.org/wiki/Monty_Hall_problem)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/stufield/monty-hall-paradox

Awesome Lists containing this project

README