Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/bdilday/GeomMLBStadiums

Geoms to draw MLB stadiums in ggplot2
https://github.com/bdilday/GeomMLBStadiums

baseball ggplot2-geoms r

Last synced: 3 months ago
JSON representation

Geoms to draw MLB stadiums in ggplot2

Lists

README

        

---
output: github_document
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
# fig.path = "README-",
fig.height = 6,
fig.width = 6,
fig.units = "in"
)
```

# GeomMLBStadiums

This package defines a couple of Geoms to draw MLB stadiums in ggplot2. It also provides a Geom to draw a "spraychart" - `x` and `y` locations of batted balls with a stadium overlay.

## Example use

### Install from github and load the necessary libraries

``` {r echo=FALSE, message=FALSE, warning=FALSE}
library(GeomMLBStadiums)
library(ggplot2)
library(dplyr)
```

``` {r eval=FALSE, message=FALSE}
devtools::install_github("bdilday/GeomMLBStadiums")
library(GeomMLBStadiums)
library(ggplot2)
library(dplyr)
```

### The stadium data

When you load the `GeomMLBStadiums` package it will attach the stadium paths as a data frame, `MLBStadiumsPathData`

``` {r}
head(MLBStadiumsPathData)
```

The data comprise the 30 current MLB stadiums, in addition to a "generic" stadium. The stadia are identified by team name, with the following conventions

``` {r}
unique(MLBStadiumsPathData$team)
```

The segments are split up into `outfield_outer`, `outfield_inner`, `infield_inner`, `infield_outer`, `foul_lines`, and `home_plate`

``` {r}
unique(MLBStadiumsPathData$segment)
```

### Coordinates

The stadium paths are in the system of the `hc_x` and `hc_y` coordinates of MLBAM. These are inverted (because they're based on a display device where `y=0` is at top, IIRC) which means by default the field gets displayed upside down. This package provides a helper function, `mlbam_xy_transformation`, that transforms these values to a system where y increases from bottom to top and home plate is located at `(0, 0)`.

``` {r}
set.seed(101)
batted_ball_data = data.frame(hc_x = rnorm(20, 125, 10),
hc_y = rnorm(20, 100, 20))

head(batted_ball_data)

head(mlbam_xy_transformation(batted_ball_data))

summary(mlbam_xy_transformation(batted_ball_data))

```

### `geom_mlb_stadium`

This uses `geom_mlb_stadium`, which implicitly loads the `MLBStadiumsPathData` data, to plot the 30 current stadiums.

``` {r}
ggplot() +
geom_mlb_stadium(stadium_ids = "all_mlb",
stadium_segments = "all") +
facet_wrap(~team) +
coord_fixed() +
theme_void()
```

An alternative way is to explicitly pass the data to `geom_path`.

``` {r}
MLBStadiumsPathData %>%
filter(team != 'generic') %>%
mutate(g=paste(team, segment, sep="_")) %>%
ggplot(aes(x, y)) +
geom_path(aes(group=g)) +
facet_wrap(~team) +
coord_fixed() +
theme_void()
```

This shows the generic stadium, which is the default,

``` {r}
ggplot() +
geom_mlb_stadium(stadium_segments = "all") +
facet_wrap(~team) +
coord_fixed() +
theme_void()
```

### `geom_spraychart`

This generates some simulated data.

``` {r}
# first generate the data
set.seed(101)
batted_ball_data = data.frame(hc_x = rnorm(20, 125, 10),
hc_y = rnorm(20, 100, 20))
batted_ball_data$team = rep(c("angels", "yankees"), each=10)
```

This plots the data as a spraychart. By default it uses the "generic" stadium.

``` {r}
batted_ball_data %>%
ggplot(aes(x=hc_x, y=hc_y)) +
geom_spraychart()
```

Add some styling using `theme_void` and `coord_fixed`

``` {r}
batted_ball_data %>%
ggplot(aes(x=hc_x, y=hc_y)) +
geom_spraychart() +
theme_void() +
coord_fixed()
```

This transforms the data and the stadium before plotting, passes the team names in `stadium_ids`, draws all segments, and facets by field.

``` {r}
batted_ball_data %>% mlbam_xy_transformation() %>%
ggplot(aes(x=hc_x_, y=hc_y_, color=team)) +
geom_spraychart(stadium_ids = unique(batted_ball_data$team),
stadium_transform_coords = TRUE,
stadium_segments = "all") +
theme_void() +
coord_fixed() +
facet_wrap(~team) +
theme(legend.position = "bottom")
```

You can make use of any of the other `ggplot2` functions, for example, contours from `stat_density2d`. The `mapping` argument for `geom_spraychart` gets passed to the underlying `geom_point`, as do any extra parameters passed into the `...` argument of `geom_spraychart`, e.g. `size=5` in the below.

``` {r}
batted_ball_data %>% mlbam_xy_transformation() %>%
ggplot(aes(x=hc_x_, y=hc_y_, color=team)) +
geom_spraychart(mapping = aes(shape=team),
stadium_ids = unique(batted_ball_data$team),
stadium_transform_coords = TRUE,
stadium_segments = "all", size=5) +
theme_void() +
coord_fixed() +
facet_wrap(~team) +
theme(legend.position = "bottom") +
stat_density2d(color='gray')
```