Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.
Awesome Lists | Featured Topics | Projects
https://github.com/evamaerey/sf2stat

helpers to get data ready to be used in stat or geom layer w ggplot2
https://github.com/evamaerey/sf2stat
Last synced: 12 days ago
JSON representation
helpers to get data ready to be used in stat or geom layer w ggplot2
Host: GitHub
URL: https://github.com/evamaerey/sf2stat
Owner: EvaMaeRey
License: other
Created: 2024-03-05T04:35:06.000Z (12 months ago)
Default Branch: main
Last Pushed: 2025-02-04T18:48:05.000Z (15 days ago)
Last Synced: 2025-02-04T19:39:31.769Z (15 days ago)
Language: R
Homepage: https://evamaerey.github.io/sf2stat/
Size: 14.9 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 2
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
Awesome Lists containing this project

README

        ---

output: 

  github_document:

    toc: TRUE

---

  [![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)

    

```{r, include = FALSE}

knitr::opts_chunk$set(

  collapse = TRUE,

  comment = "#>",

  fig.path = "man/figures/README-",

  eval = T,

  warning = F,

  message = T,

  error = T

)

library(tidyverse)

```

# Part 0. Proposal

Proposing the {sf2stat} package! 🦄 

The goal of {sf2stat} is to make it easier to prep *sf data* for use in a ggproto Stat computation; the Stat then can be used for creating a stat/geom function to be used in ggplot2 plots.

Without the package, we live in the effortful world, in which we'd have to prep our own data including figuring out the bounding box for each geometry, and, if we want labeling functionality, the centroid for each geometry.  

With the {sf2stat} package, we'll live in a different world (🦄 🦄 🦄) where the task is a snap 🫰: 

Proposed API is:

```

library(ggregion)

--

--

read.csv("nc-midterms.csv") |>

  ggplot() + 

  aes(county_name = str_to_title(desc_county)) + 

  geom_region() + 

  aes(fill = cd_party) +

  geom_region_text()

```

# Package build Part I. Work out functionality  ✅ 

In this section we'll use the nc sf dataframe to check out how our functions work.

## Select toy sf data

```{r}

nc_ref <- sf::st_read(system.file("shape/nc.shp", package="sf")) |>

  select(county_name = NAME, fips = FIPS)

read.csv("nc-midterms.csv") |>

  mutate(county_name = str_to_title(desc_county)) |>

  left_join(nc_ref) %>% 

  ggplot() + 

  geom_sf() +

  aes(fill = cd_party, 

      label = county_name,

      geometry = geometry)+

  geom_sf_text(check_overlap = T)

```

```{r}

# we want our stat to do stuff that StatSf and StatSfCoordinates does.

prep_geo_reference <- function(ref_data, id_index = 1){

  

  ref_data |>

  ggplot2::StatSf$compute_panel(coord = ggplot2::CoordSf) |>

  ggplot2::StatSfCoordinates$compute_group(coord = ggplot2::CoordSf) %>% 

    mutate(id_col = .[[id_index]])

  

}

compute_panel_aggregation <- function(data, scales, fun = sum, non_grouping = c("fill", "wt", "within")){

  

  grp_cols <-  names(data)[!names(data) %in% non_grouping]

  

  # Thanks June! https://github.com/teunbrand/ggplot-extension-club/discussions/15

  data %>% 

    group_by(group_by(pick(any_of(grp_cols)))) ->   

  data

  

  if(is.null(data$fill)){data <- mutate(data, fill = 1)}

  if(is.null(data$wt)){data$wt <- 1}

  

  data %>% 

    summarize(fill = fun(.data$fill*.data$wt), .groups = 'drop') |>

    mutate(summary = fill) ->

  data

    

  if(is.null(data$within)){data$within <- 1}

  data %>%   

    group_by(.data$within) %>% 

    mutate(prop = .data$fill/sum(.data$fill)) %>%

    mutate(percent = round(.data$prop*100)) -> 

  data

  

  data

  

}

# Flip the script... prepare compute (join) to happen in layer (NEW!)

compute_panel_region <- function(data, scales, ref_data, id_index = 1,

                                 stamp = FALSE, keep_id = NULL,

                                 drop_id = NULL, fun = sum){

  

  fill_is_category <- is.character(data$fill)|is.factor(data$fill)|is.logical(data$fill)

  

  if(!(fill_is_category)){

  

  data <- data |> compute_panel_aggregation(scales, fun = fun, non_grouping = c("fill", "wt", "within")) 

  

  }

  

  ref_data %>% 

    prep_geo_reference(id_index = id_index) ->

  ref_data

  

  if(!is.null(keep_id)){

  

  ref_data %>% 

    filter(id_col %in% keep_id) ->

  ref_data

  

    }

  

  if(!is.null(drop_id)){

  

  ref_data %>% 

    filter(!(id_col %in% drop_id)) ->

  ref_data

  

    }

  

  if(stamp){

    

    ref_data |>

      mutate(fill = ifelse(fill_is_category, NA, NA |> as.numeric()))

    

  }else{

  

 ref_data %>% 

    inner_join(data) 

      

  

  }

    

}

```

```{r}

nc_ref <- sf::st_read(system.file("shape/nc.shp", package="sf")) |>

  select(county_name = NAME, fips = FIPS)

read.csv("nc-midterms.csv") |>

  mutate(county_name = str_to_title(desc_county)) |>

  select(county_name) |>

  compute_panel_region(ref_data = nc_ref)

read.csv("nc-midterms.csv") |>

  mutate(county_name = str_to_title(desc_county)) |>

  select(county_name) |>

  compute_panel_region(ref_data = nc_ref, keep_id = "Mecklenburg")

```

# wrapping up more

## stat_region and friends

```{r}

# same as geom_sf but geom (and stat) is flexible

qlayer_sf_crs <- function (mapping = NULL, data = NULL, geom = "sf", 

                           stat = "sf", position = "identity", 

                           na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, 

                           crs, ...) {

  

    c(layer_sf(geom = geom, data = data, mapping = mapping, 

        stat = stat, position = position, show.legend = show.legend, 

        inherit.aes = inherit.aes, params = rlang::list2(na.rm = na.rm, 

            ...)), 

      coord_sf(crs = crs))

}

stat_region <- function(ref_data = getOption("sf2stat.ref_data", nc_ref), 

                        id_index = 1, 

                        required_aes = getOption("sf2stat.required_aes", "fips|county_name"),

                        geom = GeomSf, ...){

  

  StatSfJoin <- ggproto("StatSfJoin", Stat, 

                        compute_panel = compute_panel_region, 

                        default_aes = aes(label = after_stat(id_col), 

                                          fill = after_stat(fill)),

                        required_aes = required_aes)

  

  qlayer_sf_crs(stat = StatSfJoin, 

                geom = geom,

                ref_data = ref_data, 

                crs = sf::st_crs(ref_data), 

                id_index = id_index, ...)

  

}

#  geom_sf  # want to look at quieting the coord message...

GeomOutline <- ggproto("GeomOutline", GeomSf,

                       default_aes = aes(!!!modifyList(GeomSf$default_aes,

                                                       aes(fill = "transparent", 

                                                           color = "black"))))

geom_region_sf <- function(mapping = NULL, ...){stat_region(geom = GeomSf, mapping = mapping, ...)}

geom_region <- geom_region_sf   # convenience short name

geom_region_outline <- function(mapping = NULL, ...){stat_region(geom = GeomOutline, mapping = mapping, ...)}

geom_region_label <- function(mapping = NULL, ...){stat_region(geom = GeomLabel,mapping = mapping,...)}

geom_region_text <- function(mapping = NULL, ...){stat_region(geom = GeomText, mapping = mapping,...)}

geom_region_textrepel <- function(mapping = NULL, ...){stat_region(geom = ggrepel::GeomTextRepel, mapping = mapping, ...)}

stamp_region_sf <- function(...){geom_region_sf(stamp = T, required_aes = Stat$required_aes, ...)}

stamp_region <- stamp_region_sf

stamp_region_outline <- function(...){geom_region_outline(stamp = T, required_aes = Stat$required_aes, ...)}

stamp_region_label <- function(...){geom_region_label(stamp = T, required_aes = Stat$required_aes, ...)}

stamp_region_text <- function(...){geom_region_text(stamp = T, required_aes = Stat$required_aes, ...)}

```

## stat_subregion and friends

```{r}

stat_subregion <- function(ref_data = getOption("sf2stat.ref_data_subregion", nc_ref), 

                        id_index = 1, 

                        required_aes = getOption("sf2stat.required_aes_subregion", "fips|county_name"),

                        geom = GeomSf, ...){

  

  StatSfJoin <- ggproto("StatSfJoin", Stat, 

                        compute_panel = compute_panel_region, 

                        default_aes = aes(label = after_stat(id_col)),

                        required_aes = required_aes)

  

  qlayer_sf_crs(stat = StatSfJoin, 

                geom = geom,

                ref_data = ref_data, 

                crs = sf::st_crs(ref_data), 

                id_index = id_index, ...)

  

}

geom_subregion_sf <- function(mapping = NULL, ...){stat_subregion(geom = GeomSf, mapping = mapping, ...)}

geom_subregion <- geom_subregion_sf   # convenience short name

geom_subregion_outline <- function(mapping = NULL, ...){stat_subregion(geom = GeomOutline, mapping = mapping, ...)}

geom_subregion_label <- function(mapping = NULL, ...){stat_subregion(geom = GeomLabel,mapping = mapping,...)}

geom_subregion_text <- function(mapping = NULL, ...){stat_subregion(geom = GeomText, mapping = mapping,...)}

geom_subregion_textrepel <- function(mapping = NULL, ...){stat_subregion(geom = ggrepel::GeomTextRepel,mapping = mapping, ...)}

stamp_subregion_sf <- function(...){geom_subregion_sf(stamp = T, required_aes = Stat$required_aes, ...)}

stamp_subregion <- stamp_subregion_sf

stamp_subregion_outline <- function(...){geom_subregion_outline(stamp = T, required_aes = Stat$required_aes, ...)}

stamp_subregion_label <- function(...){geom_subregion_label(stamp = T, required_aes = Stat$required_aes, ...)}

stamp_subregion_text <- function(...){geom_subregion_text(stamp = T, required_aes = Stat$required_aes, ...)}

```

```{r}

set_region <- function(ref_data, required_aes = NULL, region, level = "region", return_region_names = F){

  

  if(is.null(required_aes)){

    

    names_ref_data <- names(ref_data)

    required_aes <- paste(names_ref_data[names_ref_data != "geometry"],

                          collapse = "|")

    

  }

  

  if(level == "region"){

options(sf2stat.ref_data = ref_data,

        sf2stat.required_aes = required_aes)

message(paste0("Region is", region, "\nRequired aes: '", required_aes, "'"))

  }  

  

  if(level == "subregion"){

options(sf2stat.ref_data_subregion = ref_data,

        sf2stat.required_aes_subregion = required_aes)

message(paste0("Subregion is ", region, "\nRequired aes: '", required_aes, "'"))

  }  

if(return_region_names){

  ref_data[,1]

}

}

```

# Geo examples

many data packages linked to from here

https://github.com/ipeaGIT/geobr

## North Carolina counties

```{r}

set_region_sf_nc_counties <- function(return_region_names = F, region = "county"){

sf::st_read(system.file("shape/nc.shp", package="sf")) |>

  select(county_name = NAME, fips = FIPS) |>

  set_region(region = region, return_region_names = return_region_names)

}

nc_midterms <- read.csv("nc-midterms.csv")

head(nc_midterms)

set_region_sf_nc_counties()

nc_midterms |>

  ggplot() + 

  aes(county_name = str_to_title(desc_county)) + 

  stamp_region(fill = 'darkgrey') + 

  geom_region() + 

  aes(fill = n/1000) + 

  stamp_region_outline(

    keep_id = "Mecklenburg",

    color = "orange",

    linewidth = 1) + 

  geom_region_text(check_overlap = T,

                   color = "whitesmoke")

layer_data() |> head()

```

## Chile regiones

```{r}

set_region_region_chilemapas <- function(return_region_names = F){

chilemapas::generar_regiones() %>% 

  mutate(region_numerico = as.numeric(codigo_region)) %>%

  select(region_codigo = codigo_region,

         region_numerico)  |>

  set_region(region = "region", return_region_names = return_region_names)

  

}

```  

```{r}

set_region_region_chilemapas()

options(scipen = 10)

chilemapas::censo_2017_comunas %>% 

  mutate(region = str_extract(codigo_comuna, "..")) %>% 

  summarise(pop = sum(poblacion), .by = c(region, sexo)) %>% 

  ggplot() + 

  aes(region_codigo = region, fill = pop/100000) +

  geom_region(linewidth = .01, color = "white") + 

  facet_wrap(~sexo) + 

  scale_fill_viridis_c(transform = "log") + 

  stamp_region_outline(color = "red", 

                       keep_id = "05")

last_plot() + 

  geom_region_textrepel()

```

```{r}

library(ggstats)

chilemapas::censo_2017_comunas %>% 

  mutate(region = str_extract(codigo_comuna, "..")) %>% 

  summarise(pop = sum(poblacion), .by = c(region, sexo)) %>% 

  ggplot() +

  aes(fill = sexo, y = region, weight = pop) + 

  ggstats::geom_pyramid(fill = "grey") +

  ggstats::geom_pyramid()

chilemapas::censo_2017_comunas %>% 

  mutate(region = str_extract(codigo_comuna, "..")) %>% 

  summarise(pop = sum(poblacion), .by = c(region, sexo)) %>% 

  ggplot() +

  aes(fill = sexo, y = region, weight = pop) + 

  ggstats::geom_likert(fill = "grey") + 

  ggstats::geom_likert() + 

  ggstats::geom_likert_text()

```

## World countries rnaturalearth

```{r}

library(tidyverse)

set_region_country_rnaturalearth <- function(scale = "small", return_region_names = F){

rnaturalearth::ne_countries(

  scale = scale, returnclass = "sf") |> 

  select(country_name = sovereignt, iso3c = iso_a3) |>

  mutate(country_name = ifelse(country_name == "United States of America", "United States", country_name)) |>

  mutate(iso3c = ifelse(country_name == "France", "FRA", iso3c)) |>

  set_region(region = "country", return_region_names = return_region_names)

  

}

```

```{r}

set_region_country_rnaturalearth()

ggplot() + 

  aes(iso3c = 1, fill = NULL) + # this shouldn't be required for stamp, but is

  stamp_region() + 

  stamp_region(keep_id = c("United States", "Brazil", "Canada", 

                           "France", "South Korea", "United Kingdom",

                           "Netherlands", "Austria", "Australia", 

                           "Uganda", "Germany", "Denmark",

                           "Sweden"), fill = "midnightblue") +

  labs(title = "extenders tuning in ...")

```

```{r, fig.show='hold', out.width="33%"}

set_region_country_rnaturalearth()

countries_csv_url <- 'https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-11-12/countries.csv'

readr::read_csv(countries_csv_url) |>

  ggplot() + 

  aes(iso3c = alpha_3, # iso3c or country_name are required...

      fill = numeric,

      label = numeric) + 

  geom_region() +

  geom_region_text(check_overlap = T,

                   size = 2, 

                   color = "whitesmoke") + 

  labs(title = "ISO numeric country codes")

```

```{r}

last_plot() +

  aes(fill = rank(name)) + 

  labs(title = "Are numeric codes basically spaced alphabetical orderings")

last_plot() + 

  aes(fill = log(abs(rank(numeric) - rank(name)) + 1)) + 

  labs(title = "Which codes are furthest from expected if alphabetical dominates?")

last_plot() + 

  aes(fill = str_extract(name, "...") |> toupper() == alpha_3) +

  labs(title = "Which countries iso3c is exact match with first three letters?") %>% 

  labs(fill = "First three letters is ISO3")

readr::read_csv(countries_csv_url) |>

  ggplot() + 

  aes(x = rank(name), 

      y = numeric, 

      label = name) + 

  geom_point(color = "black") +

  stat_smooth(method = "lm")

```

```{r}

set_region_country_rnaturalearth()

heritage <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-02-06/heritage.csv')

heritage %>% 

  pivot_longer(cols = `2004`:`2022`, names_to = "year", values_to = "n_awards") %>% 

  ggplot() +

  aes(country_name = country) + 

  geom_region() + 

  aes(fill = n_awards) + 

  facet_wrap(~year)

```

```{r}

democracy_data <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-11-05/democracy_data.csv')

set_region_country_rnaturalearth()

democracy_data %>% 

  filter(year == 2020) %>% 

  ggplot() + 

  aes(iso3c = country_code) +

  geom_region() + 

  aes(fill = regime_category)

last_plot() + 

  aes(fill = NULL) + 

  geom_region(fill = "darkred", 

              data = . %>% filter(regime_category %>% stringr::str_detect("dictatorship"))) + 

  labs(title = "Countries classified as dictatorships in 2020 in 2024-11-05/democracy_data.csv")

```

```{r}

set_region_country_rnaturalearth()

worlds_fairs <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-08-13/worlds_fairs.csv')

worlds_fairs %>%

  count(country) %>% 

  ggplot() + 

  aes(country_name = country, wt = n) + 

  stamp_region(drop_id = "Antarctica") +

  geom_region() +

  labs(title = "number of World Fairs, worlds_fairs.csv")

```

```{r}

set_region_country_rnaturalearth()

stackoverflow_survey_single_response <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-09-03/stackoverflow_survey_single_response.csv')

stackoverflow_survey_single_response %>%

  mutate(iso3c = 

           countrycode::countrycode(country, "country.name", "iso3c")) %>% 

  count(iso3c, r_used) %>% 

  filter(!is.na(r_used)) %>% 

  ggplot() + 

  aes(iso3c = iso3c, fill = n) + 

  stamp_region(drop_id = "Antarctica") +

  geom_region() + 

  scale_fill_viridis_c(transform = "log", 

                       breaks = 10^(0:4)) + 

  facet_wrap(~ ind2cat::ind_recode(r_used, 

                                   cat_true = "Has used R", 

                                   cat_false = "Hasn't used R"),

             ncol = 1) + 

  geom_region_text(aes(label = n), 

                   color = "grey",

                   check_overlap = T,

                   size = 2)

library(countrycode)

stackoverflow_survey_single_response %>%

  mutate(iso3c = countrycode(country, "country.name", "iso3c")) %>% 

  filter(!is.na(r_used)) %>% 

  summarise(r_percent = 100 * sum(r_used)/n(), 

            total_responding = n(),

            .by = iso3c) %>% 

  filter(total_responding > 10) %>% 

  ggplot() + 

  aes(fill = r_percent, iso3c = iso3c) + 

  stamp_region() +

  geom_region() + 

  scale_fill_viridis_c()

stackoverflow_survey_single_response %>% 

  mutate(iso3c = countrycode(country, "country.name", "iso3c")) %>% 

  filter(!is.na(r_used)) %>% 

  filter(n() > 10, .by = iso3c) %>% 

  ggplot() + 

  aes(iso3c = iso3c, fill = r_used) + 

  stamp_region() +

  geom_region(fun = mean) + 

  scale_fill_viridis_c()

layer_data(i = 2) |> head()

```

```{r}

set_region_country_rnaturalearth()  

eclipse_total_2024 <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-04-09/eclipse_total_2024.csv')

eclipse_total_2024 %>% 

  ggplot() + 

  aes(country_name = 1) + 

  stamp_region(keep_id = "United States") +

  geom_point(aes(y = lat, x = lon, color = eclipse_1))

```

```{r}

set_region_country_rnaturalearth()

outer_space_objects <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-04-23/outer_space_objects.csv')

outer_space_objects |>

  summarise(num_objects = sum(num_objects), .by = Code) |>

  ggplot() + 

  aes(iso3c = Code, 

      fill = num_objects) +

  stamp_region() +

  geom_region() +

  scale_fill_viridis_c(transform = "log",

                       breaks = 10^(0:4)) + 

  labs(title = "Number of outerspace objects in outer_space_objects.csv")

outer_space_objects |>

  ggplot() + 

  aes(iso3c = Code, fill = num_objects) +

  stamp_region() +

  geom_region() +

  scale_fill_viridis_c(transform = "log",

                       breaks = 10^(0:4)) + 

  labs(title = "Number of outerspace objects in outer_space_objects.csv")

layer_data(i = 2) |> head()

```

```{r}

set_region_country_rnaturalearth()

 

wwbi_data <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-04-30/wwbi_data.csv')

wwbi_series <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-04-30/wwbi_series.csv')

wwbi_data %>% 

  filter(year == 2007, indicator_code == "BI.WAG.PRVS.FM.SM") %>% 

  ggplot() + 

  aes(iso3c = country_code, fill = value) + 

  stamp_region() +

  geom_region() + 

  labs(title = wwbi_series %>% 

         filter(indicator_code =="BI.WAG.PRVS.FM.SM") %>%

         pull(indicator_name)) +

  scale_fill_viridis_c(option = "cividis") 

wwbi_data %>% 

  filter(year == 2007, indicator_code == "BI.WAG.PRVS.FM.SM") %>% 

  filter(value < 3) %>% 

  ggplot() + 

  aes(iso3c = country_code, fill = value) + 

  stamp_region() +

  geom_region() + 

  labs(title = wwbi_series %>% 

         filter(indicator_code =="BI.WAG.PRVS.FM.SM") %>%

         pull(indicator_name)) +

  scale_fill_viridis_c(option = "magma") 

```

```{r}

set_region_country_rnaturalearth()

   

wwbi_country <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-04-30/wwbi_country.csv')

wwbi_country %>% 

  ggplot() + 

  aes(iso3c = country_code, fill = region) + 

  geom_region()

last_plot() + 

  aes(fill = income_group)

```

```{r}

set_region_country_rnaturalearth()

 

cheeses <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-06-04/cheeses.csv')

cheeses %>% 

  filter(!str_detect(milk, ",")) %>% 

  distinct(country, milk) %>% 

  ggplot() + 

  aes(country_name = country) + 

  stamp_region() + 

  geom_region(fill = "goldenrod3") +

  facet_wrap(~milk) + 

  labs(title = "Cheese Origins! Country and animal in cheeses.csv")

  

```

  

```{r}

set_region_country_rnaturalearth()

tidyr::world_bank_pop %>% 

  filter(indicator == "SP.POP.GROW") %>% 

  mutate(pop_growth = `2000`) %>% 

  ggplot() + 

  aes(iso3c = country) + 

  geom_region() + 

  aes(fill = pop_growth) + 

  scale_fill_viridis_c() + 

  labs(title = "Population Growth, 2000") +

  geom_region_outline(data = . %>% filter(pop_growth<0), 

                      color = "red")

```

```{r}

set_region_country_rnaturalearth(scale = "large")  

orcas <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-10-15/orcas.csv')

orcas |>

  ggplot() + 

  geom_point(aes(y = begin_latitude,

                 x = begin_longitude),

             size = 2, pch = 21,

             fill = "white",

             color = "black"

             ) +

  stamp_region_outline(keep_id = c("United States", "Canada")) +

  coord_sf(crs = "NAD83", 

           xlim = c(min(orcas$begin_longitude, na.rm = T), 

                     max(orcas$begin_longitude, na.rm = T)), 

           ylim = c(min(orcas$begin_latitude, na.rm = T), 

                     max(orcas$begin_latitude, na.rm = T))) +

  labs(title = "Orcas begin latitude and longitude in orcas.csv ")

```

```{r}                

set_region_country_rnaturalearth()  

cia_factbook <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-10-22/cia_factbook.csv')  

  

cia_factbook %>% 

  ggplot() + 

  aes(country_name = country) + 

  stamp_region(drop_id = "Antarctica") +

  geom_region() + 

  aes(fill = internet_users/population)

last_plot() + 

  geom_region_text(data = . %>% filter(area > 500000),

                   check_overlap = T,

                   size = 2,

                   color = "whitesmoke") + 

  aes(label = round(100*internet_users/population)|>paste0("%")) + 

  labs(title = "Internet usership") + 

  guides(fill = "none")

```

```{r}

set_region_country_rnaturalearth()  

nato_names <- c("Albania", "Belgium", "Bulgaria", "Canada", "Croatia", "Czech Republic", "Denmark", "Estonia", "France", "Germany", "Greece", "Hungary",  

                "Iceland", "Italy", "Latvia", "Lithuania", "Luxembourg", "Montenegro", "Netherlands", "Norway", "Poland", "Portugal", "Romania", "Slovakia", "Slovenia", "Spain", "Turkey", "United Kingdom", "United States")  

library(gapminder)  

gapminder %>%  

  filter(year == 2002) %>%  

  rename(name = country) %>%   

  ggplot() +  

  aes(country_name = name) +

  geom_region(keep_id = nato_names) + 

  aes(fill = gdpPercap)

```

```{r}

set_region_country_rnaturalearth()  

country_results_df <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-09-24/country_results_df.csv')

country_results_df %>% 

  filter(year == 2024) %>% 

  mutate(country = ifelse(country == "People's Republic of China", "China", country)) %>% 

  mutate(points = p1+p2+p3+p4+p5+p6) %>% 

  ggplot() + 

  aes(country_name = country) + 

  stamp_region(drop_id = "Antarctica") + 

  geom_region() + 

  aes(fill = points) + 

  aes(label = paste0("#", rank(-points),"\n",

                     country, "\n", points,"\n")) + 

  geom_region_text(size = 1.5,

                   lineheight = .7,

                   check_overlap = T) + 

  labs(title = "International Math Olympiad points earned in country_results_df.csv")

last_plot() + 

  aes(fill = points/team_size_all) + 

  aes(label = round((points)/team_size_all, 1))

  

ggplot2::theme_set

```  

## Canadian provinces

https://mountainmath.github.io/cancensus/index.html

```{r}

nhl_player_births <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-01-09/nhl_player_births.csv')

head(nhl_player_births) |> names()

# library(cancensus)

# provinces_data <- get_statcan_geographies(

#   census_year = 2021,

#   level = "PR",

#   type = "cartographic",

#   cache_path = NULL,

#   timeout = 1000,

#   refresh = FALSE,

#   quiet = FALSE

# )

set_region_province_canada_rnaturalearth <- function(){

  

  provinces_data <- rnaturalearth::ne_states(country = "Canada", returnclass = "sf")

  provinces_data %>% 

    select(prov_name = name) %>% 

    mutate(prov_name = ifelse(prov_name == "Québec", "Quebec", prov_name)) %>% 

    mutate(prov_name = ifelse(prov_name == "Yukon", "Yukon Territory", prov_name)) ->

  ref_data

  

  ref_data %>% 

    options(sf2stat.ref_data = ., 

          sf2stat.required_aes = "prov_name")

  

  message("Region has been set to canadian provinces\n

           required aes 'prov_name'")

  

  ref_data %>% sf::st_drop_geometry() %>% .[,1]

  

}

# provinces_data <- get_statcan_geographies(

#   census_year = 2021,

#   level = "PR",

#   type = "cartographic",

#   cache_path = NULL,

#   timeout = 1000,

#   refresh = FALSE,

#   quiet = FALSE

# )

# provinces_data <- rnaturalearth::ne_states(country = "Canada", returnclass = "sf")

set_region_province_canada_rnaturalearth()

library(ggplot2)

nhl_player_births |>

  filter(birth_country == "CAN") |>

  count(birth_state_province) |> 

  ggplot() +

  aes(prov_name = birth_state_province) + 

  geom_region() +

  aes(fill = n) +

  geom_region_text(aes(label = paste0(str_wrap(birth_state_province, 6),"\n", n)),

                   size = 2,

                   lineheight = .8)

  

```

## Netherlands province

```{r}

library(tmap)

data("NLD_prov")

data("NLD_muni")

set_region_province_netherlands_tmap <- function(){

NLD_prov %>% 

  select(prov_name = name, prov_code = code) %>% 

  options(sf2stat.ref_data = ., 

          sf2stat.required_aes = "prov_code|prov_name")

  

  message("Region has been set to netherland provinces\n

          required aes 'prov_code|prov_name")

}

set_subregion_municipality_netherlands_tmap <- function(){

NLD_muni %>% 

  select(muni_name = name, 

         muni_code = code, 

         prov_name = province

         ) %>% 

  options(sf2stat.ref_data_subregion = ., 

          sf2stat.required_aes_subregion = "muni_code|muni_name")

  

  message("Subregion has been set to netherland municipality\n

          required aes 'muni_code|muni_name")

}

set_region_province_netherlands_tmap()

set_subregion_municipality_netherlands_tmap()

NLD_prov %>% 

  sf::st_drop_geometry() %>% 

  ggplot() + 

  aes(prov_code = code) + 

  geom_region() + 

  aes(fill = population/100000) + 

  geom_region_text(check_overlap = T,

                   size = 2, 

                   color = "whitesmoke")

last_plot() + 

  aes(label = round(population/100000, 3))

NLD_muni %>% 

  sf::st_drop_geometry() %>%  

  ggplot() + 

  aes(muni_code = code) + 

  geom_subregion() + 

  aes(fill = population) + 

  geom_subregion_text(check_overlap = T,

                      size = 2, 

                      color = "whitesmoke") + 

  scale_fill_viridis_c(transform = "log", breaks = c(10000, 100000, 1000000))

last_plot() + 

  stamp_region_outline(aes(prov_name = 1), linewidth = .25)

```

## US states

```{r}

set_region_state_usmapdata <- function(){

  

  required_aes <- "state_name|state_abb|fips"

  

usmapdata::us_map() |> 

  select(state_name = full, state_abb = abbr, fips,

         geometry = geom) |>

  options(sf2stat.ref_data = _,

          sf2stat.required_aes = "state_name|state_abb|fips")

  

  message("required aes are 'state_name|state_abb|fips'")

  

}

set_region_state_usmapdata()

USArrests  %>% 

  rownames_to_column("state") |>

  ggplot() + 

  aes(state_name = state) + 

  geom_region(alpha = .75) + 

  aes(fill = UrbanPop) + 

  scale_fill_viridis_c()

set_region_state_usmapdata()

nhl_player_births <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-01-09/nhl_player_births.csv')

nhl_player_births |>

  filter(birth_country == "USA") |>

  count(birth_state_province) |>

  ggplot() + 

  aes(state_name = birth_state_province, fill = n) + 

  stamp_region(fill = "darkgrey") +

  geom_region() + 

  geom_region_text(aes(label = n), 

                   color = "whitesmoke",

                   size = 3) + 

  labs(title = "NHL players from US states in nhl_player_births.csv")

set_region_state_usmapdata()

pride_index <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-06-11/pride_index.csv')

pride_index %>% 

  mutate(state_abb = campus_location %>% 

           str_extract("..$")) %>%

  summarise(average_rating = mean(rating), 

            .by = state_abb) %>% 

  ggplot() + 

  aes(state_abb = state_abb, fill = average_rating) + 

  stamp_region() +

  geom_region() + 

  scale_fill_viridis_c() + 

  labs(title = "Mean pride index rating for colleges in pride_index.csv")

  

set_region_state_usmapdata()

eclipse_total_2024 <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-04-09/eclipse_total_2024.csv')

eclipse_total_2024 %>% 

  distinct(state) %>%  

  ggplot() + 

  aes(state_abb = state) + 

  stamp_region() +

  geom_region(data = . %>% distinct(state),

              fill = "midnightblue") + 

  labs(title = "'eclipse_1' US state locations eclipse_total_2024.csv")

  

```

## US counties

```{r}

set_region_county_usmapdata <- function(){

  

  required_aes <- "state_name|state_abb|fips|county_name"

  

usmapdata::us_map("county" ) |> 

  select(county_name = county,

         state_name = full, 

         state_abb = abbr, 

         fips,     

         geometry = geom) |>

  mutate(county_name = str_remove(county_name, " County| Census Area| Municipality| Borough")) |> 

  options(sf2stat.ref_data = _,

          sf2stat.required_aes = "state_name|state_abb|fips")

  

  message("required aes are 'state_name|state_abb|fips|county_name'")

  

}

set_region_county_usmapdata()

ggplot() + 

  aes(fips = 1) + 

  stamp_region()

```

```{r}

set_region_county_usmapdata()

polling_places <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-01-16/polling_places.csv')

polling_places |>

  filter(year(election_date) == 2020) %>% 

  mutate(county_name = 

           stringr::str_to_title(county_name)) %>% 

  count(state, county_name) %>% 

  ggplot() + 

  aes(state_abb = state, 

      county_name = county_name,

      fill = n) + 

  stamp_region() + 

  geom_region() +

  labs(title = "Number of polling places by county, 2020 in polling_places.csv") +

  scale_fill_viridis_c(transform = "log", 

                       option = "magma",

                       breaks = c(1, 10, 100, 1000, 10000)) 

set_region_state_usmapdata()

polling_places <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-01-16/polling_places.csv')

polling_places |>

  filter(year(election_date) == 2020) %>% 

  count(state) %>%  

  ggplot() + 

  aes(state_abb = state, fill = n) + 

  stamp_region() + 

  geom_region() + 

  labs(title = "Number of polling places, 2020")

```

## aseg brain segments

```{r}

set_region_aseg_ggseg <- function(){

  

  ggseg::aseg$data %>% 

  # filter(!is.na(label)) %>% 

  select(region) %>% 

  options(sf2stat.ref_data = ., 

          sf2stat.required_aes = "region")

  

}

set_region_aseg_ggseg()

ggplot() + 

  aes(region = 1) +

  stamp_region() + 

  stamp_region(keep_id = "amygdala", 

               fill = "orange") + 

  stamp_region(keep_id = "hippocampus",

               fill = "cadetblue")

set_region_aseg_ggseg()

ggplot() + 

  aes(region = 1) + 

  stamp_region() + 

  stamp_region(keep_id = c("amygdala", "hippocampus"),

               aes(fill = after_stat(region))) + 

  theme(legend.position = "top",

        legend.justification = "left") + 

  labs(fill = NULL)

  

```

## German voting districts

```{r, eval = F}

download.file("https://www.bundeswahlleiter.de/dam/jcr/f92e42fa-44f1-47e5-b775-924926b34268/btw17_geometrie_wahlkreise_geo_shp.zip", "btw17_geometrie_wahlkreise_geo_shp.zip")

unzip("btw17_geometrie_wahlkreise_geo_shp.zip")

```

```{r}

set_region_wahlkreis17_germany_bundeswahlleiter_de <- function(){

  

  wahlkreis_17 <- sf::st_read("Geometrie_Wahlkreise_19DBT_geo.shp")

  

  

  wahlkreis_17 %>% 

  # filter(!is.na(label)) %>% 

  select(wahlkreis_num = WKR_NR,

         land_num = LAND_NR,

         land_name = LAND_NAME,

         wahlkreis_name = WKR_NAme) %>% 

  options(sf2stat.ref_data = ., 

          sf2stat.required_aes =

            "wahlkreis_num|land_num|land_name|wahlkreis_name")

  

}

# wahlkreis_17

```

# sf2stat Proposed usage

https://twitter.com/EmilyRiederer/status/1816820773581127781

```{r, fig.height=2}

# given some flatfile data of interest

read.csv("nc-midterms.csv") |> head()

# and being aware of geographic data with geometry shape column

nc <- sf::st_read(system.file("shape/nc.shp", package="sf"))

# select relevant id columns (this will keep geometry column)

nc_ref <- nc |>

  select(county_name = NAME, fips = FIPS)

# do this routine, change out the ref data, required_aes and 'county' in convenience funtion names.

stat_county <- function(...){stat_region(ref_data = nc_ref, required_aes = "county_name|fips", id_index = 1, ...)}  # uses GeomSf as default

GeomOutline <- ggproto("GeomOutline", GeomSf,

                       default_aes = aes(!!!modifyList(GeomSf$default_aes,

                                                       aes(fill = NA, 

                                                           color = "black"))))

geom_county_sf <- function(...){stat_county(geom = GeomSf,...)}

geom_county <- geom_county_sf   # convenience short name

geom_county_outline <- function(...){stat_county(geom = GeomOutline, ...)}

geom_county_label <- function(...){stat_county(geom = GeomLabel,...)}

geom_county_text <- function(...){stat_county(geom = GeomText, ...)}

stamp_county_sf <- function(...){geom_county_sf(stamp = T, ...)}

stamp_county <- stamp_county_sf

stamp_county_outline <- function(...){geom_county_outline(stamp = T, ...)}

stamp_county_label <- function(...){geom_county_label(stamp = T, ...)}

stamp_county_text <- function(...){geom_county_text(stamp = T, ...)}

```

A first NC map shows when we map desc_county to county name. 

```{r}

read.csv("nc-midterms.csv") |>

  ggplot() + 

  aes(county_name = str_to_title(desc_county)) + 

  geom_county()

```

We see that there are actually undiscovered counties, as exact name matching can be a little dicy.  Using fips which would probably perform better, which is possible with any plot data with an input fips column (I decided to skip adding fips even though I had a cross-walk)

We can do the following 'stamp' convenience layer to get the full map. I think of this as an annotation layer - it doesn't refer to global data, but 'brings its own data'. annotate_county is just too long.  

```{r}

read.csv("nc-midterms.csv") |>

  ggplot() + 

  aes(county_name = str_to_title(desc_county)) + 

  stamp_county(fill = 'darkgrey')

```

Then we use geom_county(), which reflects your data and the success of the underlying join process.  

```{r}

last_plot() + 

  geom_county()

```

Then look at population choropleth (fill = n) and highlight Mecklenburg with convenience annotation layer 'stamp_county'

```{r}

options(scipen = 10)

last_plot() + 

  aes(fill = n/100000)

```

highlight at county of interest...

```{r}

last_plot() + 

  stamp_county_outline(keep_id = "Mecklenburg", 

                       color = "orange",

                       linewidth = 1)

```

We can add a text layer defaults to ref_data column 1 (id_index setting)...

```{r}

last_plot() +

   geom_county_text(color = "white", 

                    check_overlap = T, 

                    size = 2)

```

We can look at another variable...

```{r}

last_plot() + 

   aes(fill = cd_party) 

```

And another...

```{r}

last_plot() +

  aes(fill = ind_vote)

```

And look at some values for that variable

```{r}

last_plot() +

  aes(label = round(ind_vote, 2))

```

```{r}

knitr::knit_exit()

```

```{r}

sf_df_add_xy_center_coords <- function(sf_df){

sf_df |>

    dplyr::pull(geometry) |>

    sf::st_zm() |>

    sf::st_point_on_surface() ->

  points_sf

the_coords <- do.call(rbind, sf::st_geometry(points_sf)) |>

  tibble::as_tibble() |> setNames(c("x","y"))

cbind(sf_df, the_coords)

}

```

```{r}

nc |> sf_df_add_xy_center_coords()

```

## `sf_df_return_bbox_df()`

Second we have a function that's going to return bounding boxes as a dataframe.  For our reference data we need these xmin, xmax variables for each row in our data.

```{r sf_df_return_bbox_df}

sf_df_return_bbox_df <- function(sf_df){

  

  bb <- sf::st_bbox(sf_df)

  data.frame(xmin = bb[1], ymin = bb[2],

             xmax = bb[3], ymax = bb[4])

}

```

```{r}

nc[10,] |> sf_df_return_bbox_df()

```

## `sf_df_prep_for_stat()`

Finally we bundle this into the user-facing function that will take an sf dataframe and add required columns for display in ggplot2 sf layer.

```{r sf_df_prep_for_stat}

sf_df_prep_for_stat <- function(sf_df, id_col_name = NULL){

  

  sf_df |>

    # using purrr allows us to get bb for each row

    dplyr::mutate(bb =

                    purrr::map(geometry,

                               sf_df_return_bbox_df)) |>

    tidyr::unnest(bb) |>

    data.frame() |>

    sf_df_add_xy_center_coords() ->

  sf_df_w_bb_and_centers

  # use first column as keep/drop column unless otherwise specified

  if(is.null(id_col_name)){id_col_name <- 1} 

  

  sf_df_w_bb_and_centers$id_col <- sf_df_w_bb_and_centers[,id_col_name]

  return(sf_df_w_bb_and_centers)

  

  }

```

##

----

```{r}

knitr::opts_chunk$get(eval = T)

```

## Fully worked example: How you'd use sf2stat to build functionality with scope, region type, and location name 

Let's see how we might recreate the functionality in the ggnorthcarolina package using some templates in this readme.

In the example, the scope of the package is 'northcarolina'.  The region of interest is 'county', and the location names that we are using are the county names.

### Step 00. prep reference data

```{r, eval = F}

usethis::use_data_raw()

```

## 

```{r nc_geo_reference_prep, eval = T}

nc <- sf::st_read(system.file("shape/nc.shp", package="sf"))

geo_reference_northcarolina_county <- nc |>

  dplyr::select(county_name = NAME, fips = FIPS) |>

  sf_df_prep_for_stat(id_col_name = "county_name")

```

```{r, eval = F}

usethis::use_data(geo_reference_northcarolina_county)

```

```{r stat_region_template, eval = F}

compute_panel_scope_region <- function(data, scales, keep_id = NULL, 

                                       drop_id = NULL, stamp = FALSE){

  

  if(!stamp){data <- dplyr::inner_join(data, geo_reference_scope_region)}

  if( stamp){data <- geo_reference_scope_region }

  

  if(!is.null(keep_id)){ data <- dplyr::filter(data, id_col %in% keep_id) }

  if(!is.null(drop_id)){ data <- dplyr::filter(data, !(id_col %in% drop_id)) }

  

  data

  

}

# step 2

StatSfscoperegion <- ggplot2::ggproto(`_class` = "StatSfscoperegion",

                                `_inherit` = ggplot2::Stat,

                                # required_aes = c("fips|county_name"),

                                compute_panel = compute_panel_scope_region,

                               default_aes = ggplot2::aes(label = after_stat(id_col)))

stat_region <- function(

      mapping = NULL,

      data = NULL,

      geom = ggplot2::GeomSf,

      position = "identity",

      na.rm = FALSE,

      show.legend = NA,

      inherit.aes = TRUE,

      crs = "NAD27", # "NAD27", 5070, "WGS84", "NAD83", 4326 , 3857

      ...) {

  c(ggplot2::layer_sf(

              stat = StatSfscoperegion,  # proto object from step 2

              geom = geom,  # inherit other behavior

              data = data,

              mapping = mapping,

              position = position,

              show.legend = show.legend,

              inherit.aes = inherit.aes,

              params = rlang::list2(na.rm = na.rm, ...)

              ),

              

              coord_sf(crs = crs,

                       default_crs = sf::st_crs(crs),

                       datum = crs,

                       default = TRUE)

     )

  }

```

```{r}

readme2pkg::chunk_variants_to_dir(chunk_name = "stat_region_template",

                                  file_name = "stat_county",

                                  replace1 = "scope",

                                  replacements1 = "northcarolina",

                                  replace2 = "region",

                                  replacements2 = "county")

```

# test it out

```{r}

source("./R/stat_county")

library(ggplot2)

nc |>

  sf::st_drop_geometry() |>

  ggplot() +

  aes(fips = FIPS) +

  stat_county() + 

  aes(fill = BIR79)

```

### Make derivitive functions, aliases

```{r geom_region_template, eval = F}

geom_region <- stat_region

geom_region_label <- function(...){stat_region(geom = "text",...)}

stamp_region <- function(...){

  stat_region(stamp = T, 

              data = mtcars,

              aes(fill = NULL, color = NULL, label = NULL, 

                  fips = NULL, region_name = NULL), 

              ...)}

stamp_region_label <- function(...){

  stat_region(stamp = T, 

              geom = "text", 

              data = mtcars, 

              aes(fill = NULL, color = NULL,

                  fips = NULL, region_name = NULL), 

              ...)}

```

```{r}

readme2pkg::chunk_variants_to_dir(chunk_name = "geom_region_template",

                                  file_name = "geom_county",

                                  replace1 = "region",

                                  replacements1 = "county")

```

# try those out

```{r}

source("./R/geom_county")

nc |>

  sf::st_drop_geometry() |>

  ggplot() +

  aes(fips = FIPS) +

  geom_county() + 

  geom_county_label(check_overlap = T,

                    color = "grey85") +

  aes(fill = BIR79) 

last_plot() + 

  stamp_county() + 

  stamp_county_label()

ggplot() + 

  stamp_county()

last_plot() + 

  stamp_county_label(check_overlap = T)

last_plot() + 

  stamp_county(keep_id = "Wake", fill = "darkred")

```

# Wanting even more?

## Stamps for each polygon?

```{r stamp_region_location}

#' Title

#'

#' @param ... 

#'

#' @return

#' @export

#'

#' @examples

stamp_region_location <- function(...){stamp_region(keep_id = 'Location', ...)}

#' Title

#'

#' @param ... 

#'

#' @return

#' @export

#'

#' @examples

stamp_region_label_location <- function(...){stamp_region_label(keep_id = 'Location', ...)}

```

```{r eval=F}

ids <- geo_reference_northcarolina_county$county_name

ids_snake <- tolower(geo_reference_northcarolina_county$county_name) |> 

  stringr::str_replace_all(" ", "_")

readme2pkg::chunk_variants_to_dir(chunk_name = "stamp_region_location", 

                                  file_name = "stamp_county_locations.R",

                                  replace1 = "region",

                                  replacements1 = rep("county", length(ids)),

                              replace2 = "location",

                              replacements2 = ids_snake,

                              replace3 = "Location", 

                              replacements3 = ids)

```

```{r}

source("./R/stamp_county_locations.R")

nc |>

  sf::st_drop_geometry() |>

  ggplot() +

  aes(fips = FIPS) + 

  stamp_county() + 

  stamp_county_ashe(fill = "darkred")

```

# Template functions. Some old ideas that we're moving away from.

These are more of an experiment.  The code to write a layer is multistep and verbose, so maybe providing some templates is a good idea. But maybe this isn't the right place or implementation. 

## `template_compute_panel_code()`

```{r template_compute_panel_code}

template_compute_panel_code <- function(){

  

"compute_panel_geo_XXXX <- function(data, scales, keep_id = NULL, drop_id = NULL){

  

  if(!is.null(keep_id)){ data <- filter(data, id_col %in% keep_id) }

  if(!is.null(drop_id)){ data <- filter(data, !(id_col %in% drop_id)) }

  

  if(!stamp){data <- dplyr::inner_join(data, geo_ref_XXXX)}

  if( stamp){data <- geo_ref_XXXX }

  

  data

  

}" |> cat()

  

}

```

## `template_stat_code()`

```{r template_stat_code}

template_stat_code <- function(){

  

'StatXXXXsf <- ggplot2::ggproto(`_class` = "StatXXXXsf",

                                `_inherit` = ggplot2::Stat,

                                required_aes = c("fips|county_name|XXXX"),

                                compute_panel = compute_panel_geo_XXXX,

                               default_aes = c(label = ggplot2::after_stat(id_col)))' |> cat()

}

```

## `template_layer_code()`

```{r template_layer_code}

template_layer_code <- function(){ 'stat_XXXX <- function(

      mapping = NULL,

      data = NULL,

      geom = ggplot2::GeomSf,

      position = "identity",

      na.rm = FALSE,

      show.legend = NA,

      inherit.aes = TRUE,

      crs = "NAD27", # "NAD27", 5070, "WGS84", "NAD83", 4326 , 3857

      ...) {

  c(ggplot2::layer_sf(

              stat = StatXXXX,  # proto object from step 2

              geom = geom,  # inherit other behavior

              data = data,

              mapping = mapping,

              position = position,

              show.legend = show.legend,

              inherit.aes = inherit.aes,

              params = rlang::list2(na.rm = na.rm, ...)

              ),

              

              coord_sf(crs = crs,

                       default_crs = sf::st_crs(crs),

                       datum = crs,

                       default = TRUE)

     )

  }' |> cat()

}

  

```

# Part II. Packaging and documentation  🚧 ✅ 

## Phase 1. Minimal working package

### Bit A. Created package archetecture, running `devtools::create(".")` in interactive session. 🚧 ✅

```{r, eval = F}

devtools::create(".")

```

### Bit B. Added roxygen skeleton? 🚧 ✅

Use a roxygen skeleton for auto documentation and making sure proposed functions are *exported*.  Generally, early on, I don't do much (anything) in terms of filling in the skeleton for documentation, because things may change.

### Bit C. Managed dependencies ? 🚧 ✅

Package dependencies managed, i.e. `depend::function()` in proposed functions and declared in the DESCRIPTION

```{r, eval = F}

usethis::use_package("sf")

usethis::use_package("dplyr")

usethis::use_package("tibble")

usethis::use_package("tidyr")

usethis::use_package("purrr")

```

### Bit D. Moved functions R folder? 🚧 ✅

Use new {readme2pkg} function to do this from readme...

```{r, eval = F}

readme2pkg::chunk_to_r("sf_df_return_bbox_df")

readme2pkg::chunk_to_r("sf_df_add_xy_center_coords")

readme2pkg::chunk_to_r("sf_df_prep_for_stat")

readme2pkg::chunk_to_r("template_compute_panel_code")

readme2pkg::chunk_to_r("template_stat_code")

readme2pkg::chunk_to_r("template_layer_code")

```

### Bit E. Run `devtools::check()` and addressed errors. 🚧 ✅

```{r, eval = F}

devtools::check(pkg = ".")

```

### Bit F. [Install](https://r-pkgs.org/whole-game.html#install) and restart package 🚧 ✅

```{r, eval = F}

devtools::build()

```

### Bit G. Write traditional README that uses built package (also serves as a test of build. 🚧 ✅ 

The goal of the {xxxx} package is to ...

Install package with:

```

remotes::installgithub("EvaMaeRey/readme2pkg.template")

```

Once functions are exported you can remove go to two colons, and when things are are really finalized, then go without colons (and rearrange your readme...)

```{r, eval = F}

library(sf2stat)  ##<< change to your package name here

nc <- sf::st_read(system.file("shape/nc.shp", package="sf"))

nc |>

  dplyr::select(county_name = NAME, fips = FIPS) |>

  sf2stat::sf_df_prep_for_stat(id_col_name = "county_name") ->

nc_geo_reference

```

### Bit H. Chosen a [license](https://r-pkgs.org/license.html)? 🚧 ✅

```{r, eval = F}

usethis::use_mit_license()

```

### Bit I. Add [lifecycle badge](https://r-pkgs.org/lifecycle.html) (experimental)

```{r, eval = F}

usethis::use_lifecycle_badge("experimental")

```

## Phase 2: Listen & iterate 🚧 ✅

Try to get feedback from experts on API, implementation, default decisions.  Is there already work that solves this problem? 

## Phase 3: Let things settle

### Bit A. Settle on [examples](https://r-pkgs.org/man.html#sec-man-examples).  Put them in the roxygen skeleton and readme. 🚧 ✅

### Bit B. Written formal [tests](https://r-pkgs.org/testing-basics.html) of functions and save to test that folders 🚧 ✅

That would look like this... 

```{r test_calc_times_two_works, eval = F}

library(testthat)

test_that("calc times 2 works", {

  expect_equal(times_two(4), 8)

  expect_equal(times_two(5), 10)

  

})

```

```{r, eval = F}

readme2pkg::chunk_to_tests_testthat("test_calc_times_two_works")

```

### Bit C. Added a description and author information in the DESCRIPTION file 🚧 ✅

### Bit D. Addressed *all* notes, warnings and errors. 🚧 ✅

## Phase 4. Promote to wider audience...

### Bit A. Package website built? 🚧 ✅

### Bit B. Package website deployed? 🚧 ✅

## Phase 5: Harden/commit

### Submit to CRAN/RUniverse? 🚧 ✅

# Appendix: Reports, Environment

## Edit Description file

```{r, eval = F}

readLines("DESCRIPTION")

```

## Environment

Here I just want to print the packages and the versions

```{r}

all <- sessionInfo() |> print() |> capture.output()

all[11:17]

```

## `devtools::check()` report

```{r, error = T, results="hide", warning=F, eval=F}

devtools::check(pkg = ".")

```