Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.
Awesome Lists | Featured Topics | Projects
https://github.com/evamaerey/ggtextextra

get colors from existing plot and apply to text in title via ggtext functionality
https://github.com/evamaerey/ggtextextra
Last synced: 12 days ago
JSON representation
get colors from existing plot and apply to text in title via ggtext functionality
Host: GitHub
URL: https://github.com/evamaerey/ggtextextra
Owner: EvaMaeRey
Created: 2024-04-24T21:09:13.000Z (10 months ago)
Default Branch: main
Last Pushed: 2024-05-02T19:18:26.000Z (10 months ago)
Last Synced: 2024-12-14T20:24:28.608Z (2 months ago)
Language: R
Homepage: https://evamaerey.github.io/ggtextExtra/
Size: 1.38 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.Rmd
Awesome Lists containing this project

README

        ---

output: 

  github_document:

    toc: TRUE

    toc_depth: 2

---

```{r, include = FALSE}

knitr::opts_chunk$set(

  collapse = TRUE,

  comment = "#>",

  eval = T,

  error = T

)

```

> When participants were shown visualizations with titles during encoding, the titles were fixated 59% of the time. Correspondingly, during the recall phase of the experiment, titles were the element most likely to be described if present (see Fig. 8). When presented visualizations with titles, 46% of the time the participants described the contents, or rewording, of the title. - Beyond Memorability: Visualization Recognition and Recall

> Across all textual elements, the title is among the most important. A good or a bad title can sometimes make all of the difference between a visualization that is recalled correctly from one that is not - Beyond Memorability: Visualization Recognition and Recall https://vcg.seas.harvard.edu/files/pfister/files/infovis_submission251-camera.pdf

> https://youtu.be/L5_4kuoiiKU?si=ViZVNn981ulbB09b&t=352

# Part 000. Abstract

ggplot2 users sometimes forego a color or fill legend, instead coloring the text of categories in a title or annotation as they make mention of them using ggtext.

But the process for doing this, I believe, can be somewhat tedious and error prone, because you have to type your own html color tags. The color tags also might make source composed title text less readable.

I admire these plots, but maybe have made them once because they require some stamina and focus to choreograph the colored text coordination.

Below we ‘crack open ggplot2 internals’ to try to streamline this process. Our objective is that text coloring automatically matches whatever ggplot2 is doing with color and fill scales.

# Part 00. Proposal

This repo proposes the {ggtextExtra} package. 🦄 

The goal of {ggtextExtra} is to make matching colors text written via ggtext moves to ggplot2 layers (geoms that are filled or colored).

There's just one function that's worked out so far (matching fill color for title), but I think it's a useful one and could be easily extended to other areas (geom_textbox, color).

Without the package, we can do something like the following 🏋:

```{r}

library(tidyverse)

palmerpenguins::penguins  |>

 ggplot(aes(flipper_length_mm, body_mass_g,

             fill = species)) +

  geom_point(size = 7, alpha = .8, shape = 21, color = "white") +

  scale_fill_manual(values = c("plum4", "seagreen", "cadetblue")) +

  labs(title = "Gentoo are our big penguins") +

  theme(plot.title = ggtext::element_markdown())

```

An approach that automates the coordination a *bit* more can be found here. https://albert-rapp.de/posts/16_html_css_for_r/16_html_css_for_r

```{r, eval = F}

library(tidyverse)

colors <- thematic::okabe_ito(2)

names(colors) <- c('male', 'female')

palmerpenguins::penguins |> 

  filter(!is.na(sex)) |> 

  ggplot(aes(bill_length_mm, flipper_length_mm, fill = sex)) +

  geom_point(size = 3, alpha = 0.75, col = 'black', shape = 21) +

  labs(

    x = 'Bill length (in mm)',

    y = 'Flipper length (in mm)',

    title = glue::glue('Measurements of **male** and **female** penguins')

  ) +

  scale_fill_manual(values = colors)

```

But this requires a bit more in the way of presteps and writing text in the html tags; these tags also might make the readability of the title even more challenging.

---

  

With the {ggtextExtra} package, we'll live in a different world (🦄 🦄 🦄) where the task a little more automatic: 

Proposed API will:

1. Build a ggplot in the usual way. I.e. do whatever you like with color/fill mapping to categories

2. apply a function to that automatically checks the colors used to represent categories and applies those in the title where the categories appear. (via ggtext::element_markdown).

What that would look like in code: 

```

my_title <- "The **setosa** irises has the largest average sapel widths 
and then comes **virginica** irises while
**versicolor** has the shortest sapel width"

palmerpenguins::penguins  |>

 ggplot(aes(flipper_length_mm, body_mass_g,

             fill = species)) +

  geom_point(shape = 21, 

             color = "white", size = 8, alpha = .9) +

  scale_fill_viridis_d(end = .8, begin = .2) +

  labs(title = my_title) +

  theme(plot.title = ggtext::element_markdown())

  

use_fill_scale_in_title_words(plot = last_plot()) + 

  guides(fill = "none")  

```

# Part 0. Lay out package infrastructure

```{r, eval = F}

devtools::create(".")

```

# Part 00 Explore ggplot object internals

Now let's 'crack into ggplot2 internals' to see if we can get to a match between categories and colors programmatically.

First, we'll look at the colors actually rendered in the layer data of the plot using `layer_data`. These are often (always?) stored as hex colors which will definitely work in html/markdown context, whereas you have to be a little careful with some R named colors not working at all in html.

## grab fill color and associated group with `layer_data`

First, we'll use `layer_data` to grab fill and group.

```{r}

plot <- last_plot()

fill_values_df <- layer_data(plot, i = 1) %>%  .[,c("fill", "group")] |> distinct()

fill_values_df

```

## grab the name of the variable that's mapped to fill

Then we'll grab the name of the variable that's mapped to fill. This seems a little weird, but seems to works. Open to different approaches that might be more robust!

```{r}

fill_var_name <- plot$mapping$fill |> capture.output() %>% .[2] %>% str_extract("\\^.+") %>% str_remove("\\^")

fill_var_name

```

## grab the categories in the mapped variable

Then we can grab the actual vector of data that's being represented by fill color - the `plot$data` slot. We put this in a dataframe/tibble, and then used `distinct` to get a one-to-one category-group table.

```{r}

fill_var_df <- plot$data[fill_var_name] %>% distinct()

fill_var_df

names(fill_var_df) <- "fill_var"

fill_var_df <- mutate(fill_var_df, group = as.numeric(fill_var))

fill_var_df

```

## create a color-category crosswalk

Then we join our colors and categories by group, and have a color-category one-to-one table. And then we can prepare an html statement that will make the category colorful when rendered by ggtext functionality.

```{r}

left_join(fill_values_df, fill_var_df, by = join_by(group)) %>% 

  mutate(html_replacements = 

           paste0("", fill_var, "") )

```

# Part I. Work out functionality  🚧 ✅ 

The above looks pretty promising so let's put this in a function. The function will let us go straight from a *ggplot2 plot object* to a *category-color-htmlreplcement data frame*.

```{r grab_fill_info}

grab_fill_info <- function(plot = last_plot(), i = NULL){

  

  if(is.null(i)){i <- length(plot$layers)}

  

fill_values_df <- ggplot2::layer_data(plot, i = i) %>%  

  .[,c("fill", "group")] |> 

  dplyr::distinct()

fill_var_name <- plot$mapping$fill |> 

  capture.output() %>% .[2] %>% 

  stringr::str_extract("\\^.+") %>% 

  stringr::str_remove("\\^")

fill_var_df <- plot$data[,fill_var_name] |> 

  dplyr::distinct()

names(fill_var_df) <- "fill_var"

fill_var_df <- fill_var_df |> mutate(group = as.numeric(as.factor(fill_var)))

dplyr::left_join(fill_values_df, 

                 fill_var_df, 

                 by = dplyr::join_by(group)) %>% 

  dplyr::mutate(html_replacements = 

                  paste0("", 

                         .data$fill_var, 

                         "") ) %>% 

  ggplot2::remove_missing()  # where category is NA

  

}

```

We can test this out with the plot we saved before:

```{r}

grab_fill_info(plot = plot)

```

## A first cut at find-replace to add html color tags automatically to a string

And then we'll use the data frame output, to make replacements in a string, adding the html tags. We'll just get it done with a for loop.  One danger, that I'm leaving for later, is that you might have a categories like 'anana' (this means pineapple in Portuguese and maybe some other languages) and 'banana' (this means banana in Portuguese and maybe some other languages).  In this case, you'll have a bad result given the current implementation. (Anana has an acent on the final syllable in Portuguese you might actually be saved!)  male/female is the same problem - but not as nice of a tangent.

```{r}

fill_df <- grab_fill_info(plot)

for(i in 1:nrow(fill_df)){

  start <- "Gentoo have the longest average flippers"

  

  start <- start |> stringr::str_replace(fill_df$fill_var[i] %>% as.character(), fill_df$html_replacements[i])

start %>% print()

  

}

```

Let's put the for loop in a function:

```{r auto_color_html}

auto_color_html <- function(x, fill_df ){

  

 for(i in 1:nrow(fill_df)){

  x <- x |> stringr::str_replace(fill_df$fill_var[i] %>% as.character, fill_df$html_replacements[i])

  

 }

  

  x

  

}

```

Test it out...

```{r}

auto_color_html("The Gentoo is a cool penguin", grab_fill_info(plot))

```

## Try out constituent functions w/ a fresh plot

Now let's use our functions with a fresh plot, q. 

```{r}

palmerpenguins::penguins  |>

 ggplot(aes(flipper_length_mm, body_mass_g,

             fill = species)) +

  geom_point(shape = 21, color = "white", size = 8) +

  scale_fill_viridis_d(end = .8, begin = .2) +

  labs(title = "The **Adelie** iris has the smallest average sapel width
and the **Gentoo** irises have largest average sapel width


       while **Chinstrap** is in-between") +

  theme(plot.title = ggtext::element_markdown())

q <- last_plot()

q_fill_df <- grab_fill_info(q)

colorful_title <- "The **Adelie** penguin has the *smallest* average flipper length
and the **Gentoo** have *largest* length
 while **Chinstrap** species is in-between" |> 

        auto_color_html(q_fill_df)

q + 

  labs(title = colorful_title)  #overwriting title

```

# Further bundling

Looks good.  What if we wrap everything, and just replace the title with an html color-tagged version. 

```{r use_fill_scale_in_title_words}

use_fill_scale_in_title_words <- function(plot, i = NULL){

  

  out <- plot

  

  if(!is.null(i)){i <- length(plot$layers)}  # looks at last layer for fill colors

  

  plot_fill_df <- grab_fill_info(plot, i = i)

  

  out$labels$title <- out$labels$title |> 

        auto_color_html(plot_fill_df)

  return(out)

  

}

```

Try it out starting fresh.

```{r}

my_title <- "The **Adelie** penguin has the *smallest* average flipper length
and the **Gentoo** have *largest* length
 while **Chinstrap** species is in-between" 

palmerpenguins::penguins |>

 ggplot() +

  aes(x = flipper_length_mm, 

      y = body_mass_g,

      fill = species) +

  geom_point(shape = 21, 

             color = "white", size = 8, alpha = .9) +

  scale_fill_viridis_d(end = .8, begin = .2) +

  labs(title = my_title) +

  theme(plot.title = ggtext::element_markdown())

use_fill_scale_in_title_words(plot = last_plot()) + 

  guides(fill = "none")

```

```{r}

# Option 2: Read directly from GitHub

wwbi_data <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-04-30/wwbi_data.csv')

wwbi_series <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-04-30/wwbi_series.csv')

wwbi_country <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-04-30/wwbi_country.csv')

head(wwbi_data)

head(wwbi_series)

head(wwbi_country)

wwbi_series %>% 

  filter(indicator_code == "BI.EMP.PWRK.PB.HE.ZS") %>% 

  .[2] ->

y_var

wwbi_data %>% 

  filter(indicator_code == "BI.EMP.PWRK.PB.HE.ZS"	 ) %>% 

  left_join(wwbi_country) %>% 

  filter(region == "Latin America & Caribbean") %>% 

  mutate(is_select_country = short_name %in% 

                                     c("Peru","Colombia")) %>% 

  mutate(select_countries = ifelse(is_select_country,

                                   short_name, NA)) %>% 

  arrange(is_select_country) %>% 

  ggplot() + 

  labs(title = y_var %>% 

         paste("
in Latin America, highlighting **Peru** and **Colombia**")) +

  aes(x = year, y = value, fill = select_countries, color = select_countries) +

  # geom_text(aes(label = country_code)) +

  geom_line(aes(group = country_code), alpha = .5) + 

  geom_point(shape = 21, size = 6, color = "white") + 

  scale_fill_manual(values = c("darkseagreen4", "plum4"), na.value = "grey70") + 

  scale_color_manual(values = c("darkseagreen4", "plum4"), na.value = "grey70") + 

  scale_fill_viridis_d(na.value = "grey70", end = .7, begin = .1) + 

  # scale_fill_viridis_d() +  # no color..

  NULL

use_fill_scale_in_title_words(last_plot()) +

  theme(plot.title = ggtext::element_markdown()) +

  guides(fill = "none", color = "none") 

last_plot()+ 

  facet_wrap(~country_code)

```

# Issues/Notes:

- fill mapping should be globally declared

- data should be globally declared

- fill colors are taken from the last (top) layer unless i is changed.

- naive string replacement issue

# Part II. Packaging and documentation

## Phase 1. Minimal working package ✅  

To build a minimal working package, 1) we'll need to document dependencies 2) send functions to .R files in the R folder, 3) do a package check (this will A. Document our functions and B. this will help us identify problems - for example if we've failed to declare a dependency) and 4) install the package locally.  

Then we'll write up a quick advertisement for what our package is able to do in the 'traditional readme' section.  This gives us a chance to test out the package.

```{r, eval = F}

### Bit 2a: in the function(s) you wrote above make sure dependencies to functions using '::' syntax to pkg functions 

usethis::use_package("ggplot2") # Bit 2b: document dependencies, w hypothetical ggplot2

usethis::use_package("dplyr")

usethis::use_package("stringr")

usethis::use_package("tibble")

usethis::use_pipe()

```

```{r}

knitrExtra::chunk_names_get()

# Bit 3: send the code chunk with function to R folder

knitrExtra:::chunk_to_r(chunk_name = "grab_fill_info") 

knitrExtra:::chunk_to_r(chunk_name = "auto_color_html")

knitrExtra:::chunk_to_r(chunk_name = "use_fill_scale_in_title_words")

```

```{r, eval = F}

# Bit 4: document functions and check that package is minimally viable

devtools::check(pkg = ".")  

# Bit 5: install package locally

devtools::install(pkg = ".", upgrade = "never") 

```

### Bit 7. Write traditional README that uses built package (also serves as a test of build). ✅ 

The goal of the {ggtextExtra} package is to ...

Install package with:

```

remotes::install_github("EvaMaeRey/ggtextExtra")

```

Once functions are exported you can remove go to two colons, and when things are are really finalized, then go without colons (and rearrange your readme...)

```{r, eval = T}

library(ggtextExtra)

my_title <- "The **Adelie** penguin has the *smallest* average flipper length
and the **Gentoo** have *largest* length
 while **Chinstrap** species is in-between"

palmerpenguins::penguins |>

 ggplot() +

  aes(x = flipper_length_mm, 

      y = body_mass_g,

      fill = species) +

  geom_point(shape = 21, 

             color = "white", size = 8, alpha = .9) +

  scale_fill_viridis_d(end = .8, begin = .2) +

  labs(title = my_title) +

  theme(plot.title = ggtext::element_markdown())

ggtextExtra:::use_fill_scale_in_title_words(plot = last_plot()) + 

  guides(fill = "none")

```

# Bit 8: Compile readme

# Bit 9: Push to github

# Bit 10: listen and iterate

## Phase 3: Settling and testing 🚧 

### Bit A. Added a description and author information in the [DESCRIPTION file](https://r-pkgs.org/description.html) 🚧

### Bit B. Added [roxygen skeleton](https://r-pkgs.org/man.html) for exported functions. 🚧 

### Bit D. Settle on [examples](https://r-pkgs.org/man.html#sec-man-examples).  Put them in the roxygen skeleton and readme. 🚧 

### Bit C. Chosen a [license](https://r-pkgs.org/license.html)? 🚧 

```{r, eval = F}

usethis::use_mit_license()

```

### Bit D. Use life-cycle badge

```{r, eval = F}

usethis::use_lifecycle_badge("experimental") 

```

### Bit E. Written formal [tests](https://r-pkgs.org/testing-basics.html) of functions and save to test that folders 🚧 

That would look like this...

```{r test_calc_times_two_works, eval = F}

library(testthat)

test_that("calc times 2 works", {

  expect_equal(times_two(4), 8)

  expect_equal(times_two(5), 10)

  

})

```

```{r, eval = F}

knitrExtra::chunk_to_tests_testthat("test_calc_times_two_works")

```

### Bit F. Check again. Addressed notes, warnings and errors. 🚧 

```{r, eval = F}

devtools::check(pkg = ".")

```

## Phase 4. Promote to wider audience...  🚧 

### Bit A. Package website built? 🚧 

### Bit B. Package website deployed? 🚧 

## Phase 5: Harden/commit: Submit to CRAN/RUniverse 🚧 

# Appendix: Reports, Environment

## Description file complete?  🚧 

```{r, eval = F}

readLines("DESCRIPTION")

```

## Environment  🚧 

Here I just want to print the packages and the versions

```{r}

all <- sessionInfo() |> print() |> capture.output()

all[11:17]

```

## `devtools::check()` report

```{r, eval=F, error = T, results="hide", warning=F}

devtools::check(pkg = ".")

```

## Package directory file tree

```{r}

fs::dir_tree(recurse = T)

```