Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/mjskay/ggblend

Support for blend modes in ggplot2
https://github.com/mjskay/ggblend

Last synced: 3 months ago
JSON representation

Support for blend modes in ggplot2

Lists

README

        

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
fig.path = "man/figures/README-",
dev.args = list(png = list(type = "cairo")),
fig.retina = 2
)
```

# ggblend: Blending and compositing algebra for ggplot2

[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![CRAN status](https://www.r-pkg.org/badges/version/ggblend)](https://CRAN.R-project.org/package=ggblend)
[![Codecov test coverage](https://codecov.io/gh/mjskay/ggblend/branch/main/graph/badge.svg)](https://app.codecov.io/gh/mjskay/ggblend?branch=main)
[![R-CMD-check](https://github.com/mjskay/ggblend/workflows/R-CMD-check/badge.svg)](https://github.com/mjskay/ggblend/actions)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7963886.svg)](https://doi.org/10.5281/zenodo.7963886)

*ggblend* is a small algebra of operations for blending, copying, adjusting, and
compositing layers in *ggplot2*. It allows you to easily copy and adjust the
aesthetics or parameters of an existing layer, to partition a layer into multiple
pieces for re-composition, and to combine layers (or partitions of layers) using
blend modes (like `"multiply"`, `"overlay"`, etc).

*ggblend* requires R ≥ 4.2, as blending and compositing support was added in that
version of R.

## Installation

You can install *ggblend* from CRAN as follows:

```r
install.packages("ggblend")
```

You can install the development version of *ggblend* using:

```r
remotes::install_github("mjskay/ggblend")
```

## Blending within one geometry

We'll construct a simple dataset with two semi-overlapping point clouds. We'll
have two versions of the dataset: one with all the `"a"` points listed first,
and one with all the `"b"` points listed first.

```{r data, message=FALSE, warning=FALSE}
library(ggplot2)
library(ggblend)
theme_set(ggdist::theme_ggdist() + theme(
plot.title = element_text(size = rel(1), lineheight = 1.1, face = "bold"),
plot.subtitle = element_text(face = "italic"),
panel.border = element_rect(color = "gray75", fill = NA)
))

set.seed(1234)
df_a = data.frame(x = rnorm(500, 0), y = rnorm(500, 1), set = "a")
df_b = data.frame(x = rnorm(500, 1), y = rnorm(500, 2), set = "b")

df_ab = rbind(df_a, df_b) |>
transform(order = "draw a then b")

df_ba = rbind(df_b, df_a) |>
transform(order = "draw b then a")

df = rbind(df_ab, df_ba)
```

A typical scatterplot of such data suffers from the problem that how many
points appear to be in each group depends on the drawing order (*a then b*
versus *b then a*):

```{r scatter_noblend}
df |>
ggplot(aes(x, y, color = set)) +
geom_point(size = 3, alpha = 0.5) +
scale_color_brewer(palette = "Set1") +
facet_grid(~ order) +
labs(title = "geom_point() without blending", subtitle = "Draw order matters.")
```

A *commutative* blend mode, like `"multiply"` or `"darken"`, is one potential
solution that does not depend on drawing order. We can apply a `blend()`
operation to geom_point()` to achieve this. There three ways to do this:

- `blend(geom_point(...), "multiply")` (normal function application)
- `geom_point(...) |> blend("multiply")` (piping)
- `geom_point(...) * blend("multiply")` (algebraic operations)

Function application and piping are equivalent. **In this case**, all three
approaches are equivalent. As we will see later, the multiplication approach
is useful when we want a shorthand for applying the same operation to multiple
layers in a list without combining those layers first (in other words,
multiplication of operations over layers is *distributive* in an algebraic
sense).

```{r scatter_blend}
df |>
ggplot(aes(x, y, color = set)) +
geom_point(size = 3, alpha = 0.5) |> blend("multiply") +
scale_color_brewer(palette = "Set1") +
facet_grid(~ order) +
labs(
title = "geom_point(alpha = 0.5) |> blend('multiply')",
subtitle = "Draw order does not matter, but color is too dark."
)
```

Now the output is identical no matter the draw order, although the output is quite dark.

## Partitioning layers

Part of the reason the output is very dark above is that all of the points are being
multiply-blended together. When many objects (here, individual points) are multiply-blended on top of each
other, the output tends to get dark very quickly.

However, we really only need the two sets to be multiply-blended with each other.
Within each set, we can use regular alpha blending. To do that, we can partition the geometry
by `set` and then blend. Each partition will be blended normally within the set, and
then the resulting sets will be multiply-blended together just once:

```{r scatter_partition_blend}
df |>
ggplot(aes(x, y, color = set)) +
geom_point(size = 3, alpha = 0.5) |> partition(vars(set)) |> blend("multiply") +
scale_color_brewer(palette = "Set1") +
facet_grid(~ order) +
labs(
title = "geom_point(alpha = 0.5) |> partition(vars(set)) |> blend('multiply')",
subtitle = "Light outside the intersection, but still dark inside the intersection."
)
```

That's getting there: points outside the intersection of the two sets look good,
but the intersection is still a bit dark.

Let's try combining two blend modes to address this: we'll use a `"lighten"`
blend mode (which is also commutative) to make the overlapping regions
lighter, and then draw the `"multiply"`-blended version on top at an `alpha`
of less than 1:

```{r scatter_lighten_multiply}
df |>
ggplot(aes(x, y, color = set)) +
geom_point(size = 3, alpha = 0.5) |> partition(vars(set)) |> blend("lighten") +
geom_point(size = 3, alpha = 0.5) |> partition(vars(set)) |> blend("multiply", alpha = 0.5) +
scale_color_brewer(palette = "Set1") +
facet_grid(~ order) +
labs(
title =
"geom_point(size = 3, alpha = 0.5) |> partition(vars(set)) |> blend('lighten') + \ngeom_point(size = 3, alpha = 0.5) |> partition(vars(set)) |> blend('multiply', alpha = 0.5)",
subtitle = 'A good compromise, but a long specification.'
) +
theme(plot.subtitle = element_text(lineheight = 1.2))
```

Now it's a little easier to see both overlap and density, and the output remains independent of draw order.

However, it is a little verbose to need to copy out a layer multiple times:

```r
geom_point(size = 3, alpha = 0.5) |> partition(vars(set)) * blend("lighten") +
geom_point(size = 3, alpha = 0.5) |> partition(vars(set)) * blend("multiply", alpha = 0.5) +
```

We can simplify this is two ways: first, `partition(vars(set))` is equivalent
to setting `aes(partition = set)`, so we can move the partition specification
into the global plot aesthetics, since it is the same on every layer.

Second, operations and layers in *ggblend* act as a small algebra. Operations and sums
of operations can be multiplied by layers and lists of layers, and those
operations are distributed over the layers (This is where `*` and `|>` differ:
`|>` does not distribute operations like `blend()` over layers, which is
useful if you want to use a blend to combine multiple layers together, rather
than applying that blend to each layer individually).

Thus, we can "factor out"
`geom_point(size = 3, alpha = 0.5)` from the above expression, yielding this:

```r
geom_point(size = 3, alpha = 0.5) * (blend("lighten") + blend("multiply", alpha = 0.5))
```

Both expressions are equivalent. Thus we can rewrite the previous example
like so:

```{r scatter_lighten_multiply_stacked}
df |>
ggplot(aes(x, y, color = set, partition = set)) +
geom_point(size = 3, alpha = 0.5) * (blend("lighten") + blend("multiply", alpha = 0.5)) +
scale_color_brewer(palette = "Set1") +
facet_grid(~ order) +
labs(
title = "geom_point(aes(partition = set)) * (blend('lighten') + blend('multiply', alpha = 0.5))",
subtitle = "Two order-independent blends on one layer using the distributive law."
) +
theme(plot.subtitle = element_text(lineheight = 1.2))
```

## Blending multiple geometries

We can also blend geometries together by passing a list of geometries to `blend()`.
These lists can include already-blended geometries:

```{r scatter_blend_geom_incorrect}
df |>
ggplot(aes(x, y, color = set, partition = set)) +
list(
geom_point(size = 3, alpha = 0.5) * (blend("lighten") + blend("multiply", alpha = 0.5)),
geom_vline(xintercept = 0, color = "gray75", linewidth = 1.5),
geom_hline(yintercept = 0, color = "gray75", linewidth = 1.5)
) |> blend("hard.light") +
scale_color_brewer(palette = "Set1") +
facet_grid(~ order) +
labs(
title = "Blending multiple geometries together in a list",
subtitle = "Careful! The point layer blend is incorrect!"
)
```

Whoops!! If you look closely, the blending of the `geom_point()` layers appears to
have changed. Recall that this expression:

```r
geom_point(size = 3, alpha = 0.5) * (blend("lighten") + blend("multiply", alpha = 0.5))
```

Is equivalent to specifying two separate layers, one with `blend("lighten")`
and the other with `blend("multiply", alpha = 0.65))`. Thus, when you apply
`|> blend("hard.light")` to the `list()` of layers, it will use a hard light
blend mode to blend these two layers together, when previously they would be
blended using the normal (or `"over"`) blend mode.

We can gain back the original appearance by blending these two layers together
with `|> blend()` prior to applying the hard light blend:

```{r scatter_blend_geom}
df |>
ggplot(aes(x, y, color = set, partition = set)) +
list(
geom_point(size = 3, alpha = 0.5) * (blend("lighten") + blend("multiply", alpha = 0.5)) |> blend(),
geom_vline(xintercept = 0, color = "gray75", linewidth = 1.5),
geom_hline(yintercept = 0, color = "gray75", linewidth = 1.5)
) |> blend("hard.light") +
scale_color_brewer(palette = "Set1") +
facet_grid(~ order) +
labs(title = "Blending multiple geometries together")
```

## Partitioning and blending lineribbons

Another case where it's useful to have finer-grained control of blending within a given
geometry is when drawing overlapping uncertainty bands. Here, we'll show how to use `blend()` with `stat_lineribbon()`
from [ggdist](https://mjskay.github.io/ggdist/)
to create overlapping gradient ribbons depicting uncertainty.

We'll fit a model:

```{r m_mpg}
m_mpg = lm(mpg ~ hp * cyl, data = mtcars)
```

And generate some confidence distributions for the mean using [distributional](https://pkg.mitchelloharawild.com/distributional/):

```{r lineribbon}
predictions = unique(mtcars[, c("cyl", "hp")])

predictions$mu_hat = with(predict(m_mpg, newdata = predictions, se.fit = TRUE),
distributional::dist_student_t(df = df, mu = fit, sigma = se.fit)
)

predictions
```

A basic plot based on examples in `vignette("freq-uncertainty-vis", package = "ggdist")` and
`vignette("lineribbon", package = "ggdist")` may have issues when lineribbons overlap:

```{r lineribbon_noblend}
predictions |>
ggplot(aes(x = hp, fill = ordered(cyl), color = ordered(cyl))) +
ggdist::stat_lineribbon(
aes(ydist = mu_hat, fill_ramp = after_stat(.width)),
.width = ppoints(40)
) +
geom_point(aes(y = mpg), data = mtcars) +
scale_fill_brewer(palette = "Set2") +
scale_color_brewer(palette = "Dark2") +
ggdist::scale_fill_ramp_continuous(range = c(1, 0)) +
labs(
title = "ggdist::stat_lineribbon()",
subtitle = "Overlapping lineribbons obscure each other.",
color = "cyl", fill = "cyl", y = "mpg"
)
```

Notice the overlap of the orange (`cyl = 6`) and purple (`cyl = 8`) lines.

If we add a `partition = cyl` aesthetic mapping, we can blend the geometries
for the different levels of `cyl` together with a `blend()` call around
`ggdist::stat_lineribbon()`.

There are many ways we could add the partition to the plot:

1. Add `partition = cyl` to the existing `aes(...)` call. However, this
leaves the partitioning information far from the call to `blend()`, so the
relationship between them is less clear.
2. Add `aes(partition = cyl)` to the `stat_lineribbon(...)` call. This is
a more localized change (better!), but will raise a warning if `stat_lineribbon()`
itself does not recognized the `partition` aesthetic.
3. Add `|> adjust(aes(partition = cyl))` after `stat_lineribbon(...)` to
add the `partition` aesthetic to it (this will bypass the warning).
4. Add `|> partition(vars(cyl))` after `stat_lineribbon(...)` to add the
`partition` aesthetic. This is an alias for the `adjust()` approach that is
intended to be clearer. It takes a specification for a partition that is
similar to `facet_wrap()`: either a one-sided formula or a call to `vars()`.

Let's try the fourth approach:

```{r lineribbon_blend}
predictions |>
ggplot(aes(x = hp, fill = ordered(cyl), color = ordered(cyl))) +
ggdist::stat_lineribbon(
aes(ydist = mu_hat, fill_ramp = after_stat(.width)),
.width = ppoints(40)
) |> partition(vars(cyl)) |> blend("multiply") +
geom_point(aes(y = mpg), data = mtcars) +
scale_fill_brewer(palette = "Set2") +
scale_color_brewer(palette = "Dark2") +
ggdist::scale_fill_ramp_continuous(range = c(1, 0)) +
labs(
title = "ggdist::stat_lineribbon() |> partition(vars(cyl)) |> blend('multiply')",
subtitle = "Overlapping lineribbons blend together independent of draw order.",
color = "cyl", fill = "cyl", y = "mpg"
)
```

Now the overlapping ribbons are blended together.

## Highlighting geoms using `copy_under()`

A common visualization technique to make a layer more salient (especially in the
presence of many other competing layers) is to add a small outline around
it. For some geometries (like `geom_point()`) this is easy; but for others (like `geom_line()`),
there's no easy way to do this without manually copying the layer.

The *ggblend* layer algebra makes this straightforward using the `adjust()` operation
combined with operator addition and multiplication. For example, given a layer
like:

```r
geom_line(linewidth = 1)
```

To add a white outline, you might want something like:

```r
geom_line(color = "white", linewidth = 2.5) + geom_line(linewidth = 1)
```

However, we'd rather not have to write the `geom_line()` specification twice
If we factor out the differences between the first and second layer, we can use
the `adjust()` operation (which lets you change the aesthetics and parameters
of a layer) along with the distributive law to factor out
`geom_line(linewidth = 1)` and write the above specification as:

```r
geom_line(linewidth = 1) * (adjust(color = "white", linewidth = 2.5) + 1)
```

The `copy_under(...)` operation, which is a synonym for `adjust(...) + 1`,
also implements this pattern:

```r
geom_line(linewidth = 1) * copy_under(color = "white", linewidth = 2.5)
```

Here's an example highlighting the fit lines from our previous lineribbon example:

```{r lineribbon_blend_highlight}
predictions |>
ggplot(aes(x = hp, fill = ordered(cyl), color = ordered(cyl))) +
ggdist::stat_ribbon(
aes(ydist = mu_hat, fill_ramp = after_stat(.width)),
.width = ppoints(40)
) |> partition(vars(cyl)) |> blend("multiply") +
geom_line(aes(y = median(mu_hat)), linewidth = 1) |> copy_under(color = "white", linewidth = 2.5) +
geom_point(aes(y = mpg), data = mtcars) +
scale_fill_brewer(palette = "Set2") +
scale_color_brewer(palette = "Dark2") +
ggdist::scale_fill_ramp_continuous(range = c(1, 0)) +
labs(
title = "geom_line() |> copy_under(color = 'white', linewidth = 2.5)",
subtitle = "Highlights the line layer without manually copying its specification.",
color = "cyl", fill = "cyl", y = "mpg"
)
```

Note that the implementation of `copy_under(...)` is simply a synonym for
`adjust(...) + 1`; we can see this if we look at `copy_under()` itself:

```{r}
copy_under()
```

In fact, not that it is particularly useful, but addition and multiplication
of layer operations is expanded appropriately:

```{r}
(adjust() + 3) * 2
```

I hesitate to imagine what that feature might be useful for...

## Compatibility with other packages

In theory *ggblend* should be compatible with other packages, though in more
complex cases (blending lists of geoms or using the `partition` aesthetic)
it is possible it may fail, as these features are a bit more hackish. I have
done some testing with a few other layer-manipulating packages---including
[gganimate](https://gganimate.com/), [ggnewscale](https://eliocamp.github.io/ggnewscale/),
and [relayer](https://github.com/clauswilke/relayer)---and they appear to be
compatible.

As a hard test, here is all three features applied to a modified version of the
Gapminder example used in the [gganimate documentation](https://gganimate.com/):

```{r gapminder, message=FALSE, warning=FALSE}
library(gganimate)
library(gapminder)

p = gapminder |>
ggplot(aes(gdpPercap, lifeExp, size = pop, color = continent)) +
list(
geom_point(show.legend = c(size = FALSE)) |> partition(vars(continent)) |> blend("multiply"),
geom_hline(yintercept = 70, linewidth = 1.5, color = "gray75")
) |> blend("hard.light") +
scale_color_manual(
# same as colorspace::lighten(continent_colors, 0.35)
values = c(
Africa = "#BE7658", Americas = "#E95866", Asia = "#7C5C86",
Europe = "#659C5D", Oceania = "#7477CA"
),
guide = guide_legend(override.aes = list(size = 4))
) +
scale_size(range = c(2, 12)) +
scale_x_log10(labels = scales::label_dollar(scale_cut = scales::cut_short_scale())) +
scale_y_continuous(breaks = seq(20, 80, by = 10)) +
labs(
title = 'Gapminder with gganimate and ggblend',
subtitle = 'Year: {frame_time}',
x = 'GDP per capita',
y = 'Life expectancy'
) +
transition_time(year) +
ease_aes('linear')

animate(p, type = "cairo", width = 600, height = 400, res = 100)
```