https://github.com/jhelvy/factorlock
Set factors levels based on row order
https://github.com/jhelvy/factorlock
factors ggplot ggplot2 r rstats rstats-package tidyverse
Last synced: 5 months ago
JSON representation
Set factors levels based on row order
- Host: GitHub
- URL: https://github.com/jhelvy/factorlock
- Owner: jhelvy
- Created: 2023-05-10T15:24:30.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-05-10T20:36:34.000Z (over 2 years ago)
- Last Synced: 2025-08-18T08:02:09.227Z (5 months ago)
- Topics: factors, ggplot, ggplot2, r, rstats, rstats-package, tidyverse
- Language: R
- Homepage: https://jhelvy.github.io/factorlock/
- Size: 6.52 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
Awesome Lists containing this project
README
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
warning = FALSE,
message = FALSE,
comment = "#>",
fig.path = "man/figures/",
fig.retina = 3
)
```
[](https://CRAN.R-project.org/package=factorlock)
[](https://app.travis-ci.com/github/jhelvy/factorlock)
[](https://cran.r-project.org/package=factorlock)
# Installation
The current version is not yet on CRAN, but you can install it from
Github using the {remotes} package:
```{r, eval=FALSE}
# install.packages("remotes")
remotes::install_github("jhelvy/factorlock")
```
Load the library with:
```{r}
library(factorlock)
```
# Usage
The only thing this package does is provide the `factorlock::lock_factors()` function to make it easier to reorder factors in a data frame according to the row order.
By default, bar charts made with {ggplot2} follow alphabetical ordering:
```{r}
#| label: bars1
#| fig.width: 6
#| fig.height: 4
library(ggplot2)
library(dplyr)
mpg |>
count(manufacturer) |>
ggplot() +
geom_col(aes(x = n, y = manufacturer))
```
If you wanted to sort the bars based on `n`, many people make the mistake of sorting the _data frame_ and assuming the reordered rows will pass through to the bars, like this:
```{r}
#| label: bars2
#| fig.width: 6
#| fig.height: 4
mpg |>
count(manufacturer) |>
arrange(n) |>
ggplot() +
geom_col(aes(x = n, y = manufacturer))
```
**But this produces the same chart!** 🤦
Instead, to sort the bars based on `n` you have to reorder the **factor levels**:
```{r}
#| label: bars-reorder
#| fig.width: 6
#| fig.height: 4
mpg |>
count(manufacturer) |>
ggplot() +
geom_col(aes(x = n, y = reorder(manufacturer, n)))
```
I find this rather unintuitive and difficult to remember, let alone confusing because the ordering of the rows in the data frame won't match the factor ordering (which is rather opaque to the user).
{factorlock} provides an alternative approach by allowing you to "lock" the factor ordering to that of the row ordering in the data frame:
```{r}
#| label: bars-locked
#| fig.width: 6
#| fig.height: 4
mpg |>
count(manufacturer) |>
arrange(n) |>
factorlock::lock_factors() |>
ggplot() +
geom_col(aes(x = n, y = manufacturer))
```
Notice that you also get better axis label names for free here since the `y` variable is still mapped to just `manufacturer` instead of `reorder(manufacturer, n)`.
By default all character or factor type variables are "locked", but you can also specify which variables you want to "lock" while leaving the others alone:
```{r}
#| label: bars-locked-named
#| fig.width: 6
#| fig.height: 4
mpg |>
count(manufacturer) |>
arrange(n) |>
factorlock::lock_factors("manufacturer") |>
ggplot() +
geom_col(aes(x = n, y = manufacturer))
```
You can also get a reverse factor ordering with `rev = TRUE`:
```{r}
#| label: bars-locked-rev
#| fig.width: 6
#| fig.height: 4
mpg |>
count(manufacturer) |>
arrange(n) |>
factorlock::lock_factors(rev = TRUE) |>
ggplot() +
geom_col(aes(x = n, y = manufacturer))
```
# Author, Version, and License Information
- Author: *John Paul Helveston* https://www.jhelvy.com/
- Date First Written: *May 10, 2023*
- License: [MIT](https://github.com/jhelvy/factorlock/blob/master/LICENSE.md)
# Citation Information
If you use this package for in a publication, I would greatly appreciate it if you cited it - you can get the citation by typing `citation("factorlock")` into R:
```{r}
citation("factorlock")
```
