An open API service indexing awesome lists of open source software.

https://github.com/nacnudus/armgin

R package to summarise all margins of data frames
https://github.com/nacnudus/armgin

Last synced: 15 days ago
JSON representation

R package to summarise all margins of data frames

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```

[![lifecycle](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://www.tidyverse.org/lifecycle/#retired)
[![Travis build status](https://travis-ci.org/nacnudus/armgin.svg?branch=master)](https://travis-ci.org/nacnudus/armgin)
[![AppVeyor build status](https://ci.appveyor.com/api/projects/status/github/nacnudus/armgin?branch=master&svg=true)](https://ci.appveyor.com/project/nacnudus/armgin)
[![Coverage status](https://codecov.io/gh/nacnudus/armgin/branch/master/graph/badge.svg)](https://codecov.io/github/nacnudus/armgin?branch=master)
[![CRAN status](https://www.r-pkg.org/badges/version/armgin)](https://cran.r-project.org/package=armgin)

# armgin

The armgin package summarises all combinations of grouping variables at once.
For example, vote counts by

```
* voting place, ward, constituency, party, candidate
* voting place, ward, constituency, party
* voting place, ward, constituency
* voting place, ward,
* voting place
```

Or with `method = "combination"` and grouping by only voting place, ward and
constituency (for brevity)

```
* voting place, ward, constituency

* voting place, ward
* voting place, constituency
* ward, constituency

* voting place
* ward
* constituency
```

Groups are defined as normal by `dplyr::group_by()` and then piped into
`armgin::margins()`.

```{r}
library(dplyr)
library(armgin)

mtcars %>%
group_by(cyl, gear, am) %>%
margins(mpg = mean(mpg),
hp = min(hp)) %>%
print(n = Inf)
```

Output individual data frames with `bind = FALSE`.

```{r}
mtcars %>%
group_by(cyl, gear, am) %>%
margins(mpg = mean(mpg),
hp = min(hp),
bind = FALSE)
```

## Caveat

Please don't use this package. You can easily implement the functions yourself.
If you really want to use a package then use
[groupie](https://github.com/nacnudus/groupie), which is the same but better.

The trick is to treat sets of grouping variables as 'data', map over them to
create copies of your actual data grouped by each set of variables, and then map
again with a `summarise()` function or whatever.

```{r}
library(dplyr)
library(purrr)

grouping_vars <- c("vs", "am", "carb")
map(grouping_vars, group_by_at, .tbl = mtcars) %>%
map(summarise,
'6 cylinder' = sum(cyl == 6),
'Large disp' = sum(disp >= 100),
'low gears' = sum(gear <= 4))
```

The [groupie](https://github.com/nacnudus/groupie) package also wraps the
[arrangements](https://cran.r-project.org/package=arrangements) package to
create sets of grouping variables. This again is very simple to do yourself.

## Installation

You can install the development version from GitHub with devtools or remotes.

```r
install.packages("devtools")
devtools::install_github("nacnudus/armgin")
```

## Motivating example

What are the salaries of various jobs, grades and professions in the UK's Joint
Nature Conservation Committee?

```{r}
library(armgin)

library(dplyr)
library(tidyr)
library(readr)

# organogram.csv is from "https://data.gov.uk/sites/default/files/organogram/joint-nature-conservation-committee/30/9/2018/20180930%20JNCC-junior.csv",

organogram <-
read_csv("organogram.csv") %>%
replace_na(list(Grade = "Other"))
```

It could be a hierarchy with the whole JNCC Organisation at the top, then Grade
within that, then Profession as the bottom grouping.

```{r}
organogram %>%
group_by(Organisation, Grade, `Professional/Occupational Group`) %>%
margins(min = min(`Payscale Minimum (£)`),
max = max(`Payscale Maximum (£)`),
fte = sum(`Number of Posts in FTE`)) %>%
mutate_if(is.character, replace_na, "All") %>%
print(n = Inf)
```

Or it could be a different hierarchy, with Grade grouped inside Profession.

```{r}
organogram %>%
group_by(Organisation, `Professional/Occupational Group`, Grade) %>%
margins(min = min(`Payscale Minimum (£)`),
max = max(`Payscale Maximum (£)`),
fte = sum(`Number of Posts in FTE`)) %>%
mutate_if(is.character, replace_na, "All") %>%
print(n = Inf)
```

Or there could be no hierarchy, with all combinations of Organisation, Grade and
Profession equally valid subsets.

```{r}
organogram %>%
group_by(Organisation, Grade, `Professional/Occupational Group`) %>%
margins(min = min(`Payscale Minimum (£)`),
max = max(`Payscale Maximum (£)`),
fte = sum(`Number of Posts in FTE`),
method = "combination") %>%
mutate_if(is.character, replace_na, "All") %>%
print(n = Inf)
```

Or each one of Organisation, Grade and Profession could be summarised
individually.

```{r}
organogram %>%
group_by(Organisation, Grade, `Professional/Occupational Group`) %>%
margins(min = min(`Payscale Minimum (£)`),
max = max(`Payscale Maximum (£)`),
fte = sum(`Number of Posts in FTE`),
method = "individual") %>%
mutate_if(is.character, replace_na, "All") %>%
print(n = Inf)
```

## Contributing

Please note that the 'armgin' project is released with a [Contributor Code of
Conduct](.github/CODE_OF_CONDUCT.md). By contributing to this project, you agree
to abide by its terms.