https://github.com/markjrieke/nplyr
nplyr: a grammar of (nested) data manipulation :bird:
- Host: GitHub
- URL: https://github.com/markjrieke/nplyr
- Owner: markjrieke
- License: other
- Created: 2022-04-21T01:35:22.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-06-27T18:25:50.000Z (over 1 year ago)
- Last Synced: 2024-10-25T07:29:49.690Z (4 months ago)
- Language: R
- Homepage: https://markjrieke.github.io/nplyr/
- Size: 10.3 MB
- Stars: 118
- Watchers: 6
- Forks: 3
- Open Issues: 10
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
README
---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%",
  message = FALSE,
  warning = FALSE
)
```

# nplyr
**Author:** [Mark Rieke](https://www.thedatadiary.net/about/about.html)
**License:** [MIT](https://github.com/markjrieke/nplyr/blob/main/LICENSE)

## Overview
`{nplyr}` is a grammar of nested data manipulation that allows users to perform [dplyr](https://dplyr.tidyverse.org/)-like manipulations on data frames nested within a list-col of another data frame. Most dplyr verbs have nested equivalents in nplyr. A (non-exhaustive) list of examples:
* `nest_mutate()` is the nested equivalent of `mutate()`
* `nest_select()` is the nested equivalent of `select()`
* `nest_filter()` is the nested equivalent of `filter()`
* `nest_summarise()` is the nested equivalent of `summarise()`
* `nest_group_by()` is the nested equivalent of `group_by()`

As of version 0.2.0, nplyr also supports nested versions of some [tidyr](https://tidyr.tidyverse.org/) functions:
* `nest_drop_na()` is the nested equivalent of `drop_na()`
* `nest_extract()` is the nested equivalent of `extract()`
* `nest_fill()` is the nested equivalent of `fill()`
* `nest_replace_na()` is the nested equivalent of `replace_na()`
* `nest_separate()` is the nested equivalent of `separate()`
* `nest_unite()` is the nested equivalent of `unite()`

nplyr is largely a wrapper for dplyr. For the most up-to-date information on dplyr, please visit [dplyr's website](https://dplyr.tidyverse.org). If you are new to dplyr, the best place to start is the [data transformation chapter](https://r4ds.had.co.nz/transform.html) in *R for Data Science*.
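To make the "wrapper" idea concrete, here is a minimal sketch of what a nested verb does conceptually: it applies the corresponding dplyr verb to every data frame in the list-col. The toy data (`mtcars` nested by `cyl`) and the names `cars_nest`, `car_data`, and `wt_kg` are assumptions chosen for illustration, and the hand-written `purrr::map()` version is an approximation of the behavior, not a claim about nplyr's internals.

```r
library(dplyr)
library(nplyr)

# toy nested data (illustrative): one row per cylinder count,
# with the remaining mtcars columns nested in `car_data`
cars_nest <- tidyr::nest(mtcars, car_data = -cyl)

# nplyr: apply mutate() inside each nested data frame
# (mtcars records `wt` in units of 1000 lbs, so this converts to kilograms)
with_nplyr <-
  cars_nest %>%
  nest_mutate(car_data, wt_kg = wt * 453.592)

# roughly the same thing by hand: map dplyr::mutate() over the list-col
by_hand <-
  cars_nest %>%
  mutate(car_data = purrr::map(car_data, ~ mutate(.x, wt_kg = wt * 453.592)))

# compare the two results
all.equal(with_nplyr, by_hand)
```

The nested verb keeps the pipeline in the familiar dplyr style and avoids writing the `purrr::map()` call for every step.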
## Installation
You can install the released version of nplyr from CRAN or the development version from GitHub with the [devtools](https://cran.r-project.org/package=devtools) or [remotes](https://cran.r-project.org/package=remotes) package:
```{r, eval=FALSE}
# install from CRAN
install.packages("nplyr")

# install from GitHub
devtools::install_github("markjrieke/nplyr")
```

## Usage
To get started, we'll create a nested column for the country data within each continent from the [gapminder](https://CRAN.R-project.org/package=gapminder) dataset.
```{r}
library(nplyr)

gm_nest <-
  gapminder::gapminder_unfiltered %>%
  tidyr::nest(country_data = -continent)

gm_nest
```

dplyr can perform operations on the top-level data frame, but with nplyr, we can perform operations on the nested data frames:
```{r}
gm_nest_example <-
  gm_nest %>%
  nest_filter(country_data, year == max(year)) %>%
  nest_mutate(country_data, pop_millions = pop/1000000)

# each nested tibble is now filtered to the most recent year
gm_nest_example

# if we unnest, we can see that a new column for pop_millions has been added
gm_nest_example %>%
  slice_head(n = 1) %>%
  tidyr::unnest(country_data)
```

nplyr also supports grouped operations with `nest_group_by()`:
```{r}
gm_nest_example <-
  gm_nest %>%
  nest_group_by(country_data, year) %>%
  nest_summarise(
    country_data,
    n = n(),
    lifeExp = median(lifeExp),
    pop = median(pop),
    gdpPercap = median(gdpPercap)
  )

gm_nest_example

# unnesting shows summarised tibbles for each continent
gm_nest_example %>%
  slice(2) %>%
  tidyr::unnest(country_data)
```

More examples can be found in the package vignettes and function documentation.
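As one further illustration, the tidyr-style verbs listed in the overview follow the same pattern as the dplyr-style ones. The sketch below is a minimal example on a small made-up tibble; the data and the names `toy`, `toy_nest`, `values`, and `toy_clean` are hypothetical, and the argument order is assumed to mirror `tidyr::drop_na()` and `tidyr::replace_na()` with the nested column inserted as the second argument.

```r
library(dplyr)
library(nplyr)

# hypothetical toy data with missing values
toy <- tibble(
  group = c("a", "a", "a", "b", "b"),
  x     = c(1, NA, 3, NA, 5),
  y     = c("u", "v", NA, "w", "x")
)

# nest everything except `group` into a list-col named `values`
toy_nest <- tidyr::nest(toy, values = -group)

# inside each nested data frame: drop rows where x is missing,
# then replace any remaining missing y with a placeholder
toy_clean <-
  toy_nest %>%
  nest_drop_na(values, x) %>%
  nest_replace_na(values, replace = list(y = "unknown"))

# unnest to inspect the cleaned rows
toy_unnested <- tidyr::unnest(toy_clean, values)
toy_unnested
```

Because each step returns the outer data frame with the list-col updated, nested dplyr-style and tidyr-style verbs can be chained freely in one pipeline.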
## Bug reports/feature requests
If you notice a bug, want to request a new feature, or have recommendations on improving documentation, please [open an issue](https://github.com/markjrieke/nplyr/issues) in the package repository.