https://github.com/romainfrancois/dance

tibble() dancing 💃

Host: GitHub
URL: https://github.com/romainfrancois/dance
Owner: romainfrancois
License: other
Created: 2019-02-22T20:37:36.000Z (almost 7 years ago)
Default Branch: master
Last Pushed: 2019-11-20T08:19:25.000Z (about 6 years ago)
Last Synced: 2025-04-13T00:42:21.223Z (10 months ago)
Topics: dance, dplyr, r, rstats, tibble
Language: R
Homepage:
Size: 428 KB
Stars: 45
Watchers: 4
Forks: 6
Open Issues: 17
Metadata Files:
- Readme: README.Rmd
- License: LICENSE

Awesome Lists containing this project

awesome-dataframes - dance - Dancing 💃 with the stats, aka `tibble()` dancing 🕺. dance is a sort of reinvention of dplyr classic verbs, with a more modern stack underneath, i.e. it leverages a lot from vctrs and rlang. (Libraries)
jimsghstars - romainfrancois/dance - tibble() dancing 💃 (R)

README

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# dance 

[![Lifecycle Status](https://img.shields.io/badge/lifecycle-experimental-blue.svg)](https://www.tidyverse.org/lifecycle/)

[![Travis build status](https://travis-ci.org/romainfrancois/dance.svg?branch=master)](https://travis-ci.org/romainfrancois/dance)

![](https://media.giphy.com/media/mnLGTXoWVzAfm/giphy.gif)

Dancing `r emo::ji("woman_dancing")` with the stats, aka `tibble()` dancing `r emo::ji("man_dancing")`. 

`dance` is a sort of reinvention of the classic `dplyr` verbs, with a more modern stack underneath: it leans heavily on `vctrs` and `rlang`.

# Installation

You can install the development version from GitHub.

```{r, eval=FALSE}
# install.packages("pak")
pak::pkg_install("romainfrancois/dance")
```

# Usage

We'll illustrate tibble dancing with `iris` grouped by `Species`. 

```{r example}
library(dance)
g <- iris %>% group_by(Species)
```

### waltz(), polka(), tango(), charleston()

These are in the neighborhood of `dplyr::summarise()`. 

`waltz()` takes a grouped tibble and a list of formulas and returns a tibble with one column per supplied formula and one row per group. It does not prepend the grouping variables (see `tango()` for that).

  

```{r}
g %>%
  waltz(
    Sepal.Length = ~mean(Sepal.Length),
    Sepal.Width  = ~mean(Sepal.Width)
  )
```

`polka()` deals with peeling off one layer of grouping: 

```{r}
g %>%
  polka()
```

`tango()` binds the results of `polka()` and `waltz()`, so it is the closest to `dplyr::summarise()`:

```{r}
g %>%
  tango(
    Sepal.Length = ~mean(Sepal.Length),
    Sepal.Width  = ~mean(Sepal.Width)
  )
```
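
For readers coming from dplyr, the `tango()` call above should correspond roughly to the `summarise()` call sketched below. This is a comparison under assumptions, not output from dance: it uses only dplyr, called with explicit `dplyr::` prefixes so that nothing gets masked, and the variable names `by_species` and `summarised` are made up for illustration.

```r
# Comparison sketch using dplyr only (dance is not called here).
by_species <- dplyr::group_by(iris, Species)

# One row per Species: the grouping column first, then one column per summary.
summarised <- dplyr::summarise(
  by_species,
  Sepal.Length = mean(Sepal.Length),
  Sepal.Width  = mean(Sepal.Width)
)

summarised
```

Dropping the `Species` column from `summarised` would give the shape described for `waltz()`: one row per group and one column per formula.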

`charleston()` is like `tango()`, but it packs the new columns in a tibble:

```{r}
g %>%
  charleston(
    Sepal.Length = ~mean(Sepal.Length),
    Sepal.Width  = ~mean(Sepal.Width)
  )
```

### swing, twist

There is no `waltz_at()`, `tango_at()`, etc. Instead, we can apply either the same function to a set of columns or a set of functions to the same column. For this, we need to learn new dance moves.

`swing()` and `twist()` apply the same function to a set of columns:

```{r}
library(tidyselect)

g %>%
  tango(swing(mean, starts_with("Petal")))

g %>%
  tango(data = twist(mean, starts_with("Petal")))
```

They differ in the type of column they create and in how the columns are named:

 - `swing()` makes as many new columns as the tidy selection selects. The columns are named using a `.name` glue pattern, so we can `swing()` several times.

   

```{r}
g %>%
  tango(
    swing(mean, starts_with("Petal"), .name = "mean_{var}"),
    swing(median, starts_with("Petal"), .name = "median_{var}"),
  )
```

 - `twist()` instead creates a single data frame column. 

 

```{r}
g %>%
  tango(
    mean   = twist(mean, starts_with("Petal")),
    median = twist(median, starts_with("Petal")),
  )
```
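
For comparison with dplyr, the two `swing()` calls with a `.name` pattern shown earlier map roughly onto `dplyr::across()` with its `.names` argument. The sketch below uses only dplyr (version 1.0.0 or later is assumed for `across()`), and `by_species` and `petal_stats` are illustrative names. Note that dplyr's glue placeholder is `{.col}`, where the dance examples use `{var}`.

```r
# Comparison sketch using dplyr only (dance is not called here).
by_species <- dplyr::group_by(iris, Species)

# Apply mean and then median to every column whose name starts with "Petal",
# naming each result from a glue pattern.
petal_stats <- dplyr::summarise(
  by_species,
  dplyr::across(dplyr::starts_with("Petal"), mean, .names = "mean_{.col}"),
  dplyr::across(dplyr::starts_with("Petal"), median, .names = "median_{.col}")
)

names(petal_stats)
```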

The first argument of `swing()` and `twist()` is either a function or a formula that uses `.` as a placeholder. Subsequent arguments are tidyselect selections.

You can combine `swing()` and `twist()` in the same `tango()` or `waltz()`: 

```{r}
g %>%
  tango(
    swing(mean, starts_with("Petal"), .name = "mean_{var}"),
    median = twist(median, contains("."))
  )
```

### rumba, zumba

Similarly, `rumba()` and `zumba()` apply several functions to a single column. `rumba()` creates one column per function, and `zumba()` packs them into a data frame column.

```{r}
g %>%
  tango(
    rumba(Sepal.Width, mean = mean, median = median, .name = "Sepal_{fun}"),
    Petal = zumba(Petal.Width, mean = mean, median = median)
  )
```
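
As a point of comparison, the `rumba()` call above looks close to `dplyr::across()` given a named list of functions. The sketch below uses only dplyr (1.0.0 or later assumed), with `by_species` and `sepal_stats` as illustrative names. dplyr's placeholder for the function name is `{.fn}`, where the dance example uses `{fun}`.

```r
# Comparison sketch using dplyr only (dance is not called here).
by_species <- dplyr::group_by(iris, Species)

# Several functions applied to one column; the names of the list entries
# feed the {.fn} placeholder in the naming pattern.
sepal_stats <- dplyr::summarise(
  by_species,
  dplyr::across(
    Sepal.Width,
    list(mean = mean, median = median),
    .names = "Sepal_{.fn}"
  )
)

names(sepal_stats)
```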

### salsa, chacha, samba, madison

Now we enter the realm of `dplyr::mutate()` with:

 - `salsa()`: to create new columns
 - `chacha()`: to reorganize a grouped tibble so that data for each group is contiguous
 - `samba()`: `chacha()` + `salsa()`

```{r}
g %>%
  salsa(
    Sepal = ~Sepal.Length * Sepal.Width,
    Petal = ~Petal.Length * Petal.Width
  )
```

You can `swing()`, `twist()`, `rumba()` and `zumba()` here too, and if you also want the original data, use `samba()` instead of `salsa()`:

```{r}
g %>%
  samba(centered = twist(~ . - mean(.), everything(), -Species))
```

`madison()` packs the columns `salsa()` would have created:

```{r}
g %>%
  madison(swing(~ . - mean(.), starts_with("Sepal")))
```

### bolero and mambo

`bolero()` is similar to `dplyr::filter()`. The formulas can be built with `mambo()` if you want to apply the same predicate to a tidy selection of columns:

```{r}
g %>%
  bolero(~Sepal.Width > 4)

g %>%
  bolero(mambo(~. > 4, starts_with("Sepal")))

g %>%
  bolero(mambo(~. > 4, starts_with("Sepal"), .op = or))
```
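
In dplyr terms, the two `mambo()` calls resemble `filter()` combined with `if_all()` and `if_any()`. The sketch below uses only dplyr (1.0.4 or later assumed for `if_all()` and `if_any()`). It also assumes that `mambo()` combines the per-column predicates with "and" by default, which the `.op = or` variant suggests but this README does not state. The names `by_species`, `all_gt4` and `any_gt4` are illustrative.

```r
# Comparison sketch using dplyr only (dance is not called here).
by_species <- dplyr::group_by(iris, Species)

# Keep rows where every Sepal.* column is greater than 4.
all_gt4 <- dplyr::filter(
  by_species,
  dplyr::if_all(dplyr::starts_with("Sepal"), ~ .x > 4)
)

# Keep rows where at least one Sepal.* column is greater than 4.
any_gt4 <- dplyr::filter(
  by_species,
  dplyr::if_any(dplyr::starts_with("Sepal"), ~ .x > 4)
)

nrow(all_gt4)
nrow(any_gt4)
```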
