https://github.com/etiennebacher/tidypolars
Get the power of polars with the syntax of the tidyverse
- Host: GitHub
- URL: https://github.com/etiennebacher/tidypolars
- Owner: etiennebacher
- License: other
- Created: 2023-06-02T17:04:15.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-31T09:32:34.000Z (23 days ago)
- Last Synced: 2025-04-01T22:18:49.099Z (21 days ago)
- Language: R
- Homepage: https://tidypolars.etiennebacher.com
- Size: 13.6 MB
- Stars: 199
- Watchers: 4
- Forks: 5
- Open Issues: 10
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- Contributing: .github/CONTRIBUTING.Rmd
- License: LICENSE
Awesome Lists containing this project
- awesome-polars - tidypolars for R
README
---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

---

:information_source: This is the R package "tidypolars". The Python one is here: [markfairbanks/tidypolars](https://github.com/markfairbanks/tidypolars)

---

## Overview
`tidypolars` provides a [`polars`](https://rpolars.github.io/) backend for the
`tidyverse`. The aim of `tidypolars` is to enable users to keep their existing
`tidyverse` code while using `polars` in the background to benefit from large
performance gains. The only thing that needs to change is the way data is
imported in the R session.

See the ["Getting started" vignette](https://tidypolars.etiennebacher.com/articles/tidypolars)
for a gentle introduction to `tidypolars`.

Since most of the work is rewriting `tidyverse` code into `polars` syntax,
`tidypolars` and `polars` have very similar performance.

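To illustrate the point that only the import step changes, here is a minimal sketch (not a benchmark) that reuses the same verbs as the rest of this README on the built-in `iris` data. The object names `iris_pl`, `short_sepals` and `short_sepals_df` are made up for this illustration, and the final `as.data.frame()` conversion is an assumption about how you might want to inspect the result.

```r
library(polars)
library(tidypolars)
library(dplyr, warn.conflicts = FALSE)

# The only polars-specific step: import the data as a polars LazyFrame
# instead of keeping it as a regular data.frame.
iris_pl <- as_polars_lf(iris)

# From here on this is ordinary dplyr syntax; tidypolars translates it
# into polars expressions in the background.
short_sepals <- iris_pl |>
  select(starts_with(c("Sep", "Pet"))) |>
  mutate(
    petal_type = ifelse((Petal.Length / Petal.Width) > 3, "long", "large")
  ) |>
  filter(between(Sepal.Length, 4.5, 5.5)) |>
  compute() # run the lazy query; the result is still a polars object

# Assumed last step: convert back to a base R data.frame for inspection.
short_sepals_df <- as.data.frame(short_sepals)
head(short_sepals_df)
```

Starting from a LazyFrame means the whole pipeline can be optimized by `polars` before anything is computed, which is why `compute()` only appears at the end.
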
The main purpose of the small benchmark below is to show that `polars` and
`tidypolars` are close and to give a rough idea of the performance. For more
thorough, representative benchmarks about `polars`, take a look at
[DuckDB benchmarks](https://duckdblabs.github.io/db-benchmark/) instead.

```{r}
library(collapse, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)
library(polars)
library(tidypolars)

large_iris <- data.table::rbindlist(rep(list(iris), 100000))
large_iris_pl <- as_polars_lf(large_iris)
large_iris_dt <- lazy_dt(large_iris)

format(nrow(large_iris), big.mark = ",")

bench::mark(
  polars = {
    large_iris_pl$
      select(c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"))$
      with_columns(
        pl$when(
          (pl$col("Petal.Length") / pl$col("Petal.Width") > 3)
        )$then(pl$lit("long"))$
          otherwise(pl$lit("large"))$
          alias("petal_type")
      )$
      filter(pl$col("Sepal.Length")$is_between(4.5, 5.5))$
      collect()
  },
  tidypolars = {
    large_iris_pl |>
      select(starts_with(c("Sep", "Pet"))) |>
      mutate(
        petal_type = ifelse((Petal.Length / Petal.Width) > 3, "long", "large")
      ) |>
      filter(between(Sepal.Length, 4.5, 5.5)) |>
      compute()
  },
  dplyr = {
    large_iris |>
      select(starts_with(c("Sep", "Pet"))) |>
      mutate(
        petal_type = ifelse((Petal.Length / Petal.Width) > 3, "long", "large")
      ) |>
      filter(between(Sepal.Length, 4.5, 5.5))
  },
  dtplyr = {
    large_iris_dt |>
      select(starts_with(c("Sep", "Pet"))) |>
      mutate(
        petal_type = ifelse((Petal.Length / Petal.Width) > 3, "long", "large")
      ) |>
      filter(between(Sepal.Length, 4.5, 5.5)) |>
      as.data.frame()
  },
  collapse = {
    large_iris |>
      fselect(c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")) |>
      fmutate(
        petal_type = data.table::fifelse((Petal.Length / Petal.Width) > 3, "long", "large")
      ) |>
      fsubset(Sepal.Length >= 4.5 & Sepal.Length <= 5.5)
  },
  check = FALSE,
  iterations = 40
)

# NOTE: do NOT take the "mem_alloc" results into account.
# `bench::mark()` doesn't report the accurate memory usage for packages calling
# Rust code.
```

## Installation

`tidypolars` is built on `polars`, which is not available on CRAN. This means
that `tidypolars` can't be on CRAN either. However, you can install it from
R-multiverse.

```{r eval=FALSE}
Sys.setenv(NOT_CRAN = "true")
install.packages(
  "tidypolars",
  repos = c("https://community.r-multiverse.org", "https://cloud.r-project.org")
)
```

## Contributing

Did you find some bugs or some errors in the documentation? Do you want
`tidypolars` to support more functions?

Take a look at the [contributing guide](https://tidypolars.etiennebacher.com/CONTRIBUTING.html) for instructions
on bug reports and pull requests.

## Acknowledgements

The website theme was heavily inspired by Matthew Kay's `ggblend` package: <https://mjskay.github.io/ggblend/>.

The package hex logo was created by Hubert Hałun as part of the Appsilon Hex
Contest.