Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/rpolars/r-polarssql

A Polars backend for R DBI and dbplyr
https://github.com/rpolars/r-polarssql

dplyr dplyr-sql-backends polars r

Last synced: 2 months ago
JSON representation

A Polars backend for R DBI and dbplyr

Awesome Lists containing this project

README

        

---
output:
github_document:
html_preview: false
---

```{r}
#| include: false

knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```

# polarssql

[![polarssql status badge](https://rpolars.r-universe.dev/badges/polarssql)](https://rpolars.r-universe.dev)
[![CRAN status](https://www.r-pkg.org/badges/version/polarssql)](https://CRAN.R-project.org/package=polarssql)

`{polarssql}` is an experimental DBI-compliant interface to [Polars](https://www.pola.rs/).

Polars is not an actual database, so does not support full `{DBI}` functionality.
Please check [the Polars User Guide](https://pola-rs.github.io/polars/user-guide/sql/intro/)
for supported SQL features.

## Installation

The [`polars`](https://rpolars.github.io/) R package and `{polarssql}`
can be installed from [R-universe](https://rpolars.r-universe.dev/):

```r
Sys.setenv(NOT_CRAN = "true") # for installing the polars package with pre-built binary
install.packages("polarssql", repos = c("https://rpolars.r-universe.dev", getOption("repos")))
```

## Example

```{r}
library(DBI)

con <- dbConnect(polarssql::polarssql())
dbWriteTable(con, "mtcars", mtcars)

# We can fetch all results:
res <- dbSendQuery(con, "SELECT * FROM mtcars WHERE cyl = 4")
dbFetch(res)

# Clear the result
dbClearResult(res)

# Or a chunk at a time
res <- dbSendQuery(con, "SELECT * FROM mtcars WHERE cyl = 4")
while (!dbHasCompleted(res)) {
chunk <- dbFetch(res, n = 5)
print(nrow(chunk))
}

# Clear the result
dbClearResult(res)

# We can use table functions to read files directly:
tf <- tempfile(fileext = ".parquet")
on.exit(unlink(tf))
polars::as_polars_lf(mtcars)$sink_parquet(tf)

dbGetQuery(con, paste0("SELECT * FROM read_parquet('", tf, "') ORDER BY mpg DESC LIMIT 3"))
```

`{polarssql}` also provides functions that are simpler to use, inspired by the `{duckdb}` package,

```{r}
library(polarssql)

# These functions use the built-in connection by default, so we don't need to specify connection

# Resgister a data.frame to the built-in connection
polarssql_register(df = mtcars)

# Get the query result as a polars LazyFrame
polarssql_query("SELECT * FROM df WHERE cyl = 4")

# Unregister the table
polarssql_unregister("df")
```

And, basic supports for `{dbplyr}` is also implemented.

```{r}
library(dplyr, warn.conflicts = FALSE)

# Resgister a data.frame to the built-in connection, and query it via dbplyr
tbl_polarssql(mtcars) |>
filter(cyl == 4) |>
arrange(desc(mpg)) |>
head(3) |>
compute()
```