Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/rstudio/connections

https://rstudio.github.io/connections/
https://github.com/rstudio/connections

connection-pane database-connection pins r rstudio

Last synced: 3 months ago
JSON representation

https://rstudio.github.io/connections/

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r, include = FALSE}
if (dir.exists("~/pins")) unlink("~/pins", recursive = TRUE)

knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%",
eval = TRUE
)

add_figure <- function(path, width = 400) {
char_html <- paste0("
")
htmltools::HTML(char_html)
}

toc <- function() {
re <- readLines("README.Rmd")
has_title <- as.logical(lapply(re, function(x) substr(x, 1, 2) == "##"))
only_titles <- re[has_title]
titles <- trimws(gsub("#", "", only_titles))
links <- trimws(gsub("`", "", titles))
links <- tolower(links)
links <- trimws(gsub(" ", "-", links))
links <- trimws(gsub(",", "", links))
toc_list <- lapply(
seq_along(titles),
function(x) {
pad <- ifelse(substr(only_titles[x], 1, 3) == "###", " - ", " - ")
paste0(pad, "[", titles[x], "](#", links[x], ")")
}
)
toc_full <- paste(toc_list, collapse = "\n")
cat(toc_full)
}
```

# connections

[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![R-CMD-check](https://github.com/rstudio/connections/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/rstudio/connections/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/rstudio/connections/branch/main/graph/badge.svg)](https://app.codecov.io/gh/rstudio/connections?branch=main)
[![CRAN status](https://www.r-pkg.org/badges/version/connections)](https://CRAN.R-project.org/package=connections)

- [Installation](#installation)
- [Functions](#functions)
- [Uploading and referencing tables with `dplyr`](#uploading-and-referencing-tables-with-dplyr)
- [`pins`](#pins)
- [Pin a database connection](#pin-a-database-connection)
- [Pin a `dplyr` database query](#pin-a-dplyr-database-query)
- [Full `pins` example](#full-pins-example)
- [Back-end examples](#back-end-examples)
- [BigQuery, via `bigrquery`](#bigquery-via-bigrquery)
- [PostgreSQL, via `RPostgres`](#postgresql-via-rpostgres)
- [`DBI` connections](#dbi-connections)

The main goal of `connections` is to integrate `DBI`-compliant packages with the
RStudio IDE's [Connection
Pane](https://solutions.posit.co/connections/db/tooling/connections/). Packages such as
[RPostgres](https://github.com/r-dbi/RPostgres), [RSQLite](https://github.com/r-dbi/RSQLite),
[RMariaDB](https://github.com/r-dbi/RMariaDB) and
[bigrquery](https://github.com/r-dbi/bigrquery) connect R to those databases, but
do not provide a direct integration with the Connections Pane. `connections`
reads the configuration of the connection and creates the integration with RStudio.

A second goal is to provide integration with the [pins](https://pins.rstudio.com/)
package. The `connections` package allows you to pin database connections and
[dplyr](https://dplyr.tidyverse.org/) table objects.

## Installation

Install the development version from [GitHub](https://github.com/) with:

``` r
# install.packages("remotes")
remotes::install_github("rstudio/connections")
```
```{r setup, include = FALSE}
library(connections)
library(RSQLite)
library(dplyr)
library(pins)

if (file.exists("local.sqlite")) unlink("local.sqlite")
```

## Functions

The two main functions added by `connections` are:

- `connection_open()` - Opens the database connection. Use instead of
`dbConnect()`, but use the exact same arguments. It also automatically starts
the Connections pane.
- `connection_close()` - Closes the database connection.

```{r}
library(connections)
library(RSQLite)

con <- connection_open(SQLite(), "local.sqlite")
```

`r add_figure("man/figures/connection-1.png")`

The connection can now be closed by using the appropriate button in the
Connections pane, or by using `connection_close()`

```{r}
connection_close(con)
```

`r add_figure("man/figures/connection-2.png")`

The connection code is parsed when connecting to the database, and it is
visible once the connection is closed.

## Uploading and referencing tables with `dplyr`

`connections` integrates with `dplyr` by supporting the following two functions:

- `tbl()` - To create a pointer to a table or view within the database.
- `copy_to()` - To copy data from the R session to the database.

The version of `copy_to()` inside `connections` automatically updates the
Connections pane, so the new table automatically shows up.

```{r}
con <- connection_open(SQLite(), "local.sqlite")

copy_to(con, mtcars, temporary = FALSE, overwrite = TRUE)
```

To use an existing table inside the database use `tbl()`.

```{r}
db_mtcars <- tbl(con, "mtcars")
```

`r add_figure("man/figures/pane-1.png")`

The `tbl()` function opens the rest of the already available `dplyr`
database integration.

```{r}
db_mtcars %>%
group_by(am) %>%
summarise(avg_mpg = mean(mpg, na.rm = TRUE))
```

## `pins`

The `connections` package integrates with `pins`. It adds the ability to "pin"
database connections and queries. It follows the same approach as the
[vetiver](https://rstudio.github.io/vetiver-r/reference/vetiver_pin_write.html)
package. `connections` now has two new functions:

- `connection_pin_write()`
- `connection_pin_read()`

### Pin a database connection

The `connection_pin_write()` function does **not** save the R object. It records
the code necessary to recreate the connection.

```{r}
library(pins)
board <- board_folder("~/pins")

connection_pin_write(board, con, name = "my_conn")
```

`r add_figure("man/figures/pins-1.png")`

If you wish to see the code that `connections` will use when recreating the
conneciton from the pin, you can use `connection_code()`:

```{r}
connection_code(con)
```

`connection_pin_read()` will replay the exact same code used to initially
connect to the database. Assign the output to a variable, such as
`con1`. The variable will work just like any connection variable.

```{r}
con1 <- connection_pin_read(board, "my_conn")
```

The `con1` variable is now a regular database connection variable.

```{r}
db_mtcars <- tbl(con1, "mtcars") %>%
group_by(am) %>%
summarise(avg_mpg = mean(mpg, na.rm = TRUE))

db_mtcars
```

### Pin a `dplyr` database query

When `dplyr` works with database data, the resulting query is not executed until
the data is explicitly collected into R, or when printing the top results to the
R Console. The `pin` records two things:

- The `dplyr` R object that contains all of the transformations. **It does not
save the actual results**.

- The necessary information to recreate the database connection. This is to make
sure that the data is being retrieved from the original database connection.

```{r}
connection_pin_write(board, db_mtcars, name = "avg_mpg")
```

`r add_figure("man/figures/pins-2.png")`

`connection_pin_read()` will connect to the database, and return the `dplyr` object.
Without assigning it to a variable, the pin will immediately print the results of the
database. Those results are being processed at the time `connection_pin_read()` runs.

```{r}
connection_pin_read(board, "avg_mpg")
```

### Full `pins` example

The way `pins` integrates with databases, via the `connections` package, allows
to open the connection from a pin, and pipe all of the subsequent code into a
new pin. Afterwards, that pin can be used to collect or to continue using the
`dplyr` object.

```{r}
board <- board_folder("~/pins")

con <- connection_pin_read(board, "my_conn")

tbl_summary <- con %>%
tbl("mtcars") %>%
group_by(cyl) %>%
summarise(avg_mpg = mean(mpg, na.rm = TRUE))

connection_pin_write(board, tbl_summary, name = "cyl_mpg")

connection_close(con)

connection_pin_read(board, "cyl_mpg")
```

`r add_figure("man/figures/pins-3.png")`

## Back-end examples

There are a couple of examples of how the Connections pane will look when
opening the connection via `connections`.

### BigQuery, via `bigrquery`

```{r, eval = FALSE}
library(connections)
library(bigrquery)

con <- connection_open(
bigquery(),
project = "bigquery-public-data",
dataset = "austin_311",
billing = "my_project_billing",
use_legacy_sql = FALSE
)
```

`r add_figure("man/figures/bigquery-1.png")`

```{r, eval = FALSE}
connection_close(con)
```

`r add_figure("man/figures/bigquery-2.png")`

### PostgreSQL, via `RPostgres`

```{r, eval = FALSE}
library(connections)
library(RPostgres)
con <- connection_open(
Postgres(),
host = "localhost",
dbname = "datawarehouse",
user = "[user id]",
password = "[password]",
bigint = "integer",
port = "5432"
)
```

`r add_figure("man/figures/postgres-1.png")`

## `DBI` connections

It is possible to integrate `DBI` connections not opened via `connection_open()`.
To do that, use `connection_view()` and pass it the variable containing the
existing database connection.

```{r, eval = FALSE}
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

connection_view(con)
```

`r add_figure("man/figures/dbi-1.png", 200)`

Changes to the database will not automatically load in the Connections pane.
The `connection_update()` function will refresh the pane with the latest.

```{r, eval = FALSE}
dbWriteTable(con, "mtcars", mtcars)

connection_update(con)
```

`r add_figure("man/figures/dbi-2.png", 300)`

```{r, eval = FALSE}
connection_close(con)
```

`r add_figure("man/figures/dbi-3.png", 300)`