Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/eugejoh/pgtools

Simple tools for writing tables to PostgreSQL from R
https://github.com/eugejoh/pgtools

Last synced: 8 days ago
JSON representation

Simple tools for writing tables to PostgreSQL from R

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r setup, echo=FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-",
eval = FALSE
)
```

# pgtools
Standardized workflow for writing tables to PostgreSQL. This package contains tools to provide a consistent workflow for
writing `data.frames` existing in an `R` session to a PostgreSQL database connection. The tools are built around the [`DBI`](https://github.com/r-dbi)
and [`RPostgres`](https://github.com/r-dbi/RPostgres) packages.

**What this package provides?**

- Convenient connection to PostgreSQL with credentials
- An automated assignment of PostgreSQL field type variable lengths based on a basic assessment of element lengths
- Schema specification for database writing
- Simple genration of a `SQL` `CREATE TABLE` statements
- Vectorized to accept a `list` of `data.frames` to write to PostgreSQL

This is created for primary use among the Data/Programming team at CGHR to standardize and optimize the data storage and management workflow.

## Installation
```{r}
library(devtools)
devtools::install_github("eugejoh/pgtools")
```

## Database Connection
Best practices with storing and using credentials, steps [**here**](https://db.rstudio.com/best-practices/managing-credentials/#use-environment-variables) should be followed to access the `.REnviron` file. By default `pgtools::connect_pg()` searches your `.REnviron` file to retrieve the appropriate information.

The `.REnviron` file should have the following variables:
```{r}
db_ip='my ip here'

db_user='my username here'

db_pw='my password here'

db_name='my database name here'
```

Those familiar with the [`tidyverse`](https://www.tidyverse.org/) can use the [`usethis`](https://github.com/r-lib/usethis) to easily access and edit the `.REnviron` file. See [this page](https://usethis.r-lib.org/reference/edit.html) for more details.

## Typical Workflow
This section outlines a typical workflow of writing a data frame from a R session to a PostgreSQL database connection.

```{r diagrammer_flowchart, eval = TRUE}

```

Example for single data frame using the `iris` dataset:
```{r}
# Single data frame
data(iris)

# Connect to database
my_conn <- connect_pg(getenv = FALSE,
drv = DBI::dbDriver("Postgres"),
host = "myhostname",
port = 5432,
dbname = "mydb",
user = "myusername",
password = "mypw"
)

# Element lengths
my_nchar <- get_nchar(iris)

# Postgres Field Types
my_fields <- set_pgfields(nchar_df, conn = my_conn)

# Write to Postgres
write_pgtable(
input = iris,
tbl_name = "iris",
field.types = my_fields,
conn = my_conn
)

```

Example for a list of data frames:
```{r}
# Mulitple data frames
data(iris)
data(swiss)
data(mtcars)
data(cars)

# Named list of data frames
my_list <- list(iris, swiss, mtcars, cars)
names(my_list) <- c("iris", "swiss", "mtcars", "cars")

# Connect to database
my_conn <- connect_pg(getenv = FALSE,
drv = DBI::dbDriver("Postgres"),
host = "myhostname",
port = 5432,
dbname = "mydb",
user = "myusername",
password = "mypw"
)

# Element lengths
my_nchar <- get_nchar(my_list)

# Postgres Field Types
my_fields <- set_pgfields(nchar_df, conn = my_conn)

# Write to Postgres
write_pgtable(
input = my_list,
field.types = my_fields,
conn = my_conn
)

```

Example obtaining the `SQL` statement for `CREATE TABLE` with added primary key:
```{r}
data(iris)

# Connect to database
my_conn <- connect_pg(getenv = FALSE,
drv = DBI::dbDriver("Postgres"),
host = "myhostname",
port = 5432,
dbname = "mydb",
user = "myusername",
password = "mypw"
)

# Add ID for Primary Key
iris$id <- seq_along(1:nrow(iris))

# Element lengths
my_nchar <- get_nchar(iris)

# Postgres Field Types
my_fields <- set_pgfields(nchar_df, conn = my_conn)

get_sql_create(my_pg_fields, pkey = "id", tbl_name = "iris")
```

Example of adding comments to the table and fields:
```{r}
data(iris)

my_conn <- connect_pg(getenv = FALSE,
drv = DBI::dbDriver("Postgres"),
host = "myhostname",
port = 5432,
dbname = "mydb",
user = "myusername",
password = "mypw"
)

tab_comment <- paste0("I love the `iris` dataset. ", "added: ", Sys.Date())

my_comments <- c("length of the sepal", "width of the sepal", "length of the petal", "width of the petal", "the type of flower species")
names(my_comments) <- names(iris) #remember to name the input for comments!

write_pgtable(
input = iris,
tbl_name = "iris",
schema = schema_name,
conn = my_conn,
tbl.comments = tab_comment,
field.comments = my_comments,
clean_vars = TRUE
)

```