https://github.com/daroczig/dbr

Secure database connections and convenient queries from R on the top of DBI
https://github.com/daroczig/dbr

Last synced: 3 months ago
JSON representation

Secure database connections and convenient queries from R on the top of DBI

Host: GitHub
URL: https://github.com/daroczig/dbr
Owner: daroczig
Created: 2018-07-12T03:04:08.000Z (about 7 years ago)
Default Branch: master
Last Pushed: 2020-06-12T22:01:06.000Z (about 5 years ago)
Last Synced: 2025-04-10T06:59:32.701Z (3 months ago)
Language: R
Homepage: https://daroczig.github.io/dbr
Size: 147 KB
Stars: 32
Watchers: 2
Forks: 6
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

jimsghstars - daroczig/dbr - Secure database connections and convenient queries from R on the top of DBI (R)

README

        # dbr: Convenient database connections and queries from R

[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) ![CRAN](https://www.r-pkg.org/badges/version/dbr) [![Build Status](https://travis-ci.org/daroczig/dbr.svg?branch=master)](https://travis-ci.org/daroczig/dbr) [![Code Coverage](https://codecov.io/gh/daroczig/dbr/branch/master/graph/badge.svg)](https://codecov.io/gh/daroczig/dbr)

Vignette coming, until then, please check the talk presented at the useR! 2018 conference:

* [slides](http://bit.ly/user2018-dbr)

* [video](https://www.youtube.com/watch?v=z7x4UOHNguY)

## Setting up a config file for the database connections

To be able to connect to a database, the connection parameters are to be specified in a YAML file, for example for a SQLite database to be created in a temp file:

```yaml

sqlite:

  drv: !expr RSQLite::SQLite()

  dbname: !expr tempfile()

```

By default, `dbr` will look for a file named `db_config.yaml` in the current working directory, that can be override via the `dbr.db_config_path` global option, eg to the example config bundled in this package:

```r

options(dbr.db_config_path = system.file('example_db_config.yaml', package = 'dbr'))

```

A more complex example from the demo YAML file describing a MySQL connection to a database hosted by RStudio (with public username and password):

```yaml

shinydemo:

  drv: !expr RMySQL::MySQL()

  host: shiny-demo.csa7qlmguqrf.us-east-1.rds.amazonaws.com

  username: guest

  password: guest

  dbname: shinydemo

```

Note, that instead of simple strings, you can also specify KMS-encrypted passwords, other secrets and parameters as well, eg:

```yaml

redshift:

  host: !aws_kms |

    KMSencryptedciphertext...

  port: 5439

  dbname: dbname

  user: username

  drv: !expr RPostgreSQL::PostgreSQL()

  password: !aws_kms |

    KMSencryptedciphertext...

  s3_copy_bucket: !attr |-

    's3://openmail-model/temp'

  s3_copy_iam_role: !attr |-

    arn:aws:iam::accountid:role/redshift_role

```

## Querying databases

Once the connection parameters are loaded from a config file, making SQL queries are as easy as specifying the SQL statement and the name of the connection:

```r

db_query('show tables', 'shinydemo')

#> INFO [2019-01-06 01:06:18] Connecting to shinydemo

#> INFO [2019-01-06 01:06:19] Executing:**********

#> INFO [2019-01-06 01:06:19] show tables

#> INFO [2019-01-06 01:06:19] ********************

#> INFO [2019-01-06 01:06:19] Finished in 0.1336 secs returning 3 rows

#> INFO [2019-01-06 01:06:19] Closing connection to shinydemo

#>   Tables_in_shinydemo

#> 1                City

#> 2             Country

#> 3     CountryLanguage

```

For more advanced usage, eg caching database connections, check `?db_connect` and the above mentioned vignette.

## SQL templating

To reuse SQL chunks, you may list your SQL queries (or parts of it) in a structured YAML file, like in the bundled example config at [`example_sql_chunks.yaml`](https://github.com/daroczig/dbr/blob/master/inst/example_sql_chunks.yaml)

Use `sql_chunk_files` to list or update the currently used SQL template YAML file(s), eg via

```r

sql_chunk_files(system.file('example_sql_chunks.yaml', package = 'dbr'))

```

Then you may refer to any key in that definition by a string that consist of the keys in hierarchy separated by a dot, so looking at the below definition (part of [`example_sql_chunks.yaml`](https://github.com/daroczig/dbr/blob/master/inst/example_sql_chunks.yaml)):

```yaml

dbr:

  shinydemo:

    countries:

      count: SELECT COUNT(*) FROM Country

```

Getting the `count` key from for the `countries` item in `dbr`'s `shinydemo` section, you could do something like:

```r

sql_chunk('dbr.shinydemo.countries.count')

#> SELECT COUNT(*) FROM Country

```

And pass it right away to `db_query`:

```r

countries <- db_query(sql_chunk('dbr.shinydemo.countries.count'), 'shinydemo')

#> INFO [2019-01-06 01:33:33] Connecting to shinydemo

#> INFO [2019-01-06 01:33:34] Executing:**********

#> INFO [2019-01-06 01:33:34] SELECT COUNT(*) FROM Country

#> INFO [2019-01-06 01:33:34] ********************

#> INFO [2019-01-06 01:33:34] Finished in 0.1291 secs returning 1 rows

#> INFO [2019-01-06 01:33:34] Closing connection to shinydemo

```

SQL chunks can be also defined in files outside of the YAML with the `sql` file extensions, and referenced with the `!include` tag in the YAML file, eg:

```yaml

dbr:

  shinydemo:

    countries:

      europe: !include europe.sql

```

This will read the content of [`europe.sql`](https://github.com/daroczig/dbr/blob/master/inst/europe.sql) and make it available as `sql_chunk('dbr.shinydemo.countries.count')`.

Besides files, a folder with `sql` files can be also included -- in that case, the base filename (without the `sql` file extension) will become the key under the given key. For example, consider this YAML definition:

```yaml

cities: !include cities.sql

```

Will load all the files from the [`cities.sql`](https://github.com/daroczig/dbr/tree/master/inst/cities.sql) folder and make those available under `europe`, so resulting in an intermediate YAML as:

```

cities: !include cities.sql

  europe: |-

    SELECT Name

    FROM City

    WHERE CountryCode IN (

      {sql_chunk('dbr.shinydemo.countries.europe', indent_after_linebreak = 2)})

  europe_large: |-

    SELECT Name

    FROM City

    WHERE

      Population > 1000000 AND

      Name IN (

        {sql_chunk('dbr.shinydemo.cities.europe', indent_after_linebreak = 4)}))

```

If the key of a directory `!include` is `~!`, then the keys are made available in the parent node, so eg

```

cities:

  ~!: !include cities.sql

```

Would not actually create the `cities` key, but only the `europe` and `europe_large` keys in the root node.

As you can see from the above, the main power of this templating approach is that you can easily reuse SQL chunks, eg for the list of European countries in:

```r

cities <- db_query(sql_chunk('dbr.shinydemo.cities.europe'), 'shinydemo')

#> INFO [2019-01-06 01:32:02] Connecting to shinydemo

#> INFO [2019-01-06 01:32:02] Executing:**********

#> INFO [2019-01-06 01:32:02] SELECT Name

#> FROM City

#> WHERE CountryCode IN (

#>   SELECT Code

#>   FROM Country

#>   WHERE Continent = 'Europe')

#> INFO [2019-01-06 01:32:02] ********************

#> INFO [2019-01-06 01:32:02] Finished in 0.1225 secs returning 643 rows

#> INFO [2019-01-06 01:32:02] Closing connection to shinydemo

```

Where the `Country`-related subquery was specified in the `dbr.shinydemo.countries.europe` key as per:

```sql

SELECT Name

FROM City

WHERE CountryCode IN (

  {sql_chunk('dbr.shinydemo.countries.europe', indent_after_linebreak = 2)})

```

The `indent_after_linebreak` parameter is just for cosmetic updates in the query to align `FROM` and `WHERE` on the same character in the SQL statement.

Even more complex / nested example:

```sql

sql_chunk('dbr.shinydemo.cities.europe_large')

#> SELECT Name

#> FROM City

#> WHERE

#>   Population > 1000000 AND

#>   Name IN (

#>     SELECT Name

#>     FROM City

#>     WHERE CountryCode IN (

#>       SELECT Code

#>       FROM Country

#>       WHERE Continent = 'Europe')))

```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/daroczig/dbr

Awesome Lists containing this project

README