---
output: github_document
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

if (bigrquery:::has_internal_auth()) {
  bigrquery:::bq_auth_internal()
} else {
  knitr::opts_chunk$set(eval = FALSE)
}
```
# bigrquery

[![CRAN Status](https://www.r-pkg.org/badges/version/bigrquery)](https://cran.r-project.org/package=bigrquery)
[![R-CMD-check](https://github.com/r-dbi/bigrquery/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/r-dbi/bigrquery/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/r-dbi/bigrquery/branch/main/graph/badge.svg)](https://app.codecov.io/gh/r-dbi/bigrquery?branch=main)

The bigrquery package makes it easy to work with data stored in
[Google BigQuery](https://cloud.google.com/bigquery/docs) by allowing you to query BigQuery tables and retrieve metadata about your projects, datasets, tables, and jobs. The bigrquery package provides three levels of abstraction on top of BigQuery:

* The low-level API provides thin wrappers over the underlying REST API. All
  the low-level functions start with `bq_`, and mostly have the form
  `bq_noun_verb()` (see the sketch just after this list). This level of
  abstraction is most appropriate if you're familiar with the REST API and you
  want to do something not supported in the higher-level APIs.

* The [DBI interface](https://r-dbi.org) wraps the low-level API and
  makes working with BigQuery like working with any other database system.
  This is the most convenient layer if you want to execute SQL queries in
  BigQuery or upload smaller amounts (i.e. <100 MB) of data.

* The [dplyr interface](https://dbplyr.tidyverse.org/) lets you treat BigQuery
  tables as if they are in-memory data frames. This is the most convenient
  layer if you don't want to write SQL, but instead want dbplyr to write it
  for you.
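
To give a flavour of the `bq_noun_verb()` naming scheme, here's a minimal
sketch: each function pairs a noun (the BigQuery object a reference points at)
with a verb (the operation performed on it). The object names are
illustrative.

```{r eval = FALSE}
library(bigrquery)

ds <- bq_dataset("publicdata", "samples")  # reference to a dataset
bq_dataset_exists(ds)                      # noun: dataset, verb: exists

tb <- bq_table("publicdata", "samples", "natality")  # reference to a table
bq_table_nrow(tb)                          # noun: table, verb: nrow
```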

## Installation

The current bigrquery release can be installed from CRAN:

```{r eval = FALSE}
install.packages("bigrquery")
```

The development version can be installed from GitHub:

```{r eval = FALSE}
# install.packages("pak")
pak::pak("r-dbi/bigrquery")
```

## Usage

### Low-level API

```{r}
library(bigrquery)
billing <- bq_test_project() # replace this with your project ID
sql <- "SELECT year, month, day, weight_pounds FROM `publicdata.samples.natality`"

tb <- bq_project_query(billing, sql)
bq_table_download(tb, n_max = 10)
```

### DBI

```{r, warning = FALSE}
library(DBI)

con <- dbConnect(
  bigrquery::bigquery(),
  project = "publicdata",
  dataset = "samples",
  billing = billing
)
con

dbListTables(con)

dbGetQuery(con, sql, n = 10)
```
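
As noted above, the DBI layer is also a convenient way to upload smaller
amounts of data. A minimal sketch, assuming a project and dataset you can
write to (`"my-project-id"` and `"my_dataset"` are placeholders):

```{r eval = FALSE}
con2 <- dbConnect(
  bigrquery::bigquery(),
  project = "my-project-id",  # placeholder: a project you own
  dataset = "my_dataset"      # placeholder: a dataset you can write to
)

dbWriteTable(con2, "mtcars", mtcars)  # upload a small data frame
head(dbReadTable(con2, "mtcars"))
```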

### dplyr

```{r, message = FALSE}
library(dplyr)

natality <- tbl(con, "natality")

natality %>%
  select(year, month, day, weight_pounds) %>%
  head(10) %>%
  collect()
```
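
If you're curious about the SQL that dbplyr writes on your behalf, call
`show_query()` instead of `collect()` to print the generated query without
running it:

```{r eval = FALSE}
natality %>%
  select(year, month, day, weight_pounds) %>%
  head(10) %>%
  show_query()
```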

## Important details

### BigQuery account

To use bigrquery, you'll need a BigQuery project. Fortunately, if you just want to play around with the BigQuery API, it's easy to start with Google's free [public data](https://cloud.google.com/bigquery/public-data) and the [BigQuery sandbox](https://cloud.google.com/bigquery/docs/sandbox). This gives you some fun data to play with along with enough free compute (1 TB of queries & 10 GB of storage per month) to learn the ropes.

To get started, open the [Google Cloud console](https://console.cloud.google.com/) and create a project. Make a note of the "Project ID" as you'll use this as the `billing` project whenever you work with free sample data, and as the `project` when you work with your own data.
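
Once you have the Project ID, wiring it in is just a matter of passing it as
`billing`. A minimal sketch, where `"my-project-id"` is a placeholder for your
own ID:

```{r eval = FALSE}
billing <- "my-project-id"  # placeholder: your own Project ID
tb <- bq_project_query(billing, "SELECT 1 AS one")
bq_table_download(tb)
```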

### Authentication and authorization

When using bigrquery interactively, you'll be prompted to [authorize bigrquery](https://cloud.google.com/bigquery/docs/authorization) in the browser. You'll be asked if you want to cache tokens for reuse in future sessions. For non-interactive usage, we recommend using a service account token, if possible; see the sketch after this list. More places to learn about auth:

* Help for [`bigrquery::bq_auth()`](https://bigrquery.r-dbi.org/reference/bq_auth.html).
* [How gargle gets tokens](https://gargle.r-lib.org/articles/how-gargle-gets-tokens.html).
  - bigrquery obtains a token with `gargle::token_fetch()`, which supports a
    variety of token flows. This article provides full details, such as how to
    take advantage of Application Default Credentials or service accounts on
    GCE VMs.
* [Non-interactive auth](https://gargle.r-lib.org/articles/non-interactive-auth.html).
  Explains how to set up a project when code must run without any user
  interaction.
* [How to get your own API credentials](https://gargle.r-lib.org/articles/get-api-credentials.html).
  Instructions for getting your own OAuth client or service account token.
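
For example, in a non-interactive script you might point bigrquery at a
service account key directly via `bq_auth()`. A sketch, where the path is a
placeholder for wherever you keep your downloaded key:

```{r eval = FALSE}
library(bigrquery)

# placeholder path: a JSON key downloaded from the Google Cloud console
bq_auth(path = "/path/to/your/service-account-key.json")
```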

Note that bigrquery requests permission to modify your data, but it will never do so unless you explicitly request it (e.g. by calling `bq_table_delete()` or `bq_table_upload()`). Our [Privacy policy](https://www.tidyverse.org/google_privacy_policy) provides more info.

## Useful links

* [SQL reference](https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators)
* [API reference](https://cloud.google.com/bigquery/docs/reference/rest)
* [Query/job console](https://console.cloud.google.com/bigquery/)
* [Billing console](https://console.cloud.google.com/)

## Policies

Please note that the 'bigrquery' project is released with a [Contributor Code of Conduct](https://bigrquery.r-dbi.org/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.

[Privacy policy](https://www.tidyverse.org/google_privacy_policy)