An open API service indexing awesome lists of open source software.

https://github.com/hrbrmstr/sparrow

Temporary Shorcut For Reading Arrow/Parquet Bits Into R via 'reticulate'
https://github.com/hrbrmstr/sparrow

arrow pandas-dataframe parquet r rstats

Last synced: 8 months ago
JSON representation

Temporary Shorcut For Reading Arrow/Parquet Bits Into R via 'reticulate'

Awesome Lists containing this project

README

          

---
output: rmarkdown::github_document
---

# sparrow

Temporary Shorcut For Reading Arrow/Parquet Bit Into R via 'reticulate'

## Description

Work is being done to make Parquet/Arrow a first-class R citizen
but -- until then -- I don't always want a Drill server round trip just
to read in some data and same goes for firing up a Spark instance (srsly).
So, this is a quick hack until the R packages are done.

## NOTE

**Requires** Python 3.5+, `pyarrow` and `pandas`.

## What's Inside The Tin

The following functions are implemented:

- `read_parquet`: Read in data from Parquet into an R data frame via 'reticulate'

## Installation

```{r eval=FALSE}
devtools::install_github("hrbrmstr/sparrow")
```

```{r message=FALSE, warning=FALSE, error=FALSE, include=FALSE}
options(width=120)
```

## Usage

```{r message=FALSE, warning=FALSE, error=FALSE}
library(sparrow)

# current verison
packageVersion("sparrow")

```

```{r cache=TRUE}
read_parquet("/tmp/honeypot.parquet")
```

```{r cache=TRUE}
read_parquet("/tmp/honeypot.parquet", c("src", "duration"))
```