https://github.com/nathaneastwood/sparkts

sparklyr interface to the spark-ts package

Host: GitHub
URL: https://github.com/nathaneastwood/sparkts
Owner: nathaneastwood
License: apache-2.0
Created: 2018-02-26T14:05:30.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2018-03-16T13:12:45.000Z (over 7 years ago)
Last Synced: 2024-11-18T05:38:39.909Z (8 months ago)
Topics: r, sparklyr
Language: R
Homepage:
Size: 48.8 MB
Stars: 4
Watchers: 3
Forks: 3
Open Issues: 1
Metadata Files:
- Readme: README.Rmd
- License: LICENSE

Awesome Lists containing this project

awesome-sparklyr - sparkts: sparklyr interface to the spark-ts package

README

---
output: github_document
---

```{r, echo = FALSE, message = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#",
  fig.path = "tools/images/README-"
)
library(sparkts)
```

# sparkts

[![Project Status: Active - The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)

[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/sparkts)](http://cran.r-project.org/package=sparkts)

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

The goal of `sparkts` is to provide a test bed of `sparklyr` extensions for the [`spark-ts`](https://github.com/srussell91/SparkTS) framework, which was modified from the [`spark-timeseries`](https://github.com/sryza/spark-timeseries) framework.

## Installation

You can install `sparkts` from GitHub with:

```{r installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("nathaneastwood/sparkts")
```

For details on setting up an environment for further development of the package, please see the development vignette.
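As a rough sketch, the installed vignettes can be listed from R. This assumes the vignettes were built at install time, which `devtools::install_github()` does not do by default; the `build_vignettes = TRUE` argument shown in the comment is the usual way to request it. The variable names here are illustrative only.

```r
# Vignettes are only available if they were built at install time, e.g.
# devtools::install_github("nathaneastwood/sparkts", build_vignettes = TRUE)

# Check that the package is installed before asking for its vignettes
has_sparkts <- requireNamespace("sparkts", quietly = TRUE)

if (has_sparkts) {
  # List the names and titles of the installed vignettes
  vignettes <- utils::vignette(package = "sparkts")
  print(vignettes$results[, c("Item", "Title"), drop = FALSE])
} else {
  message("sparkts is not installed, so no vignettes can be listed.")
}
```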

## Example

This basic example shows how to calculate the standard error for some time series data:

```{r example, cache = TRUE, message = FALSE}
library(sparkts)
# Also attach sparklyr so that spark_read_json(), spark_dataframe(),
# spark_disconnect() and %>% are available
library(sparklyr)

# Set up a spark connection
sc <- sparklyr::spark_connect(
  master = "local",
  version = "2.2.0",
  config = list(sparklyr.gateway.address = "127.0.0.1")
)

# Read the example data shipped with the package
std_data <- spark_read_json(
  sc,
  "std_data",
  path = system.file(
    "data_raw/StandardErrorDataIn.json",
    package = "sparkts"
  )
) %>%
  spark_dataframe()

# Call the method
p <- sdf_standard_error(
  sc = sc, data = std_data,
  x_col = "xColumn", y_col = "yColumn", z_col = "zColumn",
  new_column_name = "StandardError"
)
p %>% dplyr::collect()

# Disconnect from the spark connection
spark_disconnect(sc = sc)
```
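The example relies on `system.file()` to locate the input data, and that function returns an empty string rather than raising an error when the package or file is missing. A minimal sketch of a check that could be run before starting Spark is shown below; the `json_path` variable and the messages are illustrative, not part of `sparkts`.

```r
# Locate the example input file used in the README example.
# system.file() returns "" when the package or the file is not installed.
json_path <- system.file(
  "data_raw/StandardErrorDataIn.json",
  package = "sparkts"
)

if (!nzchar(json_path)) {
  message("Example data not found; is sparkts installed?")
} else {
  # Peek at the first few raw lines without starting Spark
  print(readLines(json_path, n = 3L, warn = FALSE))
}
```

Checking the path first means a missing file is reported directly, rather than surfacing later as a read failure inside `spark_read_json()`.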
