https://github.com/fsanaulla/chronicler-spark

InfluxDB connector to Apache Spark on top of Chronicler

Host: GitHub
URL: https://github.com/fsanaulla/chronicler-spark
Owner: fsanaulla
License: apache-2.0
Created: 2018-06-22T14:57:10.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2024-07-29T15:14:56.000Z (6 months ago)
Last Synced: 2024-10-15T18:41:04.411Z (4 months ago)
Topics: chronicler, dataframe, influxdb, rdd, scala, spark, streaming
Language: Scala
Homepage:
Size: 243 KB
Stars: 27
Watchers: 3
Forks: 4
Open Issues: 15
Metadata Files:
- Readme: README.md
- Changelog: changelog/0.2.9.md
- Funding: .github/FUNDING.yml
- License: LICENSE

README

# chronicler-spark

[![Scala CI](https://github.com/fsanaulla/spark-http-rdd/actions/workflows/scala.yml/badge.svg)](https://github.com/fsanaulla/chronicler-spark/actions/workflows/scala.yml)

[![Maven Central](https://maven-badges.herokuapp.com/maven-central/com.github.fsanaulla/chronicler-spark-core_2.12/badge.svg)](https://maven-badges.herokuapp.com/maven-central/com.github.fsanaulla/chronicler-spark-core_2.12)

[![Scala Steward badge](https://img.shields.io/badge/Scala_Steward-helping-blue.svg?style=flat&logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAA4AAAAQCAMAAAARSr4IAAAAVFBMVEUAAACHjojlOy5NWlrKzcYRKjGFjIbp293YycuLa3pYY2LSqql4f3pCUFTgSjNodYRmcXUsPD/NTTbjRS+2jomhgnzNc223cGvZS0HaSD0XLjbaSjElhIr+AAAAAXRSTlMAQObYZgAAAHlJREFUCNdNyosOwyAIhWHAQS1Vt7a77/3fcxxdmv0xwmckutAR1nkm4ggbyEcg/wWmlGLDAA3oL50xi6fk5ffZ3E2E3QfZDCcCN2YtbEWZt+Drc6u6rlqv7Uk0LdKqqr5rk2UCRXOk0vmQKGfc94nOJyQjouF9H/wCc9gECEYfONoAAAAASUVORK5CYII=)](https://scala-steward.org)

Open-source [InfluxDB](https://www.influxdata.com/) connector for [Apache Spark](https://spark.apache.org/index.html) on top of [Chronicler](https://github.com/fsanaulla/chronicler).

## Get Started

First, add the module you need to your `build.sbt`:

```
// Replace <version> with the release you want to use.

// For RDD
libraryDependencies += "com.github.fsanaulla" %% "chronicler-spark-rdd" % "<version>"

// For Dataset
libraryDependencies += "com.github.fsanaulla" %% "chronicler-spark-ds" % "<version>"

// For Structured Streaming
libraryDependencies += "com.github.fsanaulla" %% "chronicler-spark-structured-streaming" % "<version>"

// For DStream
libraryDependencies += "com.github.fsanaulla" %% "chronicler-spark-streaming" % "<version>"
```

## Usage

Default configuration: 

```
final case class InfluxConfig(
    host: String,
    port: Int = 8086,
    credentials: Option[InfluxCredentials] = None,
    compress: Boolean = false,
    ssl: Boolean = false)
```

Enabling compression (`compress = true`) is recommended to reduce network traffic.
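As a minimal sketch of overriding these defaults, the snippet below builds one config with compression enabled and derives a second one with credentials and SSL. To stay self-contained it redeclares `InfluxConfig` exactly as listed above and uses a hypothetical two-field `InfluxCredentials` stand-in. In a real project both types come from the Chronicler libraries, and the actual shape of `InfluxCredentials` may differ. The host, user name and password are made-up values.

```scala
// Stand-in definitions so the snippet compiles on its own.
// Assumption: the real InfluxCredentials type may look different.
final case class InfluxCredentials(username: String, password: String)

// Mirrors the default configuration shown in this README.
final case class InfluxConfig(
    host: String,
    port: Int = 8086,
    credentials: Option[InfluxCredentials] = None,
    compress: Boolean = false,
    ssl: Boolean = false)

object InfluxConfigExample {
  // Only `host` is mandatory; compression is switched on explicitly.
  val compressed: InfluxConfig = InfluxConfig("localhost", compress = true)

  // A secured variant derived from the first one via `copy`.
  val secured: InfluxConfig = compressed.copy(
    credentials = Some(InfluxCredentials("admin", "secret")),
    ssl = true)

  def main(args: Array[String]): Unit = {
    println(compressed)
    println(secured)
  }
}
```

Named arguments keep the call readable when only one or two defaults change, and `copy` avoids repeating the settings that stay the same.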

For `RDD[T]`:

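```
import com.github.fsanaulla.chronicler.spark.rdd._
import org.apache.spark.rdd.RDD

// placeholder: substitute your own RDD
val rdd: RDD[T] = ???

// to save into a fixed measurement
rdd.saveToInfluxDBMeas("dbName", "measurementName")

// to save with a dynamically generated measurement
rdd.saveToInfluxDB("dbName")
```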

For `Dataset[T]`:

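```
import com.github.fsanaulla.chronicler.spark.ds._
import org.apache.spark.sql.Dataset

// placeholder: substitute your own Dataset
val ds: Dataset[T] = ???

// to save into a fixed measurement
ds.saveToInfluxDBMeas("dbName", "measurementName")

// to save with a dynamically generated measurement
ds.saveToInfluxDB("dbName")
```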

For `DataStreamWriter[T]`:

```
import com.github.fsanaulla.chronicler.spark.structured.streaming._
import org.apache.spark.sql.streaming.DataStreamWriter

// placeholder: substitute your own DataStreamWriter
val structStream: DataStreamWriter[T] = ???

// to save into a fixed measurement
val saved = structStream.saveToInfluxDBMeas("dbName", "measurementName")

// or, to save with a dynamically generated measurement:
// val saved = structStream.saveToInfluxDB("dbName")

// ...

saved.start().awaitTermination()
```

For `DStream[T]`:

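```
import com.github.fsanaulla.chronicler.spark.streaming._
import org.apache.spark.streaming.dstream.DStream

// placeholder: substitute your own DStream
val stream: DStream[T] = ???

// to save into a fixed measurement
stream.saveToInfluxDBMeas("dbName", "measurementName")

// to save with a dynamically generated measurement
stream.saveToInfluxDB("dbName")
```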