Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/fsanaulla/chronicler-spark
InfluxDB connector to Apache Spark on top of Chronicler
chronicler dataframe influxdb rdd scala spark streaming
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/fsanaulla/chronicler-spark
- Owner: fsanaulla
- License: apache-2.0
- Created: 2018-06-22T14:57:10.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2024-07-29T15:14:56.000Z (5 months ago)
- Last Synced: 2024-10-15T18:41:04.411Z (3 months ago)
- Topics: chronicler, dataframe, influxdb, rdd, scala, spark, streaming
- Language: Scala
- Homepage:
- Size: 243 KB
- Stars: 27
- Watchers: 3
- Forks: 4
- Open Issues: 15
Metadata Files:
- Readme: README.md
- Changelog: changelog/0.2.9.md
- Funding: .github/FUNDING.yml
- License: LICENSE
Awesome Lists containing this project
README
# chronicler-spark
[![Scala CI](https://github.com/fsanaulla/chronicler-spark/actions/workflows/scala.yml/badge.svg)](https://github.com/fsanaulla/chronicler-spark/actions/workflows/scala.yml)
[![Maven Central](https://img.shields.io/maven-central/v/com.github.fsanaulla/chronicler-spark-core_2.12)](https://central.sonatype.com/artifact/com.github.fsanaulla/chronicler-spark-core_2.12)
[![Scala Steward badge](https://img.shields.io/badge/Scala_Steward-helping-blue.svg?style=flat&logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAA4AAAAQCAMAAAARSr4IAAAAVFBMVEUAAACHjojlOy5NWlrKzcYRKjGFjIbp293YycuLa3pYY2LSqql4f3pCUFTgSjNodYRmcXUsPD/NTTbjRS+2jomhgnzNc223cGvZS0HaSD0XLjbaSjElhIr+AAAAAXRSTlMAQObYZgAAAHlJREFUCNdNyosOwyAIhWHAQS1Vt7a77/3fcxxdmv0xwmckutAR1nkm4ggbyEcg/wWmlGLDAA3oL50xi6fk5ffZ3E2E3QfZDCcCN2YtbEWZt+Drc6u6rlqv7Uk0LdKqqr5rk2UCRXOk0vmQKGfc94nOJyQjouF9H/wCc9gECEYfONoAAAAASUVORK5CYII=)](https://scala-steward.org)

Open-source [InfluxDB](https://www.influxdata.com/) connector for [Apache Spark](https://spark.apache.org/index.html) on top of [Chronicler](https://github.com/fsanaulla/chronicler).
## Get Started
To get started, add the required module(s) to your `build.sbt`:
```
// For RDD
libraryDependencies += "com.github.fsanaulla" %% "chronicler-spark-rdd" % <version>

// For Dataset
libraryDependencies += "com.github.fsanaulla" %% "chronicler-spark-ds" % <version>

// For Structured Streaming
libraryDependencies += "com.github.fsanaulla" %% "chronicler-spark-structured-streaming" % <version>

// For DStream
libraryDependencies += "com.github.fsanaulla" %% "chronicler-spark-streaming" % <version>
```
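If you pull in more than one module, a shared version value keeps them in lockstep. A minimal sketch, where `chroniclerSparkVersion` is a hypothetical name and `"x.y.z"` is a placeholder (check the Maven Central badge above for the current release):
```
// Sketch: pin every chronicler-spark module to one version.
// "x.y.z" is a placeholder; look up the real version on Maven Central.
val chroniclerSparkVersion = "x.y.z"

libraryDependencies ++= Seq(
  "com.github.fsanaulla" %% "chronicler-spark-rdd"                  % chroniclerSparkVersion,
  "com.github.fsanaulla" %% "chronicler-spark-ds"                   % chroniclerSparkVersion,
  "com.github.fsanaulla" %% "chronicler-spark-structured-streaming" % chroniclerSparkVersion,
  "com.github.fsanaulla" %% "chronicler-spark-streaming"            % chroniclerSparkVersion
)
```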
## Usage
Default configuration:
```
final case class InfluxConfig(
    host: String,
    port: Int = 8086,
    credentials: Option[InfluxCredentials] = None,
    compress: Boolean = false,
    ssl: Boolean = false)
```
It's recommended to enable data compression to decrease network traffic.
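For example, a config with compression enabled might be built like this. A sketch only: host and credentials are placeholder values, and the exact shape of `InfluxCredentials` should be checked against the Chronicler version in use:
```
// Sketch: an InfluxConfig with compression turned on.
// Placeholder host/credentials; verify InfluxCredentials' constructor for your Chronicler version.
implicit val influxConf: InfluxConfig = InfluxConfig(
  host = "localhost",
  port = 8086,
  credentials = Some(InfluxCredentials("admin", "password")),
  compress = true, // recommended above: reduces network traffic
  ssl = false
)
```
Declaring it `implicit` is the usual Scala pattern so the `saveToInfluxDB*` calls below can resolve it without it being threaded through every call.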
For `RDD[T]`:
```
import com.github.fsanaulla.chronicler.spark.rdd._

val rdd: RDD[T] = ???

rdd.saveToInfluxDBMeas("dbName", "measurementName")

// to save with a dynamically generated measurement
rdd.saveToInfluxDB("dbName")
```
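Put together, an end-to-end sketch might look as follows. `Temperature` is a hypothetical point type, and the snippet assumes an implicit `InfluxConfig` (like the one above) plus an implicit Chronicler `InfluxWriter[Temperature]` (the typeclass that serializes records to InfluxDB line protocol) are in scope:
```
// Sketch: save a small RDD of a hypothetical Temperature type.
// Assumes implicit InfluxConfig and InfluxWriter[Temperature] are in scope.
import com.github.fsanaulla.chronicler.spark.rdd._
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

final case class Temperature(city: String, value: Double, ts: Long)

val spark = SparkSession.builder().master("local[*]").appName("influx-rdd").getOrCreate()

val rdd: RDD[Temperature] = spark.sparkContext.parallelize(
  Seq(Temperature("kyiv", 21.5, System.currentTimeMillis()))
)

rdd.saveToInfluxDBMeas("weather", "temperature") // writes into the "temperature" measurement
```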
For `Dataset[T]`:
```
import com.github.fsanaulla.chronicler.spark.ds._

val ds: Dataset[T] = ???

ds.saveToInfluxDBMeas("dbName", "measurementName")

// to save with a dynamically generated measurement
ds.saveToInfluxDB("dbName")
```
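The Dataset flavor differs only in how the data is constructed. A sketch, reusing the hypothetical `Temperature` type and implicits from the RDD example, with `spark.implicits._` supplying the Encoder:
```
// Sketch: save a Dataset of the hypothetical Temperature type.
import com.github.fsanaulla.chronicler.spark.ds._
import org.apache.spark.sql.{Dataset, SparkSession}

val spark = SparkSession.builder().master("local[*]").appName("influx-ds").getOrCreate()
import spark.implicits._ // derives the Encoder for the case class

val ds: Dataset[Temperature] = Seq(
  Temperature("kyiv", 21.5, System.currentTimeMillis())
).toDS()

ds.saveToInfluxDBMeas("weather", "temperature")
```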
For `DataStreamWriter[T]`:
```
import com.github.fsanaulla.chronicler.spark.structured.streaming._

val structStream: DataStreamWriter[T] = ???

val saved = structStream.saveToInfluxDBMeas("dbName", "measurementName")

// or, to save with a dynamically generated measurement:
// val saved = structStream.saveToInfluxDB("dbName")

// ...
saved.start().awaitTermination()
```
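As an illustration, a pipeline around this could use Spark's built-in `rate` test source. A sketch only; it still assumes the implicit `InfluxConfig` and a Chronicler `InfluxWriter` for the row type are in scope:
```
// Sketch: stream rows from the built-in "rate" test source into InfluxDB.
import com.github.fsanaulla.chronicler.spark.structured.streaming._
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.streaming.DataStreamWriter

val spark = SparkSession.builder().master("local[*]").appName("influx-structured").getOrCreate()

// The "rate" source emits (timestamp, value) rows; handy for smoke-testing the sink.
val writer: DataStreamWriter[Row] = spark.readStream
  .format("rate")
  .option("rowsPerSecond", "5")
  .load()
  .writeStream

writer
  .saveToInfluxDBMeas("dbName", "metrics")
  .start()
  .awaitTermination()
```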
For `DStream[T]`:
```
import com.github.fsanaulla.chronicler.spark.streaming._

val stream: DStream[T] = ???

stream.saveToInfluxDBMeas("dbName", "measurementName")

// to save with a dynamically generated measurement
stream.saveToInfluxDB("dbName")
```
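For context, a sketch of the surrounding DStream lifecycle, using a queue-based test stream and the hypothetical `Temperature` type and implicits from the RDD example above:
```
// Sketch: DStream lifecycle around the save call; queueStream is a local test source.
import com.github.fsanaulla.chronicler.spark.streaming._
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable

val conf = new SparkConf().setMaster("local[*]").setAppName("influx-dstream")
val ssc  = new StreamingContext(conf, Seconds(5))

val stream = ssc.queueStream(mutable.Queue(
  ssc.sparkContext.parallelize(Seq(Temperature("kyiv", 21.5, System.currentTimeMillis())))
))

stream.saveToInfluxDBMeas("weather", "temperature")

ssc.start()            // saving happens per micro-batch once the context starts
ssc.awaitTermination()
```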