https://github.com/fsanaulla/chronicler-spark
InfluxDB connector to Apache Spark on top of Chronicler
- Host: GitHub
- URL: https://github.com/fsanaulla/chronicler-spark
- Owner: fsanaulla
- License: apache-2.0
- Created: 2018-06-22T14:57:10.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2024-07-29T15:14:56.000Z (over 1 year ago)
- Last Synced: 2025-04-13T02:18:39.491Z (9 months ago)
- Topics: chronicler, dataframe, influxdb, rdd, scala, spark, streaming
- Language: Scala
- Size: 243 KB
- Stars: 27
- Watchers: 2
- Forks: 4
- Open Issues: 15
Metadata Files:
- Readme: README.md
- Changelog: changelog/0.2.9.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
# chronicler-spark
[Scala CI](https://github.com/fsanaulla/chronicler-spark/actions/workflows/scala.yml)
[Maven Central](https://maven-badges.herokuapp.com/maven-central/com.github.fsanaulla/chronicler-spark-core_2.12)
[Scala Steward](https://scala-steward.org)
Open-source [InfluxDB](https://www.influxdata.com/) connector for [Apache Spark](https://spark.apache.org/index.html) on top of [Chronicler](https://github.com/fsanaulla/chronicler).
## Get Started
First, add the required module(s) to your `build.sbt`:
```
// For RDD
libraryDependencies += "com.github.fsanaulla" %% "chronicler-spark-rdd" % <version>

// For Dataset
libraryDependencies += "com.github.fsanaulla" %% "chronicler-spark-ds" % <version>

// For Structured Streaming
libraryDependencies += "com.github.fsanaulla" %% "chronicler-spark-structured-streaming" % <version>

// For DStream
libraryDependencies += "com.github.fsanaulla" %% "chronicler-spark-streaming" % <version>
```
## Usage
Default configuration:
```
final case class InfluxConfig(
    host: String,
    port: Int = 8086,
    credentials: Option[InfluxCredentials] = None,
    compress: Boolean = false,
    ssl: Boolean = false)
```
Enabling data compression (`compress = true`) is recommended to reduce network traffic.
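For example, a config with credentials and compression enabled might look like the sketch below. The host, the `InfluxCredentials(username, password)` shape, the import path, and marking the value `implicit` (so the save methods can pick it up) are assumptions, not spelled out in this README:
```
// import path may vary across Chronicler versions (assumption)
import com.github.fsanaulla.chronicler.core.model.InfluxCredentials

// hypothetical connection settings, for illustration only
implicit val conf: InfluxConfig = InfluxConfig(
  host = "localhost",
  port = 8086,
  credentials = Some(InfluxCredentials("admin", "secret")),
  compress = true, // recommended: reduces network traffic
  ssl = false
)
```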
For `RDD[T]`:
```
import com.github.fsanaulla.chronicler.spark.rdd._

val rdd: RDD[T] = ???
rdd.saveToInfluxDBMeas("dbName", "measurementName")

// to save with a dynamically generated measurement
rdd.saveToInfluxDB("dbName")
```
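As a fuller sketch: the save methods are assumed to resolve the `InfluxConfig` above and a Chronicler `InfluxWriter[T]` (which serializes each element to InfluxDB line protocol) from implicit scope. The row type `Temperature`, the database/measurement names, and the writer are all illustrative:
```
import com.github.fsanaulla.chronicler.core.model.InfluxWriter
import com.github.fsanaulla.chronicler.spark.rdd._
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// hypothetical row type
final case class Temperature(city: String, value: Double)

// serializes a row to line protocol; the exact InfluxWriter
// signature may differ between Chronicler versions
implicit val wr: InfluxWriter[Temperature] = ???

val spark = SparkSession.builder().appName("rdd-to-influx").getOrCreate()
val rdd: RDD[Temperature] =
  spark.sparkContext.parallelize(Seq(Temperature("Kyiv", 21.5)))

rdd.saveToInfluxDBMeas("weather", "temperature")
```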
For `Dataset[T]`:
```
import com.github.fsanaulla.chronicler.spark.ds._

val ds: Dataset[T] = ???
ds.saveToInfluxDBMeas("dbName", "measurementName")

// to save with a dynamically generated measurement
ds.saveToInfluxDB("dbName")
```
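The same pattern works for datasets; the only extra ingredient is an `Encoder[T]`, which `spark.implicits._` derives for case classes (reusing the illustrative `Temperature` type and implicits from above):
```
import com.github.fsanaulla.chronicler.spark.ds._
import org.apache.spark.sql.Dataset
import spark.implicits._ // Encoder derivation for case classes

val ds: Dataset[Temperature] = Seq(Temperature("Kyiv", 21.5)).toDS()
ds.saveToInfluxDBMeas("weather", "temperature")
```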
For `DataStreamWriter[T]`:
```
import com.github.fsanaulla.chronicler.spark.structured.streaming._

val structStream: DataStreamWriter[T] = ???
val saved = structStream.saveToInfluxDBMeas("dbName", "measurementName")
// or, to save with a dynamically generated measurement:
// val saved = structStream.saveToInfluxDB("dbName")

saved.start().awaitTermination()
```
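To make the streaming shape concrete, here is a sketch against Spark's built-in `rate` source; the mapping and all names are illustrative assumptions:
```
import com.github.fsanaulla.chronicler.spark.structured.streaming._
import org.apache.spark.sql.streaming.DataStreamWriter
import spark.implicits._

// map the rate source's `value` column onto the illustrative row type
val writer: DataStreamWriter[Temperature] =
  spark.readStream
    .format("rate")
    .load()
    .map(r => Temperature("Kyiv", r.getAs[Long]("value").toDouble))
    .writeStream

writer
  .saveToInfluxDBMeas("weather", "temperature")
  .start()
  .awaitTermination()
```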
For `DStream[T]`:
```
import com.github.fsanaulla.chronicler.spark.streaming._

val stream: DStream[T] = ???
stream.saveToInfluxDBMeas("dbName", "measurementName")

// to save with a dynamically generated measurement
stream.saveToInfluxDB("dbName")
```
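A `DStream` additionally needs a started `StreamingContext`; a sketch with a socket source (host, port, and parsing are illustrative):
```
import com.github.fsanaulla.chronicler.spark.streaming._
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream

val ssc = new StreamingContext(spark.sparkContext, Seconds(5))
val stream: DStream[Temperature] = ssc
  .socketTextStream("localhost", 9999)
  .map(line => Temperature("Kyiv", line.toDouble))

stream.saveToInfluxDBMeas("weather", "temperature")

ssc.start()
ssc.awaitTermination()
```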