https://github.com/navicore/navilake
An Akka Streams source of Azure Data Lake data
akka akka-streams azure azure-data-lake scala
- Host: GitHub
- URL: https://github.com/navicore/navilake
- Owner: navicore
- License: mit
- Created: 2018-11-03T15:15:40.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2024-01-01T03:12:06.000Z (almost 2 years ago)
- Last Synced: 2025-02-17T07:33:48.348Z (8 months ago)
- Topics: akka, akka-streams, azure, azure-data-lake, scala
- Language: Scala
- Size: 280 KB
- Stars: 0
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Read Azure Data Lake Storage into Akka Streams

Replay historical data-at-rest into an existing code base that was designed for streaming.

## Current Storage Sources

1. GZip files of UTF-8 `\n`-delimited strings
2. Other storage implementations TBD

Reads files via the Azure Data Lake Store Java SDK ([adslapi]).

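The first storage format above, gzip-compressed UTF-8 text with one record per `\n`-delimited line, can be illustrated with the JDK alone. This is a minimal sketch, not navilake's implementation: `GzipLines`, `gzip`, and `readLines` are hypothetical names, and an in-memory byte array stands in for a file in the lake.

```scala
import java.io.{BufferedReader, ByteArrayInputStream, ByteArrayOutputStream, InputStreamReader}
import java.nio.charset.StandardCharsets
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

// Hypothetical helper, for illustration only.
object GzipLines {

  // Compress newline-delimited records into gzip bytes
  // (a stand-in for one file stored in the lake).
  def gzip(records: Seq[String]): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new GZIPOutputStream(bytes)
    try out.write(records.mkString("\n").getBytes(StandardCharsets.UTF_8))
    finally out.close()
    bytes.toByteArray
  }

  // Decompress and return one String per line.
  // Note: BufferedReader.readLine also treats "\r" and "\r\n" as line breaks.
  def readLines(data: Array[Byte]): List[String] = {
    val reader = new BufferedReader(
      new InputStreamReader(
        new GZIPInputStream(new ByteArrayInputStream(data)),
        StandardCharsets.UTF_8))
    // toList forces the iterator before the reader is closed.
    try Iterator.continually(reader.readLine()).takeWhile(_ != null).toList
    finally reader.close()
  }
}
```

Round-tripping a few records through `GzipLines.gzip` and `GzipLines.readLines` should return the same records in order, which is the shape of data a downstream stream stage would receive one element at a time.
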
## USAGE
Update your `build.sbt` dependencies with:
```scala
// https://mvnrepository.com/artifact/tech.navicore/navilake
libraryDependencies += "tech.navicore" %% "navilake" % "1.3.0"
```

This example reads gzip data from Azure Data Lake.
Create a config, a connector, and a source as in the example below.

```scala
val consumer = ... // some Sink
...
// credentials and location
implicit val cfg: LakeConfig = LakeConfig(ACCOUNTFQDN, CLIENTID, AUTHEP, CLIENTKEY, Some(PATH))
// connector actor for gzip files
val connector: ActorRef = actorSystem.actorOf(GzipConnector.props)
// source backed by the connector
val src = NaviLake(connector)
...
// run the stream into the sink
src.runWith(consumer)
...
```

---

[adslapi]: https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-get-started-java-sdk#read-a-file
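
The usage example leaves `consumer` as "some Sink". As a rough sketch of what that could look like, the block below wires a printing `Sink` to a source using standard Akka Streams calls. Several things here are assumptions rather than navilake facts: the element type is taken to be `String`, an in-memory `Source` stands in for `NaviLake(connector)`, and `ConsumerSketch` is a hypothetical name. `ActorMaterializer` is deprecated in Akka 2.6 but still compiles there; which Akka version navilake targets is not stated in this README.

```scala
import akka.{Done, NotUsed}
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, Materializer}
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Future

// Hypothetical object, for illustration only.
object ConsumerSketch {
  implicit val actorSystem: ActorSystem = ActorSystem("navilake-example")
  implicit val materializer: Materializer = ActorMaterializer()

  // A Sink that prints each record; plays the role of `consumer`.
  // Assumption: the source emits String elements.
  val consumer: Sink[String, Future[Done]] = Sink.foreach[String](println)

  // Stand-in for `NaviLake(connector)`: a fixed in-memory source.
  val src: Source[String, NotUsed] = Source(List("record-1", "record-2"))

  def main(args: Array[String]): Unit = {
    import actorSystem.dispatcher
    // Run the stream, then shut the actor system down once it completes.
    src.runWith(consumer).onComplete(_ => actorSystem.terminate())
  }
}
```

Swapping the stand-in `src` for the real source should leave the rest of the wiring unchanged, provided the element types line up.
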