https://github.com/navicore/navilake

An Akka Streams source of Azure Data Lake data
akka akka-streams azure azure-data-lake scala

[![Build Status](https://travis-ci.org/navicore/navilake.svg?branch=master)](https://travis-ci.org/navicore/navilake)
[![Codacy Badge](https://api.codacy.com/project/badge/Grade/1901174b92304a8d98ce2d8b64f4d9dc)](https://www.codacy.com/app/navicore/navilake?utm_source=github.com&utm_medium=referral&utm_content=navicore/navilake&utm_campaign=Badge_Grade)

# Read Azure Data Lake Storage into Akka Streams

Replay historical data at rest into an
existing code base that was designed for streaming.

## Current Storage Sources
1. Gzip files of UTF-8, `\n`-delimited strings
2. Other storage implementations TBD
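
The file format in item 1 can be illustrated without Azure or Akka. The following is a minimal sketch using only the JDK; `GzipLines`, `gzipLines`, and `readLines` are hypothetical names chosen for illustration and are not part of navilake's API.

```scala
import java.io.{BufferedReader, ByteArrayOutputStream, InputStream, InputStreamReader}
import java.nio.charset.StandardCharsets
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

// Hypothetical helper, not part of navilake: shows the gzip + UTF-8 + "\n" layout.
object GzipLines {

  // Compress records as newline-terminated UTF-8 text, mimicking a stored file.
  def gzipLines(lines: Seq[String]): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val gz = new GZIPOutputStream(bytes)
    try gz.write(lines.map(_ + "\n").mkString.getBytes(StandardCharsets.UTF_8))
    finally gz.close()
    bytes.toByteArray
  }

  // Decompress a gzip stream and return one string per line.
  // Note: BufferedReader.readLine also treats "\r" and "\r\n" as line endings.
  def readLines(in: InputStream): List[String] = {
    val reader = new BufferedReader(
      new InputStreamReader(new GZIPInputStream(in), StandardCharsets.UTF_8))
    try Iterator.continually(reader.readLine()).takeWhile(_ != null).toList
    finally reader.close()
  }
}
```

Passing the bytes from `gzipLines` back through `readLines` (wrapped in a `java.io.ByteArrayInputStream`) should return the original records, which makes the pair handy for building small test fixtures.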

Uses the [Azure Data Lake Store Java SDK][adslapi].

## Usage

Update your `build.sbt` dependencies with:

```scala
// https://mvnrepository.com/artifact/tech.navicore/navilake
libraryDependencies += "tech.navicore" %% "navilake" % "1.3.0"
```

This example reads gzip data from Azure Data Lake.

Create a config, a connector, and a source via the example below.

```scala
val consumer = ... // some Sink
...
// credentials and location
implicit val cfg: LakeConfig = LakeConfig(ACCOUNTFQDN, CLIENTID, AUTHEP, CLIENTKEY, Some(PATH))
// connector actor for gzip files
val connector: ActorRef = actorSystem.actorOf(GzipConnector.props)
// source backed by the connector
val src = NaviLake(connector)
...
// run the stream, sending the data to the sink
src.runWith(consumer)
...
```

---

[adslapi]:https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-get-started-java-sdk#read-a-file