https://github.com/eliias/gleam
Fun DSL for Apache Beam and Kotlin.
https://github.com/eliias/gleam
apache-beam data-engineering stream-processing
Last synced: 8 months ago
JSON representation
Fun DSL for Apache Beam and Kotlin.
- Host: GitHub
- URL: https://github.com/eliias/gleam
- Owner: eliias
- License: mit
- Created: 2020-08-20T23:32:22.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2023-01-02T14:02:13.000Z (over 3 years ago)
- Last Synced: 2025-03-10T23:24:24.648Z (over 1 year ago)
- Topics: apache-beam, data-engineering, stream-processing
- Language: Kotlin
- Homepage:
- Size: 2.72 MB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# gleam
> Fun with Apache Beam.
## gleam-core
Extends the Apache Beam library with convenience functions, mostly to better
utilize type inference in Kotlin.
### `transform`
A convenience function to create a new transform with one or more steps.
```kotlin
val newTransform = transform {
it.apply(SomeTransform())
}
```
A default implementation without this helper would look more like this:
```kotlin
val newTransform = object : PTransform, POutput>() {
override fun expand(input: PCollection) = input.apply(SomeTransform)
}
```
## gleam-flow
A DSL that utilizes Generics, Scope Functions, and Type Inference to provide the
transforms from the Apache Beam library.
```kotlin
val pipeline = Pipeline.create()
pipeline
.sequence()
.window(FixedWindows.of(Duration.standardSeconds(1)))
.map { it * 2 }
.filter { it % 2 == 0L }
.key { it / 10 }
.groupByKey()
.sink(Log.info())
pipeline.run()
```
## gleam-samples
### 1. Batch
This demo's batching of all elements of a bundle. This is a very useful
technique for IO heavy DoFn's.
Example: An upstream transform provides a collection of record keys, and the
downstream transform needs to fetch the records from a persistent datastore, it
is usually better to batch multiple keys into a single request.
### 2. DSL
A Kotlin specific Apache Beam DSL that allows applying transforms in a direct
way onto collections.
### 3. Schema Extensions
Add metadata to schemas or single fields. The metadata and hints are stored as
schema options. Directly with the schema and field definition.
```kotlin
val schema = schema {
version("1.0")
// base
field("id", INT64.withNullable(true)) {
// adds the "version" extension
version("1.0")
}
fields(
Field.of("title", STRING),
Field.of("description", STRING)
) {
// adds the "compress" extension to all fields!
compress()
}
}
```
### 4. POJO's
Demo's the usage of POJO's (Beans) and field schemas in Beam with Kotlin
### 6. SDF
A running version of the SDF example from the Apache Beam website.
### 14. Resources
Inversion of Control and Dependency Injection with Apache Beam, Kotlin and Koin.
### 20. Generator
A demo of the `gleam-core` Generator to create fake data for a pipeline.
### 23. Bigtable
A demo of the `BigtableIO.write` transform.