https://github.com/timvw/frameless-ext


Host: GitHub
URL: https://github.com/timvw/frameless-ext
Owner: timvw
License: apache-2.0
Created: 2020-09-11T13:26:57.000Z (over 4 years ago)
Default Branch: master
Last Pushed: 2023-12-15T14:39:46.000Z (over 1 year ago)
Last Synced: 2025-03-29T12:51:19.998Z (about 2 months ago)
Language: Scala
Size: 35.2 KB
Stars: 9
Watchers: 3
Forks: 2
Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE

README

# Frameless-ext

This library contains additional syntax for [Frameless](https://github.com/typelevel/frameless).

[![Build Status](https://github.com/timvw/frameless-ext/workflows/workflow/badge.svg)](https://github.com/timvw/frameless-ext/workflows/workflow)

[![Maven Central](https://img.shields.io/maven-central/v/be.icteam/frameless-ext_2.12.svg)](https://maven-badges.herokuapp.com/maven-central/be.icteam/frameless-ext_2.12)

## Usage

Import the dependency:

```scala
libraryDependencies += "be.icteam" %% "frameless-ext" % "2.0.0"
```

Enable the additional syntax with the following import statement:

```scala
import be.icteam.frameless.syntax._
```

Now you can create a TypedColumn with a simple lambda on a TypedDataset, e.g.:

```scala
val tds: TypedDataset[Event] = ???

// the compiler can infer the types ;)
val userColumn = tds.tc(_.user)
val dayColumn = tds.tc(_.day)
```

The available aggregation functions become more discoverable in your IDE as well:

```scala
val result: TypedDataset[(String, Long, Int)] = tds
  .groupBy(tds.tc(_.user))
  .agg(
    tds.tc(_.day).countDistinct,
    tds.tc(_.hour).max)
```

Here is the complete example:

```scala
case class Event(user: String, year: Int, month: Int, day: Int, hour: Int)

object Demo {

  import frameless._
  import frameless.syntax._
  import frameless.functions.aggregate._
  import be.icteam.frameless.syntax._
  import org.apache.log4j.{Level, LogManager}
  import org.apache.spark.sql.SparkSession

  // Start a local Spark session with reduced log output and the UI disabled
  def initSpark: SparkSession = {
    LogManager.getLogger("org").setLevel(Level.ERROR)
    SparkSession
      .builder()
      .appName("demo")
      .master("local[*]")
      .config("spark.ui.enabled", "false")
      .getOrCreate()
  }

  def main(args: Array[String]): Unit = {
    implicit val spark: SparkSession = initSpark
    import spark.implicits._

    val events = spark.createDataset(List(
      Event("tim", 2020, 9, 1, 7),
      Event("tim", 2020, 9, 1, 3),
      Event("tim", 2020, 9, 2, 5),
      Event("tim", 2020, 9, 2, 3),
      Event("tiebe", 2020, 9, 1, 2)
    ))

    val e = TypedDataset.create(events)

    // Per user: number of events, distinct days, latest hour, sum of years
    val result: TypedDataset[(String, Long, Long, Int, Long)] = e
      .groupBy(e.tc(_.user))
      .agg(
        count[Event](),
        e.tc(_.day).countDistinct,
        e.tc(_.hour).max,
        e.tc(_.year).sum)

    val job = result.show(10, false)
    job.run()
  }
}
```

## Development

Compile and test:

```bash
sbt +clean +cleanFiles +compile +test
```

Install a snapshot in your local maven repository:

```bash
sbt +publishM2
```

## Release

Set the following environment variables:

- PGP_PASSPHRASE

- PGP_SECRET

- SONATYPE_USERNAME

- SONATYPE_PASSWORD
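
Before starting a release it can help to confirm that all four variables are present. The following is a minimal sketch, not part of this repository; `missing_release_vars` is a hypothetical helper name, and the check only tests for non-empty values, not for valid credentials.

```bash
# Hypothetical helper (not part of frameless-ext): prints the name of every
# required release variable that is unset or empty, one per line.
missing_release_vars() {
  for name in PGP_PASSPHRASE PGP_SECRET SONATYPE_USERNAME SONATYPE_PASSWORD; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

missing="$(missing_release_vars)"
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing release variables:" $missing >&2
fi
```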

Publish a release with the [ci-release](https://github.com/olafurpg/sbt-ci-release) plugin:

```bash
sbt ci-release
```

Find the most recent release:

```bash
# $REPO is assumed to hold the remote to inspect (a URL or a remote name such as origin)
git ls-remote --tags "$REPO" | \
  awk -F"/" '{print $3}' | \
  grep '^v[0-9]*\.[0-9]*\.[0-9]*' | \
  grep -v '{}' | \
  sort --version-sort | \
  tail -n1
```
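
To see what each stage of that pipeline does, the filtering part can be run on sample input without contacting a remote. This is an illustrative sketch: `latest_tag` is a hypothetical function name, the hashes are made up, and `sort --version-sort` assumes GNU coreutils or a compatible `sort`.

```bash
# Hypothetical helper: reads `git ls-remote --tags` output on stdin and
# prints the highest vX.Y.Z tag.
latest_tag() {
  awk -F"/" '{print $3}' |               # keep the tag name after refs/tags/
    grep '^v[0-9]*\.[0-9]*\.[0-9]*' |    # keep only version-like tags
    grep -v '{}' |                       # drop dereferenced entries such as v1.0.10^{}
    sort --version-sort |                # v1.0.9 sorts before v1.0.10
    tail -n1
}

# Sample lines in the format printed by `git ls-remote --tags` (made-up hashes).
printf 'aaa1\trefs/tags/v1.0.9\naaa2\trefs/tags/v1.0.10\naaa3\trefs/tags/v1.0.10^{}\naaa4\trefs/tags/v1.0.2\n' | latest_tag
# prints v1.0.10
```

A plain lexical sort would rank `v1.0.9` above `v1.0.10`, which is why the version-aware sort matters here.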

Push a new tag to trigger a release via [travis-ci](https://travis-ci.org/github/timvw/frameless-ext):

```bash
v=v1.0.5
git tag -a "$v" -m "$v"
git push origin "$v"
```
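
If you would rather derive the next tag than type it by hand, the patch component can be incremented with `awk`. This is a sketch under the assumption that tags follow the `vX.Y.Z` pattern used above; `next_patch` is a hypothetical helper and not part of this repository.

```bash
# Hypothetical helper: bumps the patch component of a vX.Y.Z tag.
next_patch() {
  # Split on "." so that $1 = "v1", $2 = "0", $3 = "5" for v1.0.5.
  echo "$1" | awk -F. '{ printf "%s.%s.%d\n", $1, $2, $3 + 1 }'
}

next_patch v1.0.5
# prints v1.0.6
```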

## License

Code is provided under the Apache 2.0 license, available at http://opensource.org/licenses/Apache-2.0 and in the LICENSE file. This is the same license used by Spark and Frameless.
