{"id":22480154,"url":"https://github.com/findify/flink-scala-api","last_synced_at":"2025-08-02T14:32:43.820Z","repository":{"id":39796511,"uuid":"490644408","full_name":"findify/flink-scala-api","owner":"findify","description":"A fork of Apache Flink scala bindings for 2.12, 2.13 and 3.x","archived":false,"fork":false,"pushed_at":"2024-04-17T15:08:38.000Z","size":107,"stargazers_count":20,"open_issues_count":12,"forks_count":3,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-05-02T23:38:22.427Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/findify.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-05-10T10:15:56.000Z","updated_at":"2023-12-16T16:12:35.000Z","dependencies_parsed_at":"2023-02-16T00:01:10.144Z","dependency_job_id":null,"html_url":"https://github.com/findify/flink-scala-api","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/findify%2Fflink-scala-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/findify%2Fflink-scala-api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/findify%2Fflink-scala-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/findify%2Fflink-scala-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/findify","download_url":"https://codeload.github.com/findify/flink-scala-api/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"h
ttps://github.com","kind":"github","repositories_count":228483524,"owners_count":17927363,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-06T15:19:55.239Z","updated_at":"2024-12-06T15:19:55.819Z","avatar_url":"https://github.com/findify.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Scala 2.12/2.13/3.x API for Apache Flink\n\n[![CI Status](https://github.com/findify/flink-scala-api/workflows/CI/badge.svg)](https://github.com/findify/flink-scala-api/actions)\n[![Maven Central](https://maven-badges.herokuapp.com/maven-central/io.findify/flink-scala-api_2.12/badge.svg?style=plastic)](https://maven-badges.herokuapp.com/maven-central/io.findify/flink-scala-api_2.12)\n[![License: Apache 2](https://img.shields.io/badge/License-Apache2-green.svg)](https://opensource.org/licenses/Apache-2.0)\n![Last commit](https://img.shields.io/github/last-commit/findify/flink-scala-api)\n![Last release](https://img.shields.io/github/release/findify/flink-scala-api)\n\nThis project is a community-maintained fork of official Apache Flink 1.15 scala API, cross-built for scala 2.12, 2.13 and 3.x.\n\n## Differences\n\n### New [magnolia](https://github.com/softwaremill/magnolia)-based serialization framework\n\nOfficial Flink's serialization framework has two important drawbacks complicating the upgrade to Scala 2.13+:\n* it used a complicated `TypeInformation` derivation macro, which required a complete rewrite to work on Scala 3.\n* for serializing a `Traversable[_]` it serialized an actual scala code of the corresponding 
`CanBuildFrom[_]` builder,\nwhich was compiled and executed on deserialization. There is no more `CanBuildFrom[_]` on Scala 2.13+, so there is\nno easy migration path.\n\nThis project relies on the [Flink-ADT](https://github.com/findify/flink-adt) library to derive serializers for all \ntypes with the following perks:\n* ADT support: your `sealed trait` members won't fall back to the extremely slow Kryo serializer\n* case objects: no more problems with `None`\n* uses implicits (and typeclasses in Scala 3) to customize the serialization\n\nBut there are some drawbacks:\n* Savepoints written using Flink's official serialization API are not compatible, so you need to re-bootstrap your job\nfrom scratch.\n* As serializer derivation happens at compile time and uses zero runtime reflection, compile times are quite high for\ndeeply nested, rich case classes.\n\nSee the [Flink-ADT](https://github.com/findify/flink-adt) readme for more details.\n\n### Using a POJO-only Flink serialization framework\n\nIf you don't want to use `Flink-ADT` for serialization for some reason, you can always fall back to Flink's\n[POJO serializer](https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/fault-tolerance/serialization/types_serialization/#rules-for-pojo-types),\nexplicitly calling it:\n```scala\nval env = StreamExecutionEnvironment.createLocalEnvironment()\nenv\n  .fromElements(1, 2, 3)\n  .map(x =\u003e x + 1)(TypeInformation.of(classOf[Int])) // explicit call\n```\n\nWith this approach you get:\n* savepoint compatibility between this and the official Flink API\n* slower serialization due to frequent Kryo fallback\n* larger savepoint size (again, due to Kryo)\n\n### Closure cleaner from Spark 3.x\n\nFlink historically used quite an old forked version of the ClosureCleaner for scala lambdas, which has some minor\ncompatibility issues with Java 17 and Scala 2.13+. 
This project uses a more recent version, hopefully with fewer\ncompatibility issues.\n\n### No Legacy DataSet API\n\nSorry, but it's already deprecated and as a community project we have no resources to support it. If you need it,\nPRs are welcome.\n\n## Migration \n\n`flink-scala-api` uses a different package name for all API-related classes like `DataStream`, so you can migrate\na big project gradually and use both the upstream and this version of the Scala API in the same project. \n\nThe actual migration should be straightforward: replace the old import with the new ones:\n```scala\n// original api import\nimport org.apache.flink.streaming.api.scala._\n\n// flink-scala-api imports\nimport io.findify.flink.api._\nimport io.findify.flinkadt.api._\n```\n\n## Usage \n\n`flink-scala-api` is released to Maven Central for Scala 2.12, 2.13 and 3. For SBT, add this snippet to `build.sbt`:\n```scala\nlibraryDependencies += \"io.findify\" %% \"flink-scala-api\" % \"1.15-1\"\n```\n\nWe suggest removing the `flink-scala` and `flink-streaming-scala` dependencies altogether to simplify the migration and\navoid mixing two flavors of the API in the same project. Keeping both is technically possible, though, so removing them is not required.\n\n## Scala 3\n\nScala 3 support is highly experimental and not well-tested in production. The good news is that most issues surface at compile time, \nso they are quite easy to reproduce. If you have issues with `flink-adt` not deriving `TypeInformation[T]` for the `T` you want, \nsubmit a bug report!\n\n## Compile times\n\nThey may be quite bad for rich nested case classes due to compile-time serializer derivation. 
\nDerivation happens each time `flink-scala-api` needs an instance of the `TypeInformation[T]` implicit/type class:\n```scala\ncase class Foo(x: Int) {\n  def inc(a: Int) = copy(x = x + a)\n}\n\nval env = StreamExecutionEnvironment.createLocalEnvironment()\nenv\n  .fromCollection(List(Foo(1),Foo(2),Foo(3)))\n  .map(x =\u003e x.inc(1)) // here the TypeInformation[Foo] is generated\n  .map(x =\u003e x.inc(2)) // generated once more\n```\n\nIf you're using the same data structures in multiple jobs (or in multiple tests), consider caching the\nderived serializer in a separate compilation unit and just importing it when needed:\n\n```scala\n// file FooTypeInfo.scala\nobject FooTypeInfo {\n  // must be implicit so that the import below brings it into implicit scope\n  implicit lazy val fooTypeInfo: TypeInformation[Foo] = deriveTypeInformation[Foo]\n}\n\n// file SomeJob.scala\ncase class Foo(x: Int) {\n  def inc(a: Int) = copy(x = x + a)\n}\n\nimport FooTypeInfo._\n\nval env = StreamExecutionEnvironment.createLocalEnvironment()\nenv\n  .fromCollection(List(Foo(1),Foo(2),Foo(3)))\n  .map(x =\u003e x.inc(1)) // taken as an implicit\n  .map(x =\u003e x.inc(2)) // again, no re-derivation\n\n```\n\n## License\n\nThis project uses parts of the Apache Flink codebase, so the whole project\nis licensed under the [Apache 2.0](LICENSE.md) software license.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffindify%2Fflink-scala-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffindify%2Fflink-scala-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffindify%2Fflink-scala-api/lists"}