{"id":22725432,"url":"https://github.com/ovotech/kafka-serialization","last_synced_at":"2025-04-13T20:33:21.943Z","repository":{"id":20859516,"uuid":"79578945","full_name":"ovotech/kafka-serialization","owner":"ovotech","description":"Lego bricks to build Apache Kafka serializers and deserializers","archived":false,"fork":false,"pushed_at":"2023-09-12T17:22:38.000Z","size":361,"stargazers_count":120,"open_issues_count":43,"forks_count":17,"subscribers_count":39,"default_branch":"master","last_synced_at":"2025-03-27T10:51:21.498Z","etag":null,"topics":["apache-kafka","avro","avro4s","circe","json","json4s","kaluza-to-migrate","serialization","spray-json"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ovotech.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-01-20T17:05:26.000Z","updated_at":"2025-02-12T15:24:51.000Z","dependencies_parsed_at":"2023-01-11T21:00:09.822Z","dependency_job_id":null,"html_url":"https://github.com/ovotech/kafka-serialization","commit_stats":null,"previous_names":[],"tags_count":82,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ovotech%2Fkafka-serialization","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ovotech%2Fkafka-serialization/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ovotech%2Fkafka-serialization/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ovotech%2Fkafka-serialization/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ovo
tech","download_url":"https://codeload.github.com/ovotech/kafka-serialization/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248778168,"owners_count":21160096,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-kafka","avro","avro4s","circe","json","json4s","kaluza-to-migrate","serialization","spray-json"],"created_at":"2024-12-10T16:10:43.911Z","updated_at":"2025-04-13T20:33:21.888Z","avatar_url":"https://github.com/ovotech.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Kafka serialization/deserialization building blocks\n\n[![CircleCI Badge](https://circleci.com/gh/ovotech/kafka-serialization.svg?style=shield)](https://circleci.com/gh/ovotech/kafka-serialization)\n[![Codacy Badge](https://api.codacy.com/project/badge/Grade/a2d814f22d4e4facae0f8a3eb1c841fd)](https://www.codacy.com/app/filippo-deluca/kafka-serialization?utm_source=github.com\u0026amp;utm_medium=referral\u0026amp;utm_content=ovotech/kafka-serialization\u0026amp;utm_campaign=Badge_Grade)\n[Download](https://kaluza.jfrog.io/artifactory/maven/com/ovoenergy/kafka-serialization-core_2.12/[RELEASE]/kafka-serialization-core_2.12-[RELEASE].jar)\n\nThe aim of this library is to provide the Lego\u0026trade; bricks to build a serializer/deserializer for kafka messages. 
\n\nThe serializers/deserializers built by this library cannot be set in the Kafka configuration through properties; they \nmust be passed to the Kafka Producer/Consumer constructors (this is a feature, IMHO).\n\nFor Avro serialization this library uses Avro4s, while for JSON it supports Json4s, Circe and Spray out of the box. \nIt is quite easy to add support for other libraries as well.\n\n## Modules\n\nThe library is composed of these modules:\n\n- kafka-serialization-core: provides the serialization primitives to build serializers and deserializers.\n- kafka-serialization-cats: provides cats typeclass instances for serializers and deserializers.\n- kafka-serialization-json4s: provides serializer and deserializer based on Json4s\n- kafka-serialization-jsoniter-scala: provides serializer and deserializer based on Jsoniter Scala\n- kafka-serialization-spray: provides serializer and deserializer based on Spray Json\n- kafka-serialization-circe: provides serializer and deserializer based on Circe\n- kafka-serialization-avro: provides the schema-registry client settings\n- kafka-serialization-avro4s: provides serializer and deserializer based on Avro4s 1.x\n- kafka-serialization-avro4s2: provides serializer and deserializer based on Avro4s 2.x\n\nAvro4s serialization supports schema evolution through the schema registry. 
The consumer can provide its own schema\nand Avro will take care of the conversion.\n\n## Getting Started\n\n - The library is available in the Kaluza Artifactory repository.\n - See [here](https://kaluza.jfrog.io/artifactory/maven/com/ovoenergy/kafka-serialization-core_2.12/) for the latest version.\n - Add this snippet to your build.sbt to use it:\n\n```sbtshell\nimport sbt._\nimport sbt.Keys._\n\nresolvers += \"Artifactory\" at \"https://kaluza.jfrog.io/artifactory/maven\"\n\nlibraryDependencies ++= {\n  val kafkaSerializationV = \"0.5.25\"\n  Seq(\n    \"com.ovoenergy\" %% \"kafka-serialization-core\" % kafkaSerializationV,\n    \"com.ovoenergy\" %% \"kafka-serialization-circe\" % kafkaSerializationV, // To provide Circe JSON support\n    \"com.ovoenergy\" %% \"kafka-serialization-json4s\" % kafkaSerializationV, // To provide Json4s JSON support\n    \"com.ovoenergy\" %% \"kafka-serialization-jsoniter-scala\" % kafkaSerializationV, // To provide Jsoniter Scala JSON support\n    \"com.ovoenergy\" %% \"kafka-serialization-spray\" % kafkaSerializationV, // To provide Spray-json JSON support\n    \"com.ovoenergy\" %% \"kafka-serialization-avro4s\" % kafkaSerializationV // To provide Avro4s Avro support\n  )\n}\n\n```\n\n## Circe example\n\nCirce is a JSON library for Scala that provides support for generic programming through Shapeless. 
You can find more \ninformation on the [Circe website](https://circe.github.io/circe).\n\nSimple serialization/deserialization example with Circe:\n\n```scala\nimport com.ovoenergy.kafka.serialization.core._\nimport com.ovoenergy.kafka.serialization.circe._\n\n// Import the Circe generic support\nimport io.circe.generic.auto._\nimport io.circe.syntax._\n\nimport org.apache.kafka.clients.producer.KafkaProducer\nimport org.apache.kafka.clients.consumer.KafkaConsumer\nimport org.apache.kafka.clients.CommonClientConfigs._\n\nimport scala.collection.JavaConverters._\n\ncase class UserCreated(id: String, name: String, age: Int)\n\nval producer = new KafkaProducer(\n  Map[String, AnyRef](BOOTSTRAP_SERVERS_CONFIG-\u003e\"localhost:9092\").asJava, \n  nullSerializer[Unit], \n  circeJsonSerializer[UserCreated]\n)\n\nval consumer = new KafkaConsumer(\n  Map[String, AnyRef](BOOTSTRAP_SERVERS_CONFIG-\u003e\"localhost:9092\").asJava,\n  nullDeserializer[Unit],\n  circeJsonDeserializer[UserCreated]\n)\n```\n\n\n## Jsoniter Scala example\n\n[Jsoniter Scala](https://github.com/plokhotnyuk/jsoniter-scala) 
is a library that generates codecs for case classes, \nstandard types and collections to get maximum performance of JSON parsing \u0026 serialization.\n\nHere is an example of serialization/deserialization with Jsoniter Scala:\n\n```scala\nimport com.ovoenergy.kafka.serialization.core._\nimport com.ovoenergy.kafka.serialization.jsoniter_scala._\n\n// Import the Jsoniter Scala macros \u0026 core support\nimport com.github.plokhotnyuk.jsoniter_scala.macros._\nimport com.github.plokhotnyuk.jsoniter_scala.core._\n\nimport org.apache.kafka.clients.producer.KafkaProducer\nimport org.apache.kafka.clients.consumer.KafkaConsumer\nimport org.apache.kafka.clients.CommonClientConfigs._\n\nimport scala.collection.JavaConverters._\n\ncase class UserCreated(id: String, name: String, age: Int)\n\nimplicit val userCreatedCodec: JsonValueCodec[UserCreated] = JsonCodecMaker.make[UserCreated](CodecMakerConfig)\n\nval producer = new KafkaProducer(\n  Map[String, AnyRef](BOOTSTRAP_SERVERS_CONFIG-\u003e\"localhost:9092\").asJava, \n  nullSerializer[Unit],\n  jsoniterScalaSerializer[UserCreated]()\n)\n\nval consumer = new KafkaConsumer(\n  Map[String, AnyRef](BOOTSTRAP_SERVERS_CONFIG-\u003e\"localhost:9092\").asJava,\n  nullDeserializer[Unit],\n  jsoniterScalaDeserializer[UserCreated]()\n)\n```\n\n\n## Avro example\n\nApache Avro is a remote procedure call and data serialization framework developed within Apache's Hadoop project. It uses \nJSON for defining data types and protocols, and serializes data in a compact binary format.\n\nApache Avro provides some support for evolving your messages across multiple versions without breaking compatibility with \nolder or newer consumers. 
It supports several encoding formats, but two are the most used in Kafka: binary and JSON.\n\nThe encoded data is always validated and parsed using a schema (defined in JSON) and, if needed, evolved to the reader \nschema version.\n\nThis library provides Avro support through the [Avro4s](https://github.com/sksamuel/avro4s) library, which uses macros\nand Shapeless to allow effortless serialization and deserialization. In addition to Avro4s, it needs a Confluent schema\nregistry in place, which provides a way to control the format of the messages produced to Kafka. You can find more \ninformation in the [Confluent Schema Registry Documentation](http://docs.confluent.io/current/schema-registry/docs/).\n\n\nAn example with Avro4s binary and Schema Registry:\n```scala\nimport com.ovoenergy.kafka.serialization.core._\nimport com.ovoenergy.kafka.serialization.avro4s._\n\nimport com.sksamuel.avro4s._\n\nimport org.apache.kafka.clients.producer.KafkaProducer\nimport org.apache.kafka.clients.consumer.KafkaConsumer\nimport org.apache.kafka.clients.CommonClientConfigs._\n\nimport scala.collection.JavaConverters._\n\nval schemaRegistryEndpoint = \"http://localhost:8081\"\n\ncase class UserCreated(id: String, name: String, age: Int)\n\n// This type class is needed by the avroBinarySchemaIdSerializer\nimplicit val UserCreatedToRecord = ToRecord[UserCreated]\n\nval producer = new KafkaProducer(\n  Map[String, AnyRef](BOOTSTRAP_SERVERS_CONFIG-\u003e\"localhost:9092\").asJava, \n  nullSerializer[Unit], \n  avroBinarySchemaIdSerializer[UserCreated](schemaRegistryEndpoint, isKey = false, includesFormatByte = true)\n)\n\n// This type class is needed by the avroBinarySchemaIdDeserializer\nimplicit val UserCreatedFromRecord = FromRecord[UserCreated]\n\nval consumer = new KafkaConsumer(\n  Map[String, AnyRef](BOOTSTRAP_SERVERS_CONFIG-\u003e\"localhost:9092\").asJava,\n  nullDeserializer[Unit],\n  avroBinarySchemaIdDeserializer[UserCreated](schemaRegistryEndpoint, isKey = false, 
includesFormatByte = true)\n)\n```\n\n\nThis Avro serializer tries to register the schema for every new message type it serializes and caches the returned \nschema id. The deserializer contacts the schema registry each time it encounters a message with a schema id it has not\nseen before. \n\nThe schema id is encoded in the first 4 bytes of the payload. The deserializer extracts the schema id from the \npayload and fetches the schema from the schema registry. The deserializer is able to evolve the original message to the \nconsumer schema. This is useful when the consumer is only interested in part of the original message (schema projection) \nor when the original message is in an older or newer format than the consumer schema (schema evolution).\n\nAn example of the consumer schema:\n```scala\nimport com.ovoenergy.kafka.serialization.core._\nimport com.ovoenergy.kafka.serialization.avro4s._\n\nimport com.sksamuel.avro4s._\n\nimport org.apache.kafka.clients.producer.KafkaProducer\nimport org.apache.kafka.clients.consumer.KafkaConsumer\nimport org.apache.kafka.clients.CommonClientConfigs._\n\nimport scala.collection.JavaConverters._\n\nval schemaRegistryEndpoint = \"http://localhost:8081\"\n\n/* Assuming the original message has been serialized using the \n * previously defined UserCreated class. 
We are going to project\n * it, ignoring the value of the age\n */\ncase class UserCreated(id: String, name: String)\n\n// This type class is needed by the avroBinarySchemaIdDeserializer\nimplicit val UserCreatedFromRecord = FromRecord[UserCreated]\n\n\n/* This type class is needed by the avroBinarySchemaIdDeserializer \n * to obtain the consumer schema\n */\nimplicit val UserCreatedSchemaFor = SchemaFor[UserCreated]\n\nval consumer = new KafkaConsumer(\n  Map[String, AnyRef](BOOTSTRAP_SERVERS_CONFIG-\u003e\"localhost:9092\").asJava,\n  nullDeserializer[Unit],\n  avroBinarySchemaIdWithReaderSchemaDeserializer[UserCreated](schemaRegistryEndpoint, isKey = false, includesFormatByte = false)\n)\n```\n\n\n## Format byte\n\nThe original Confluent Avro serializer/deserializer prefixes the payload with a \"magic\" byte to identify that the message \nhas been written with the Avro serializer. \n\nSimilarly, this library supports the same mechanism by means of a couple of functions. It is even able to multiplex and \ndemultiplex different serializers/deserializers based on that format byte. 
At the moment the supported formats are:\n  - JSON\n  - Avro Binary with schema ID\n  - Avro JSON with schema ID\n\nLet's see this mechanism in action:\n```scala\nimport com.ovoenergy.kafka.serialization.core._\nimport com.ovoenergy.kafka.serialization.avro4s._\nimport com.ovoenergy.kafka.serialization.circe._\n\n// Import the Circe generic support\nimport io.circe.generic.auto._\nimport io.circe.syntax._\n\nimport org.apache.kafka.clients.producer.KafkaProducer\nimport org.apache.kafka.clients.consumer.KafkaConsumer\nimport org.apache.kafka.clients.CommonClientConfigs._\nimport scala.collection.JavaConverters._\n\n\nsealed trait Event\ncase class UserCreated(id: String, name: String, email: String) extends Event\n\nval schemaRegistryEndpoint = \"http://localhost:8081\"\n\n/* This producer will produce messages in Avro binary format */\nval avroBinaryProducer = new KafkaProducer(\n  Map[String, AnyRef](BOOTSTRAP_SERVERS_CONFIG-\u003e\"localhost:9092\").asJava, \n  nullSerializer[Unit],   \n  formatSerializer(Format.AvroBinarySchemaId, avroBinarySchemaIdSerializer[UserCreated](schemaRegistryEndpoint, isKey = false, includesFormatByte = false))\n)\n\n/* This producer will produce messages in Json format */\nval circeProducer = new KafkaProducer(\n  Map[String, AnyRef](BOOTSTRAP_SERVERS_CONFIG-\u003e\"localhost:9092\").asJava, \n  nullSerializer[Unit],   \n  formatSerializer(Format.Json, circeJsonSerializer[UserCreated])\n)\n\n/* This consumer will be able to consume messages from both producers */\nval consumer = new KafkaConsumer(\n  Map[String, AnyRef](BOOTSTRAP_SERVERS_CONFIG-\u003e\"localhost:9092\").asJava,\n  nullDeserializer[Unit],\n  formatDemultiplexerDeserializer[UserCreated](unknownFormat =\u003e failingDeserializer(new RuntimeException(\"Unsupported format\"))){\n    case Format.Json =\u003e circeJsonDeserializer[UserCreated]\n    case Format.AvroBinarySchemaId =\u003e avroBinarySchemaIdDeserializer[UserCreated](schemaRegistryEndpoint, isKey = false, 
includesFormatByte = false)\n  }\n)\n\n/* This consumer will be able to consume messages in Avro binary format with the magic format byte at the start */\nval avroBinaryConsumer = new KafkaConsumer(\n  Map[String, AnyRef](BOOTSTRAP_SERVERS_CONFIG-\u003e\"localhost:9092\").asJava,\n  nullDeserializer[Unit],\n  avroBinarySchemaIdDeserializer[UserCreated](schemaRegistryEndpoint, isKey = false, includesFormatByte = true)\n)\n```\n\n\nNote that `formatDemultiplexerDeserializer` is a little bit nasty because it is invariant in the type `T`, so\nall the demultiplexed deserializers must be declared as `Deserializer[T]`.\n\nThere are other supporting serializers and deserializers; you can discover them by looking through the code and the tests.\n\n## Useful de-serializers\n\nIn the core module there are plenty of serializers and deserializers that handle generic cases.\n\n### Optional deserializer\n\nTo handle the case in which the data is null, you need to wrap the deserializer in the `optionalDeserializer`:\n\n```scala\nimport com.ovoenergy.kafka.serialization.core._\nimport com.ovoenergy.kafka.serialization.circe._\n\n// Import the Circe generic support\nimport io.circe.generic.auto._\nimport io.circe.syntax._\n\nimport org.apache.kafka.common.serialization.Deserializer\n\ncase class UserCreated(id: String, name: String, age: Int)\n\nval userCreatedDeserializer: Deserializer[Option[UserCreated]] = optionalDeserializer(circeJsonDeserializer[UserCreated])\n```\n\n## Cats instances\n\nThe `cats` module provides the `Functor` typeclass instance for the `Deserializer` and `Contravariant` instance for the \n`Serializer`. 
This allows you to write:\n\n```scala\nimport cats.implicits._\nimport com.ovoenergy.kafka.serialization.core._\nimport com.ovoenergy.kafka.serialization.cats._\nimport org.apache.kafka.common.serialization.{Serializer, Deserializer, IntegerSerializer, IntegerDeserializer}\n\nval intDeserializer: Deserializer[Int] = (new IntegerDeserializer).asInstanceOf[Deserializer[Int]]\nval stringDeserializer: Deserializer[String] = intDeserializer.map(_.toString)\n \nval intSerializer: Serializer[Int] = (new IntegerSerializer).asInstanceOf[Serializer[Int]]\nval stringSerializer: Serializer[String] = intSerializer.contramap(_.toInt)\n```\n\n## Complaints and other Feedback\n\nFeedback of any kind is always appreciated.\n\nIssues and PRs are welcome as well.\n\n## About this README\n\nThe code samples in this README file are checked using [mdoc](https://github.com/scalameta/mdoc).\n\nThis means that the `README.md` file is generated from `docs/src/README.md`. If you want to make any changes to the README, you should:\n\n1. Edit `docs/src/README.md`\n2. Run `sbt mdoc` to regenerate `./README.md`\n3. Commit both files to git","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fovotech%2Fkafka-serialization","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fovotech%2Fkafka-serialization","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fovotech%2Fkafka-serialization/lists"}