{"id":25558924,"url":"https://github.com/davewm/willa","last_synced_at":"2025-04-06T02:09:06.080Z","repository":{"id":47898392,"uuid":"167238804","full_name":"DaveWM/willa","owner":"DaveWM","description":"A Clojure DSL for Kafka Streams","archived":false,"fork":false,"pushed_at":"2023-01-04T12:17:20.000Z","size":119,"stargazers_count":138,"open_issues_count":2,"forks_count":13,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-30T01:07:32.224Z","etag":null,"topics":["clojure","kafka","kafka-streams"],"latest_commit_sha":null,"homepage":null,"language":"Clojure","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DaveWM.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-01-23T19:10:56.000Z","updated_at":"2024-05-31T07:44:57.000Z","dependencies_parsed_at":"2023-02-02T12:02:18.765Z","dependency_job_id":null,"html_url":"https://github.com/DaveWM/willa","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DaveWM%2Fwilla","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DaveWM%2Fwilla/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DaveWM%2Fwilla/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DaveWM%2Fwilla/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DaveWM","download_url":"https://codeload.github.com/DaveWM/willa/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247423515,"owners_count":209366
26,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clojure","kafka","kafka-streams"],"created_at":"2025-02-20T16:26:18.471Z","updated_at":"2025-04-06T02:09:06.057Z","avatar_url":"https://github.com/DaveWM.png","language":"Clojure","funding_links":[],"categories":[],"sub_categories":[],"readme":"# willa [![CircleCI](https://circleci.com/gh/DaveWM/willa.svg?style=svg)](https://circleci.com/gh/DaveWM/willa) [![Clojars Project](https://img.shields.io/clojars/v/willa.svg)](https://clojars.org/willa)\n\nWilla provides a data-driven DSL on top of\nthe [Kafka Streams DSL](https://docs.confluent.io/current/streams/developer-guide/dsl-api.html), inspired\nby [Onyx](http://www.onyxplatform.org). It uses [Jackdaw](https://github.com/FundingCircle/jackdaw) under the hood.\n\nWilla is named after [Willa Muir](https://en.wikipedia.org/wiki/Willa_Muir), who translated Kafka's \"The Metamorphosis\".\nHer husband Edwin was also involved, but [apparently he \"only helped\"](https://en.wikipedia.org/wiki/Willa_Muir).\n\n## Rationale\n\nThe Kafka Streams DSL is very \"Javaish\".\nIt uses a `KStreamsBuilder` object to build topologies, which operates by in-place mutation.\nThis has all the [usual disadvantages](https://clojure.org/about/state#_object_oriented_programming_oo) of mutability,\nincluding making topologies difficult to compose, test, and visualise.\nThe built topology is represented as a `ProcessorTopology` object, which can theoretically be used to manipulate the\ntopology. 
However, it is extremely difficult to work with in practice - not least because the `ProcessorTopology` class\nisn't documented.\nThe `KStreamsBuilder` API also re-implements many of the stateless transformation functions from the core Clojure\nlibrary (`map`, `filter`, `mapcat`, etc.), encouraging needless code duplication.\n\nWilla aims to provide an immutable, data-driven DSL (inspired by [Onyx](http://www.onyxplatform.org)) on top of the\nKafka Streams DSL.\nIt represents all aspects of your topology as Clojure data structures and functions.\nThis makes topologies far easier to manipulate and compose. For example, if you want to log every message that is\npublished to output topics,\nyou can write a generic pure function to transform a Willa topology to achieve this.\nIt also enables you to visualise your topology using GraphViz, which is very useful for reasoning about how a topology\nworks, and also for documentation purposes.\n\nWilla uses transducers for stateless transformations, rather than a separate API like that of the `KStreamsBuilder`.\nTransducers are far more composable, and allow you to re-use code far more effectively.\nThey also enable you to test your transformation logic completely independently of Kafka (and Willa).\n\nWilla also provides a mechanism for experimenting with a topology from the REPL, and seeing how data flows through it.\nIt can also be used for unit testing. This mechanism is similar in scope to Kafka's `TopologyTestDriver`, but has a few\nadvantages:\n\n1. It gives you the output data of each individual `KStream`/`KTable`/topic within your topology, instead of just the\n   data on the output topics.\n2. It enables you to visualise the data flow using GraphViz.\n3. It is faster, and doesn't persist anything on disk.\n\n## Getting Started\n\nWilla represents your topology as a map, containing 3 keys:\n\n* `:entities` - an entity is a map containing information about a topic, `KStream`, or `KTable`. 
The `:entities` map is\n  a map of identifier to entity.\n* `:workflow` - a vector of tuples of `[input-entity-id output-entity-id]`, similar to\n  a [workflow in Onyx](http://www.onyxplatform.org/docs/cheat-sheet/latest/#job/:workflow).\n* `:joins` - this is a map representing all the joins/merges in your topology as data. It is a map of a vector of entity\n  names involved in the join, to a join config.\n\nThis may sound confusing, but let's try to clear things up with a simple example.\nBefore we start, make sure you have a Kafka broker running locally, either using\nthe [Confluent distribution](https://www.confluent.io/product/confluent-platform)\nor [Landoop's fast-data-dev docker image](https://github.com/Landoop/fast-data-dev).\nAlso, if you don't have an existing application, create one by running `lein new my-cool-app`.\n\nSay we want a topology that simply reads messages from an input topic, increments the value, then writes to an output\ntopic.\nThe topology would look like this:\n\n![Simple Topology](resources/simple-topology.png)\n\nStart by adding willa to your `project.clj` or `deps.edn`. Check the latest version on Clojars or in the badge at the\ntop of this readme.\n\nNext, we'll require some necessary namespaces:\n\n```clojure\n(ns my-cool-app.core\n  (:require [jackdaw.streams :as streams]\n    [jackdaw.serdes.edn :as serdes.edn]\n    [willa.core :as w]))\n```\n\nWe then create the workflow like so:\n\n```clojure\n(def workflow\n  [[:input-topic :increment-stream]\n   [:increment-stream :output-topic]])\n```\n\nYou can see that data will flow from the `:input-topic` to the `:increment-stream`, then from `:increment-stream` to\nthe `:output-topic`.\nNow we need to tell Willa what exactly the `:input-topic`, `:increment-stream` and `:output-topic` entities are.\nTo do this, we'll create the entity config map. 
It looks like this:\n\n```clojure\n(def entities\n  {:input-topic      {::w/entity-type     :topic\n                      :topic-name         \"input-topic\"\n                      :replication-factor 1\n                      :partition-count    1\n                      :key-serde          (serdes.edn/serde)\n                      :value-serde        (serdes.edn/serde)}\n   :increment-stream {::w/entity-type   :kstream\n                      :willa.core/xform (map (fn [[k v]] [k (inc v)])) ;; Note that the mapping function expects a key-value tuple\n                      }\n   :output-topic     {::w/entity-type     :topic\n                      :topic-name         \"output-topic\"\n                      :replication-factor 1\n                      :partition-count    1\n                      :key-serde          (serdes.edn/serde)\n                      :value-serde        (serdes.edn/serde)}})\n```\n\nThat's all the data Willa needs to build your topology! To get our topology up and running, we'll follow these steps:\n\n1. Create a `KStreamsBuilder` object\n2. Call the `willa.core/build-topology!` function, passing it the builder, workflow, and entities\n3. Create a `KafkaStreams` object from the builder\n4. Call `start` on it\n\nThe code looks like this:\n\n```clojure\n(def app-config\n  {\"application.id\"            \"my-cool-app\"\n   \"bootstrap.servers\"         \"localhost:9092\"\n   \"cache.max.bytes.buffering\" \"0\"})\n\n(def topology\n  {:workflow workflow\n   :entities entities})\n\n(defn start! []\n  (let [builder (doto (streams/streams-builder) ;; step 1\n                  (w/build-topology! 
topology)) ;; step 2\n        kstreams-app (streams/kafka-streams builder app-config) ;; step 3\n        ]\n    (streams/start kstreams-app) ;; step 4\n    kstreams-app))\n```\n\nYou can verify that it works by running the following commands in your repl:\n\n```clojure\n(require 'jackdaw.client\n         'jackdaw.admin\n         'willa.streams)\n\n(def admin-client (jackdaw.admin/-\u003eAdminClient app-config))\n;; create the input and output topics\n(jackdaw.admin/create-topics! admin-client [(:input-topic entities) (:output-topic entities)])\n\n;; start the topology\n(def kstreams-app (start!))\n\n;; create a Kafka Producer, and produce a message with value 1 to the input topic\n(def producer (jackdaw.client/producer app-config\n                                       willa.streams/default-serdes))\n@(jackdaw.client/send! producer (jackdaw.data/-\u003eProducerRecord (:input-topic entities) \"key\" 1))\n\n;; create a Kafka Consumer, and consume everything from the output topic                                     \n(def consumer (jackdaw.client/consumer (assoc app-config \"group.id\" \"consumer\")\n                                       willa.streams/default-serdes))\n(jackdaw.client/subscribe consumer [(:output-topic entities)])\n(jackdaw.client/seek-to-beginning-eager consumer)\n\n;; should return something like: [{:key \"key\" :value 2}] \n(-\u003e\u003e (jackdaw.client/poll consumer 200)\n     (map #(select-keys % [:key :value])))                                           \n```\n\n## Going Further\n\nOne of the cool features of Willa is that you can visualise your topology.\nTo do this, first make sure you have [graphviz installed](https://bit.ly/2MPXzSO),\nthen run these commands in your repl:\n\n```clojure\n(require 'willa.viz)\n\n(willa.viz/view-topology topology)\n```\n\nA diagram of your topology should pop up in a separate window.\n\nYou can also use Willa to experiment with your Topology.\nFor instance, you might want to know what would happen if you 
receive a message with value `1`.\nTo do this, we'll use the `run-experiment` function in the `willa.experiment` namespace.\nThis function takes a `topology` and a map of entity id to records.\nEach record must contain the `:key`, `:value`, and `:timestamp` keys.\nThe code looks like this:\n\n```clojure\n(require 'willa.experiment)\n\n(def experiment-results\n  ;; should return the topology map, but with each entity updated with a :willa.experiment/output key\n  (willa.experiment/run-experiment topology\n                                   {:input-topic [{:key       \"some-key\"\n                                                   :value     1\n                                                   :timestamp 0}]}))\n\n;; you can now visualise how data flows through the topology in a diagram\n(willa.viz/view-topology experiment-results)                                              \n```\n\n## Reference\n\n### Entity Config\n\n| Key                                   | Required? | Valid Entity Types | Description                                                                                                                                                                                                                                                                                                                                                                                                              |\n|---------------------------------------| --- | --- |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `:willa.core/entity-type`             | ☑ | All | The type of the entity. 
Can be one of: `:topic`, `:kstream`, `:ktable`, or `:global-ktable`                                                                                                                                                                                                                                                                                                                              | \n| `:topic-name`                         | ☑ | `:topic` | The name of the topic                                                                                                                                                                                                                                                                                                                                                                                                    |\n| `:key-serde`                          | ☑ | `:topic` | The serde to use to serialize/deserialize the keys of records on the topic                                                                                                                                                                                                                                                                                                                                               |\n| `:value-serde`                        | ☑ | `:topic` | The serde to use to serialize/deserialize the values of records on the topic                                                                                                                                                                                                                                                                                                                                             |\n| `:willa.core/xform`                   | ❌ | `:kstream` | A transducer to apply to the `KStream`                                                                                                           
                                                                                                                                                                                                                                                                        |\n| `:willa.core/group-by-fn`             | ❌ | `:ktable` | A function which takes a key-value pair, and returns the key of the group. If this key is present, `:willa.core/aggregate-adder-fn` and `:willa.core/aggregate-initial-value` must also be provided.                                                                                                                                                                                                                     |\n| `:willa.core/window`                  | ❌ | `:ktable` | The windowing to apply after grouping the input records. Should be either a [Windows](https://kafka.apache.org/20/javadoc/org/apache/kafka/streams/kstream/Windows.html) or a [SessionWindows](https://kafka.apache.org/20/javadoc/org/apache/kafka/streams/kstream/SessionWindows.html) object. If this key is present, `:willa.core/group-by-fn` must also be provided. Will cause the input to be coerced to a `KStream`. |\n| `:willa.core/aggregate-initial-value` | ❌ | `:ktable` | The initial value to use in an aggregation. 
Must be provided if `:willa.core/aggregate-adder-fn` is present                                                                                                                                                                                                                                                                                                              |\n| `:willa.core/aggregate-adder-fn`      | ❌ | `:ktable` | The aggregator function if the input is a `KStream`, or the [\"adder\" function](https://kafka.apache.org/20/javadoc/org/apache/kafka/streams/kstream/KGroupedTable.html#aggregate-org.apache.kafka.streams.kstream.Initializer-org.apache.kafka.streams.kstream.Aggregator-org.apache.kafka.streams.kstream.Aggregator) if it is a `KTable`. If this key is present, `:willa.core/group-by` must also be provided.        |\n| `:willa.core/aggregate-subtractor-fn` | ❌ | `:ktable` | The aggregate [\"subtractor\" function](https://kafka.apache.org/20/javadoc/org/apache/kafka/streams/kstream/KGroupedTable.html#aggregate-org.apache.kafka.streams.kstream.Initializer-org.apache.kafka.streams.kstream.Aggregator-org.apache.kafka.streams.kstream.Aggregator-), only valid if the input is a `KTable`. If this key is present, `:willa.core/group-by` must also be provided.                             |\n| `:willa.core/suppression`             | ❌ | `:ktable` | A [Suppressed](https://docs.confluent.io/current/streams/javadocs/org/apache/kafka/streams/kstream/Suppressed.html) object that determines how updates to the `KTable` are emitted. See [the Kafka Streams docs](https://docs.confluent.io/current/streams/javadocs/org/apache/kafka/streams/kstream/KTable.html#suppress-org.apache.kafka.streams.kstream.Suppressed-) for more info                                    |\n| `:willa.core/store-name`              | ❌ | `:ktable` | The local state store name to use for the KTable.                                                                                                               
                                                                                                                                                                                                                                                         |\n| `:willa.overrides/prevent-repartition` | ❌ | `:kstream` | Set to `true` to prevent a repartition of the input topic. This prevents the provided `xform` from changing the message key. Setting this override will cause problems if the `xform` _does_ change the message key - use with caution!                                                                                                                                                                                  |\n\n### Join Config\n\n| Key | Description |\n| --- | --- |\n| `:willa.core/join-type` | The type of the join. Can be one of `:merge`, `:left`, `:inner` or `:outer`. |\n| `:willa.core/window` | A [JoinWindows](https://kafka.apache.org/20/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html) object that specifies the windowing to be used in the join. Not used when the join type is `:merge` or when joining a `:kstream` and a `:global-ktable` |     \n| `:willa.core/kv-mapper` | The [kv-mapper function](https://kafka.apache.org/20/javadoc/org/apache/kafka/streams/kstream/KeyValueMapper.html) to use when joining a `:kstream` with a `:global-ktable`. Extracts a key of the GlobalKTable from each `[k v]` of the stream. If not specified, the key of the stream is used. Not used when joining other kinds of objects. 
See [the Kafka Streams docs](https://kafka.apache.org/20/javadoc/org/apache/kafka/streams/kstream/KStream.html#join-org.apache.kafka.streams.kstream.GlobalKTable-org.apache.kafka.streams.kstream.KeyValueMapper-org.apache.kafka.streams.kstream.ValueJoiner-) for more info |     \n\n## License\n\nThis program and the accompanying materials are made available under the\nterms of the GPL V3 license, which is available at https://www.gnu.org/licenses/gpl-3.0.en.html.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavewm%2Fwilla","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdavewm%2Fwilla","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavewm%2Fwilla/lists"}