{"id":20569528,"url":"https://github.com/novolabs/gregor","last_synced_at":"2025-04-14T16:35:25.753Z","repository":{"id":62433854,"uuid":"165867579","full_name":"NovoLabs/gregor","owner":"NovoLabs","description":"Clojure interface to Kafka","archived":false,"fork":false,"pushed_at":"2020-01-07T21:33:26.000Z","size":118,"stargazers_count":7,"open_issues_count":1,"forks_count":0,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-11-15T16:38:06.236Z","etag":null,"topics":["clojure","gregor","kafka","stream","streaming"],"latest_commit_sha":null,"homepage":null,"language":"Clojure","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NovoLabs.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-01-15T14:44:55.000Z","updated_at":"2023-07-17T05:57:24.000Z","dependencies_parsed_at":"2022-11-01T21:01:12.078Z","dependency_job_id":null,"html_url":"https://github.com/NovoLabs/gregor","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NovoLabs%2Fgregor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NovoLabs%2Fgregor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NovoLabs%2Fgregor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NovoLabs%2Fgregor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NovoLabs","download_url":"https://codeload.github.com/NovoLabs/gregor/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositori
es_count":224876884,"owners_count":17384703,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clojure","gregor","kafka","stream","streaming"],"created_at":"2024-11-16T05:08:40.131Z","updated_at":"2024-11-16T05:08:40.933Z","avatar_url":"https://github.com/NovoLabs.png","language":"Clojure","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Gregor\n\n[![PyPI](https://img.shields.io/pypi/l/Django.svg?style=plastic)]()\n[![CircleCI](https://circleci.com/gh/NovoLabs/gregor/tree/master.svg?style=svg)](https://circleci.com/gh/NovoLabs/gregor/tree/master)\n\n\n\u003e As Gregor Samsa awoke one morning from uneasy dreams he found himself transformed in his bed into a gigantic insect.\n\n― Franz Kafka, _The Metamorphosis_\n\n## Description\n\nGregor provides a channel-based API for asynchronously producing and consuming messages from [Apache Kafka](https://kafka.apache.org/).  Through the use of transducers, Gregor allows the user to transform data during production or consumption or both.\n\nGregor provides a control channel for interacting with the producer and consumer APIs.  Though it is not a complete implementation of the current Kafka producer and consumer objects, the current set of operations supports most use cases.  Additional control operations will be implemented as necessary.\n\n## Inspirations\n\nGregor was inspired by [kinsky](https://github.com/pyr/kinsky) and [ring](https://github.com/ring-clojure).\n\n## Dependencies\n\nSince Gregor is an interface to Kafka, using it requires a Kafka instance.  
To help you get up and running quickly, Gregor provides a `docker-compose.yaml` file that can be used to start up a local instance of Kafka.  Assuming you have [`docker-compose`](https://docs.docker.com/compose/install/) installed, you can copy [Gregor's docker compose file](https://github.com/NovoLabs/gregor/blob/master/docker/docker-compose.yaml) from GitHub to your local machine.\n\nOnce you copy Gregor's `docker-compose.yaml` file to your local machine (and `docker-compose` is installed), you can start Kafka with the following command:\n\n```shell\n$ docker-compose up\n```\n\n## Installation\n\nTo install Gregor, add the following to your Leiningen `:dependencies` vector:\n\n```clojure\n[novolabsoss/gregor \"0.1.0\"]\n```\n\n## TL;DR Examples\n\nIf you want to get started quickly, here are some self-contained examples that you can copy-pasta into your REPL.  Please note that the use of `:auto.offset.reset` is *not* recommended for production code.  It is used here to prevent lag in the consumer connection from causing the messages created by the producer to be skipped.\n\n### Simple Message Production and Consumption\n\n```clojure\n(require '[clojure.core.async :as a])\n(require '[gregor.consumer :as c])\n(require '[gregor.producer :as p])\n\n;; Create a consumer\n(def consumer (c/create {:output-policy #{:data :control :error}\n                         :kafka-configuration {:bootstrap.servers \"localhost:9092\"\n                                               :group.id \"gregor.consumer.test\"\n                                               :auto.offset.reset \"earliest\"}\n                         :topics :gregor.test}))\n\n;; Create a producer\n(def producer (p/create {:output-policy #{:error}\n                         :kafka-configuration {:bootstrap.servers \"localhost:9092\"}}))\n\n;; Bind the input channel of the producer to `in-ch`\n(def in-ch (:in-ch producer))\n\n;; Bind the output channel of the consumer to `out-ch`\n(def out-ch (:out-ch 
consumer))\n\n;; Create a go-loop to print messages received on the output channel of the consumer\n(a/go-loop []\n  (when-let [msg (a/\u003c! out-ch)]\n    (println (pr-str msg))\n    (recur)))\n\n;; Post 2 messages to the input channel of the producer\n(a/\u003e!! in-ch {:topic :gregor.test :message-value {:a 1 :b 2}})\n(a/\u003e!! in-ch {:topic :gregor.test :message-value {:a 3 :b 4}})\n\n;; Close the producer\n(a/\u003e!! (:ctl-ch producer) {:op :close})\n\n;; Close the consumer\n(a/\u003e!! (:ctl-ch consumer) {:op :close})\n```\n\n### Producer Transducer\n\n```clojure\n(require '[clojure.core.async :as a])\n(require '[gregor.consumer :as c])\n(require '[gregor.producer :as p])\n\n;; Function to add a `:producer-timestamp`\n(defn add-timestamp\n  [m]\n  (assoc m :producer-timestamp (.toEpochMilli (java.time.Instant/now))))\n\n;; Function to set the `:message-value`\n(defn add-message-value\n  [m]\n  (-\u003e\u003e (dissoc m :uuid)\n       (assoc m :message-value)))\n\n;; Function to set the `:message-key`\n(defn add-message-key\n  [{:keys [uuid] :as m :or {uuid (java.util.UUID/randomUUID)}}]\n  (-\u003e (assoc m :message-key {:uuid uuid})\n      (dissoc :uuid)))\n\n;; Function to set the `:topic`\n(defn add-topic\n  [topic m]\n  (assoc m :topic topic))\n\n;; Compose the transducer which will reshape the message into something\n;; that the Gregor producer understands\n(def transducer (comp (map add-timestamp)\n                      (map add-message-value)\n                      (map add-message-key)\n                      (map (partial add-topic :gregor.test.send))))\n\n;; Create a consumer subscribed to the topic that the transducer targets\n(def consumer (c/create {:output-policy #{:data :control :error}\n                         :kafka-configuration {:bootstrap.servers \"localhost:9092\"\n                                               :group.id \"gregor.consumer.test\"\n                                               :auto.offset.reset \"earliest\"}\n                         :topics :gregor.test.send}))\n\n;; 
Create a producer with the transducer from above\n(def producer (p/create {:output-policy #{:error}\n                         :kafka-configuration {:bootstrap.servers \"localhost:9092\"}\n                         :transducer transducer}))\n\n;; Bind the input channel of the producer to `in-ch`\n(def in-ch (:in-ch producer))\n\n;; Bind the output channel of the consumer to `out-ch`\n(def out-ch (:out-ch consumer))\n\n;; Post 2 messages to the input channel of the producer.  Note that they\n;; are not currently of the correct shape.  The transducer on the input\n;; channel will handle this for us\n(a/\u003e!! in-ch {:a 1 :b 2})\n(a/\u003e!! in-ch {:a 3 :b 4})\n\n;; Close the producer\n(a/\u003e!! (:ctl-ch producer) {:op :close})\n\n;; Close the consumer\n(a/\u003e!! (:ctl-ch consumer) {:op :close})\n```\n\n## Usage\n\nGregor provides 2 public namespaces: `gregor.consumer` for creating and interacting with a KafkaConsumer and `gregor.producer` for creating and interacting with a KafkaProducer.\n\n### Creating a Consumer\n\nTo create a consumer, we first need to pull the consumer namespace into our REPL:\n\n```clojure\n(require '[gregor.consumer :as c])\n```\n\nAssuming we have Kafka running on port `9092` of our local machine (the default location if we used the `docker-compose.yaml` file), we can create a consumer connection using the following code:\n\n```clojure\n(def consumer (c/create {:output-policy #{:data :control :error}\n                         :topics :gregor.test\n                         :kafka-configuration {:bootstrap.servers \"localhost:9092\"\n                                               :group.id \"gregor.consumer.test\"}}))\n;; =\u003e #'user/consumer\n```\n\nLet's take a closer look at the configuration map passed to `gregor.consumer/create`:\n\n**`:output-policy`**\n\nThe value of `:output-policy` should be a set containing the types of events we want published to `out-ch`.  
There are four types of events that Gregor supports:\n\n* `:data` - Data events are generated by messages read from the configured topic or topics.  This event type is always included in a consumer's `:output-policy`, regardless of what is specified in the user-supplied `:output-policy`.  Without data events, the consumer would not be very useful.\n* `:control` - Control events are generated by the output of control operations sent to the control channel.  We will talk more about the control channel and the supported operations below.\n* `:error` - Error events are generated when errors occur, most commonly when an exception is thrown or invalid control operations are passed to the control channel.\n* `:eof` - An EOF event is sent when the consumer is closed by invoking the `:close` operation.  Like `:data` events, the `:eof` event is always included as part of the `:output-policy` of a consumer.\n\n**`:topics`**\n\nThe value of `:topics` can be a string, a keyword, a vector of strings and keywords or a regular expression:\n\n* A string or keyword will subscribe the consumer to a single topic.  The string or keyword should be a [valid Kafka topic](https://stackoverflow.com/questions/37062904/what-are-apache-kafka-topic-name-limitations).\n* A vector of strings and keywords will subscribe the consumer to each topic in the vector.  The vector can contain both strings and keywords.\n* A regular expression will subscribe the consumer to all topics matching the regular expression.\n\nYou can check out the [KafkaConsumer documentation](https://kafka.apache.org/21/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html) for more information.\n\n**`:kafka-configuration`**\n\nThe value of `:kafka-configuration` should be a map containing Kafka configuration options.  This map will be converted to a Java properties map and passed directly to the KafkaConsumer object during initialization.  
The minimum requirements for consumer initialization are:\n\n* `:bootstrap.servers` - A comma-delimited string containing `host:port` pairs indicating the location of the Kafka servers.\n* `:group.id` - A string containing the group id of the consumer, which Kafka uses to track offsets and assign partitions.\n\nYou can check out the [Kafka consumer configuration documentation](https://kafka.apache.org/documentation/#consumerconfigs) for a full treatment of all of the available configuration options.\n\n### Consumer Control Operations\n\nAssuming correct configuration, the result of calling `gregor.consumer/create` is a map containing 2 keys:\n\n* `:out-ch` - The channel that receives all events, including `:data`, `:control`, `:error` and `:eof`.\n* `:ctl-ch` - The channel that is used to send control operations.\n\nFor the sake of convenience, let's bind `out-ch` to the output channel and `ctl-ch` to the control channel:\n\n```clojure\n(def out-ch (:out-ch consumer))\n;; =\u003e #'user/out-ch\n\n(def ctl-ch (:ctl-ch consumer))\n;; =\u003e #'user/ctl-ch\n```\n\nNow, let's check that we have been properly subscribed to the `gregor.test` topic.  We can query for the list of current subscriptions using the `:subscriptions` control operation.  The output will be written to `out-ch` as a `:control` event.  Here is the code:\n\n```clojure\n(require '[clojure.core.async :as a])\n\n(a/\u003e!! ctl-ch {:op :subscriptions})\n;; =\u003e true\n\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e {:op :subscriptions, :subscriptions [\"gregor.test\"], :event :control}\n```\n\nThis is a common pattern in Gregor: the operation is submitted to the control channel and any results are pushed to the output channel as a `:control` event.\n\nLet's look at each control operation that the consumer supports:\n\n**`:subscribe`**\n\nThe `:subscribe` operation is used to subscribe to one or more topics.  
Since Gregor requires that the initial topic(s) be passed in during initialization, the `:subscribe` operation will generally only be used if we wish to change the subscription of our consumer.  Should you need to do this, please remember that per the [KafkaConsumer documentation](https://kafka.apache.org/21/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html), **topic subscriptions are not incremental**.  If you wish to remain subscribed to the current list of topics, you will need to include these along with any additions when you call the `:subscribe` operation.\n\nThe code to call `:subscribe` looks like this:\n\n```clojure\n;; Subscribe to `:gregor.test.2`\n(a/\u003e!! ctl-ch {:op :subscribe :topics :gregor.test.2})\n;; =\u003e true\n\n;; Print the output from the `:subscribe` control operation\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e {:op :subscribe, :topics :gregor.test.2, :event :control}\n\n;; Query for the list of current subscriptions using the `:subscriptions` operation\n(a/\u003e!! ctl-ch {:op :subscriptions})\n;; =\u003e true\n\n;; Print the output of the `:subscriptions` operation, noting that we are\n;; now subscribed to the \"gregor.test.2\" topic\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e {:op :subscriptions, :subscriptions [\"gregor.test.2\"], :event :control}\n\n;; Switch subscription back to \"gregor.test\"\n(a/\u003e!! ctl-ch {:op :subscribe :topics :gregor.test})\n;; =\u003e true\n\n;; Print the output of the `:subscribe` command\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e {:op :subscribe, :topics :gregor.test, :event :control}\n```\n\n**`:subscriptions`**\n\nThe `:subscriptions` operation, as we have seen, is used to get the list of topics that the consumer is currently subscribed to.  
The code is repeated here for the sake of completeness:\n\n```clojure\n;; Query for the list of current subscriptions using the `:subscriptions` operation\n(a/\u003e!! ctl-ch {:op :subscriptions})\n;; =\u003e true\n\n;; Print the output of the `:subscriptions` operation\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e {:op :subscriptions, :subscriptions [\"gregor.test\"], :event :control}\n```\n\nIf the consumer is not subscribed to any topics, the `:subscriptions` operation will return an empty vector.\n\n**`:unsubscribe`**\n\nThe `:unsubscribe` operation will unsubscribe the consumer from all current topic subscriptions.  It can be invoked using the following code:\n\n```clojure\n;; Unsubscribe from the current topic\n(a/\u003e!! ctl-ch {:op :unsubscribe})\n;; =\u003e true\n\n;; Print the output of the `:unsubscribe` operation\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e {:op :unsubscribe, :event :control}\n```\n\nIf you are switching from a name-based subscription (i.e. strings or keywords) to a pattern-based subscription (i.e. a regular expression), or vice versa, you must unsubscribe first.  If you want to change from one name-based subscription to another or one pattern-based subscription to another, you do not need to unsubscribe.\n\n**`:partitions-for`**\n\nThe `:partitions-for` operation will fetch the partition information for the specified topic.  You can use the following code to test `:partitions-for` for the `\"gregor.test\"` topic:\n\n```clojure\n;; Query the partitions for the `:gregor.test` topic\n(a/\u003e!! ctl-ch {:op :partitions-for :topic :gregor.test})\n;; =\u003e true\n\n;; Print the output of the `:partitions-for` operation for the `:gregor.test` topic\n(-\u003e (a/\u003c!! 
out-ch) pr-str println)\n;; =\u003e {:op :partitions-for,\n;;     :topic :gregor.test,\n;;     :partitions [{:type-name :partition-info,\n;;                   :isr [{:type-name :node, :host \"127.0.0.1\", :id 1, :port 9092}],\n;;                   :offline [],\n;;                   :leader {:type-name :node, :host \"127.0.0.1\", :id 1, :port 9092},\n;;                   :partition 0,\n;;                   :replicas [{:type-name :node, :host \"127.0.0.1\", :id 1, :port 9092}],\n;;                   :topic \"gregor.test\"}],\n;;     :event :control}\n```\n\n**`:commit`**\n\nThe `:commit` operation will commit the offsets from the last call to poll for the subscribed list of topics.  The default value of the `enable.auto.commit` property in Kafka is `true`, which means that as messages are consumed, the offset will be committed automatically (the interval of auto commits is controlled by the `auto.commit.interval.ms` property, which has a default value of `5000` milliseconds).  If you set `enable.auto.commit` to `false` when you create your consumer, you will need to manually commit consumed offsets, which can be done with the following code:\n\n```clojure\n;; Commit the last consumed offset for this consumer\n(a/\u003e!! ctl-ch {:op :commit})\n;; =\u003e true\n\n;; Verify that the `:commit` operation was processed successfully\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e {:op :commit, :event :control}\n```\n\nLeaving `enable.auto.commit` set to the default value of `true` is sufficient for most use cases.  Read [this Medium article](https://medium.com/@danieljameskay/understanding-the-enable-auto-commit-kafka-consumer-property-12fa0ade7b65) for more information about auto committing and offsets.\n\n**`:close`**\n\nThe `:close` operation will close the consumer, including all associated channels.  The following code will close the consumer we have been working with:\n\n```clojure\n;; Close the consumer\n(a/\u003e!! 
ctl-ch {:op :close})\n;; =\u003e true\n\n;; Verify the `:close` operation was processed\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e {:op :close, :event :control}\n\n;; Should receive `:eof` event as the final event indicating the end of the stream/channel\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e {:event :eof}\n\n;; Further reads result in `nil` as the `out-ch` is closed\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e nil\n\n;; Attempts to write to the `ctl-ch` will fail, returning `false`\n(a/\u003e!! ctl-ch {:op :subscriptions})\n;; =\u003e false\n```\n\nNot only was the `KafkaConsumer` object closed, but so were the `out-ch` and `ctl-ch` channels.  Since `out-ch` is a standard `core.async` channel, all messages that were read from the consumer up to the point of the `:close` operation will be available.  Once `out-ch` is empty, it will return `nil` for all future reads.  Any attempts to write to `ctl-ch` will fail, returning `false` as shown above.\n\nNote that the final message delivered from `out-ch` was the `:eof` event.  Posting an `:eof` event is Gregor's way of telling the user that the stream has been closed.\n\n### Creating a Producer\n\nThe other half of the equation is the producer.  Gregor provides a namespace called `gregor.producer` for creating and interacting with a KafkaProducer object.  The following code can be used to create a producer:\n\n```clojure\n(require '[gregor.producer :as p])\n\n(def producer (p/create {:output-policy #{:data :control :error}\n                         :kafka-configuration {:bootstrap.servers \"localhost:9092\"}}))\n;; =\u003e #'user/producer\n```\n\nAs with the consumer, the configuration map we passed to `gregor.producer/create` is worth a closer look:\n\n**`:output-policy`**\n\nAs with the consumer, the `:output-policy` is a set containing the types of events we want published to `out-ch`.  There are four types, the same as with the consumer.  
However, the requirements and meanings are different in the context of a producer:\n\n* `:data` - Data events are generated by serializing the result of the call to the producer's `.send` function.  Note that serializing the result of `.send` is synchronous and, as such, including `:data` as part of a producer's output policy may affect throughput.\n* `:control` - Similar to the consumer, control events are generated by the output of control operations sent to the control channel.\n* `:error` - Similar to the consumer, error events are generated when errors occur, most commonly when an exception is thrown or invalid control operations are passed to the control channel.\n* `:eof` - EOF events are sent when the producer is closed by invoking the `:close` control operation.  The `:eof` event is always included as part of the `:output-policy` of a producer.\n\n**`:kafka-configuration`**\n\nThe value of `:kafka-configuration` should be a map containing Kafka configuration options.  This map will be converted to a Java properties map and passed directly to the KafkaProducer object during initialization.  
The minimum requirements for producer initialization are:\n\n* `:bootstrap.servers` - A comma-delimited string containing `host:port` pairs indicating the location of the Kafka servers.\n\nYou can check out the [Kafka producer configuration documentation](https://kafka.apache.org/documentation/#producerconfigs) for a full treatment of all of the available configuration options.\n\n### Producer Control Operations\n\nAssuming correct configuration, the result of calling `gregor.producer/create` is a map containing 3 keys:\n\n* `:in-ch` - The channel used to send `:data` events to the Kafka producer\n* `:out-ch` - The channel that receives all events, including `:data`, `:control`, `:error` and `:eof`\n* `:ctl-ch` - The channel used to send control operations\n\nFor the sake of convenience, let's bind `out-ch` to the output channel and `ctl-ch` to the control channel:\n\n```clojure\n(def out-ch (:out-ch producer))\n;; =\u003e #'user/out-ch\n\n(def ctl-ch (:ctl-ch producer))\n;; =\u003e #'user/ctl-ch\n```\n\nThe producer currently supports three control operations:\n\n**`:partitions-for`**\n\nThe `:partitions-for` operation will fetch the partition information for the specified topic.  You can use the following code to test `:partitions-for` for the `\"gregor.test\"` topic:\n\n```clojure\n;; Query the partitions for the `:gregor.test` topic\n(a/\u003e!! ctl-ch {:op :partitions-for :topic :gregor.test})\n;; =\u003e true\n\n;; Print the output of the `:partitions-for` operation for the `:gregor.test` topic\n(-\u003e (a/\u003c!! 
out-ch) pr-str println)\n;; =\u003e {:op :partitions-for,\n;;     :topic :gregor.test,\n;;     :partitions [{:type-name :partition-info,\n;;                   :isr [{:type-name :node, :host \"127.0.0.1\", :id 1, :port 9092}],\n;;                   :offline [],\n;;                   :leader {:type-name :node, :host \"127.0.0.1\", :id 1, :port 9092},\n;;                   :partition 0,\n;;                   :replicas [{:type-name :node, :host \"127.0.0.1\", :id 1, :port 9092}],\n;;                   :topic \"gregor.test\"}],\n;;     :event :control}\n```\n\n**`:flush`**\n\nInvoking the `:flush` operation will make all buffered records immediately available to send.  Note that `:flush` will cause the producer thread to block until all buffered records have been sent.  You can invoke `:flush` with the following code:\n\n```clojure\n;; Invoke flush via the control channel\n(a/\u003e!! ctl-ch {:op :flush})\n;; =\u003e true\n\n;; Print the output of the `:flush` operation\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e {:op :flush, :event :control}\n```\n\n**`:close`**\n\nThe `:close` operation will close the producer, including all associated channels.  The following code will close the producer we have been working with:\n\n```clojure\n;; Close the producer\n(a/\u003e!! ctl-ch {:op :close})\n;; =\u003e true\n\n;; Verify the `:close` operation was processed\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e {:op :close, :event :control}\n\n;; Should receive `:eof` event as the final event indicating the end of the stream/channel\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e {:event :eof}\n\n;; Further reads result in `nil` as the `out-ch` is closed\n(-\u003e (a/\u003c!! out-ch) pr-str println)\n;; =\u003e nil\n\n;; Attempts to write to the `ctl-ch` will fail, returning `false`\n(a/\u003e!! ctl-ch {:op :flush})\n;; =\u003e false\n\n;; Attempts to write to the `in-ch` will fail, returning `false`\n(a/\u003e!! 
(:in-ch producer) {:foo :bar})\n;; =\u003e false\n```\n\nNot only was the `KafkaProducer` object closed, but so were the `in-ch`, `out-ch` and `ctl-ch` channels.  Since `out-ch` is a standard `core.async` channel, all events that the producer published up to the point of the `:close` operation will be available.  Once `out-ch` is empty, it will return `nil` for all future reads.  Any attempts to write to `in-ch` or `ctl-ch` will fail, returning `false` as shown above.\n\nNote that the final message delivered from `out-ch` was the `:eof` event.  Posting an `:eof` event is Gregor's way of telling the user that the stream has been closed.\n\n### Sending and Receiving Data\n\nNow that we have seen how to create and work with a consumer and producer in isolation, it is time for them to work together to send some data across a Kafka topic.  If you have not already done so, close the consumer and producer that you created above.  You can use the following code to do so:\n\n```clojure\n;; Close the consumer\n(-\u003e (:ctl-ch consumer) (a/\u003e!! {:op :close}))\n;; =\u003e true\n\n;; Close the producer\n(-\u003e (:ctl-ch producer) (a/\u003e!! {:op :close}))\n;; =\u003e true\n```\n\nOnce the old consumer and producer are closed, we can create new instances of a consumer and a producer which we will use to send a message via Kafka:\n\n```clojure\n;; Create a consumer, subscribed to the `:gregor.test.send` topic\n(def consumer (c/create {:output-policy #{:data :control :error}\n                         :topics :gregor.test.send\n                         :kafka-configuration {:bootstrap.servers \"localhost:9092\"\n                                               :group.id \"gregor.consumer.test\"}}))\n;; =\u003e #'user/consumer\n\n;; Create a producer\n(def producer (p/create {:output-policy #{:control :error}\n                         :kafka-configuration {:bootstrap.servers \"localhost:9092\"}}))\n;; =\u003e #'user/producer\n```\n\nA couple of things to note here.  
First, we subscribed to a different topic than in the previous consumer example.  This is simply to make sure there are no messages hanging out on a previously existing topic and we are starting with a clean slate.  Second, we removed the `:data` event from the `:output-policy` for the producer.  This will keep Gregor from dereferencing the result of the call to `.send`, thus maintaining the asynchronous-ness of message production.\n\nNow that we have a valid consumer and producer, we can send messages between them:\n\n```clojure\n;; Bind a symbol to the input channel of the producer\n(def in-ch (:in-ch producer))\n;; =\u003e #'user/in-ch\n\n;; Bind a symbol to the output channel of the consumer\n(def out-ch (:out-ch consumer))\n;; =\u003e #'user/out-ch\n\n;; Create message body (we will examine this closer in the next section)\n(def msg {:topic :gregor.test.send :message-key {:uuid (str (java.util.UUID/randomUUID))} :message-value {:a 1 :b 2}})\n;; =\u003e #'user/msg\n\n;; Write data to the producer\n(a/\u003e!! in-ch msg)\n;; =\u003e true\n\n;; Read the message from the consumer\n(a/\u003c!! out-ch)\n;; =\u003e {:message-key {:uuid \"5380d92d-5997-4515-99ed-1717bf21a01e\"},\n;;     :offset 2,\n;;     :message-value-size 12,\n;;     :topic \"gregor.test.send\",\n;;     :message-key-size 46,\n;;     :partition 0,\n;;     :message-value {:a 1, :b 2},\n;;     :event :data,\n;;     :type-name :consumer-record,\n;;     :timestamp 1554751618612}\n```\n\nAs you may have noticed, the message we get out of the consumer has a bit more information than what we passed into the producer.  This is because Gregor includes all of the data provided by the Kafka [`ConsumerRecord`](https://kafka.apache.org/21/javadoc/org/apache/kafka/clients/consumer/ConsumerRecord.html) object.  
Most of the time, we will only be interested in the `:message-key` and `:message-value` keys, which contain the data.\n\n### Anatomy of a Message\n\nAs promised, let's look a little closer at the keys and values of a message body.  There are 2 required keys: `:topic`, a string or keyword naming the message's target topic, and `:message-value`, the data that we want to send.  Additionally, there are 2 optional keys.  The first is `:message-key`, which is arbitrary metadata about the `:message-value`.  The second is `:partition`, which is a valid partition index of the topic where the message should be delivered.\n\nAll messages must contain a `:topic` and a `:message-value`.  We consider it good practice to also include a `:message-key` with each message, though it is not required by Kafka or Gregor.  Generally speaking, we try to avoid specifying a partition and let Kafka handle distributing the work across available partitions.  However, if you have a good reason, you can target a specific partition within the specified topic.\n\n### Producer Transducer\n\nGregor supports producer transducers.  This feature allows the user to specify a transducer that will be attached to the input channel of a producer upon creation.  The transducer will be applied to each message sent to the input channel before it is serialized as a Kafka [`ProducerRecord`](https://kafka.apache.org/21/javadoc/org/apache/kafka/clients/producer/ProducerRecord.html).\n\nIn this example, we will use a transducer to derive the `:message-key` and `:message-value` from the map that is passed into the input channel.  
In addition, the transducer will add a `:producer-timestamp` to the `:message-value` for use by the consumer:\n\n```clojure\n;; Function to add a `:producer-timestamp`\n(defn add-timestamp\n  [m]\n  (assoc m :producer-timestamp (.toEpochMilli (java.time.Instant/now))))\n\n;; Function to set the `:message-value`\n(defn add-message-value\n  [m]\n  (-\u003e\u003e (dissoc m :uuid)\n       (assoc m :message-value)))\n\n;; Function to set the `:message-key`\n(defn add-message-key\n  [{:keys [uuid] :as m}]\n  (-\u003e (assoc m :message-key {:uuid uuid})\n      (dissoc :uuid)))\n\n;; Function to set the `:topic`\n(defn add-topic\n  [topic m]\n  (assoc m :topic topic))\n\n;; Compose the transducer\n(def transducer (comp (map add-timestamp)\n                      (map add-message-value)\n                      (map add-message-key)\n                      (map (partial add-topic :gregor.test.send))))\n```\n\nIn addition to adding some metadata about the message, this transducer transforms the message into something that Gregor can understand.  Every message that passes through the input channel will come out the other side with a `:topic`, a `:message-value` and a `:message-key`, guaranteeing that all messages are correctly shaped.\n\n(As a side note, since the `:message-key` is not required, one possible enhancement to the transducer is to leave out the `:message-key` if no UUID is found in the map.\n\nAlternatively, we could generate a UUID if one is not found, similar to how we generated a timestamp.\n\nThe correct answer to these types of questions will largely depend on the problem that is being solved.  Suffice it to say, Gregor's ability to accept a user-defined transducer should facilitate an elegant implementation regardless of what the requirements dictate.)\n\nNow that we have our transducer, let's put it to work:\n\n```clojure\n;; Close the previous producer\n(a/\u003e!! 
ctl-ch {:op :close})\n;; =\u003e true\n\n;; Create a new producer with the transducer defined above\n(def producer (p/create {:output-policy #{:control :error}\n                         :kafka-configuration {:bootstrap.servers \"localhost:9092\"}\n                         :transducer transducer}))\n;; =\u003e #'user/producer\n\n;; Send data to the producer.  Note that we have not specified the `:topic` or the `:message-value`.\n(a/\u003e!! (:in-ch producer) {:uuid (java.util.UUID/randomUUID) :a 1 :b 2 :c 3})\n;; =\u003e true\n\n;; Read the message from the consumer\n(a/\u003c!! out-ch)\n;; =\u003e {:message-key {:uuid #uuid \"fd32a035-153d-4fe8-837e-d5bd33985cd0\"},\n;;     :offset 3,\n;;     :message-value-size 53,\n;;     :topic \"gregor.test.send\",\n;;     :message-key-size 52,\n;;     :partition 0,\n;;     :message-value {:a 1, :b 2, :c 3, :producer-timestamp 1554755327399},\n;;     :event :data,\n;;     :type-name :consumer-record,\n;;     :timestamp 1554755327411}\n```\n\nAs you can see, the transducer was applied to the input channel of the producer, resulting in a correctly shaped message which was then read off of the topic by the consumer.\n\n## TODO List\n\n#### Transducer Exception Handling\n\nChannels with transducers support an exception handling function.  Gregor needs to supply one so that any exceptions that a transducer throws are caught and converted to `:error` events.  In the case of the consumer, the event will automatically be sent to the `out-ch`.  In the case of the producer, the transducer is on the `in-ch`, so Gregor will need to recognize that an error occurred (by checking the event type) and route it to the producer's `out-ch`.\n\n#### Exception Handling for Control Events\n\nReview the exceptions that the various `KafkaProducer` and `KafkaConsumer` methods can throw.  Make sure that Gregor has exception handling for each.  
The exception handling should convert the exception to an `:error` event and post it to the `out-ch` of the consumer or producer.  Additionally, in the context of the consumer, the consumer control loop must be left in a correct state.  That is, after the exception is handled and routed correctly, a message needs to be sent to the `ctl-ready-ch` indicating that the control loop is ready for the next control operation.\n\n#### Enhance `:commit` Control Operation\n\nCurrently, the `:commit` control operation takes no parameters.  It simply forces a commit to happen for all partitions of each currently subscribed topic.  To give the user finer control over commits, the `:commit` operation should be able to target specific partitions of specific topics with specific offsets.  Doing so will allow the user to disable auto-commits and be explicit about committing when a message is handled.\n\n## License\n\nCopyright © 2019 NovoLabs, Inc.\n\nDistributed under the BSD-3-clause LICENSE\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnovolabs%2Fgregor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnovolabs%2Fgregor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnovolabs%2Fgregor/lists"}