{"id":25875622,"url":"https://github.com/acrolinx/clj-queue-by","last_synced_at":"2025-03-02T10:18:43.506Z","repository":{"id":57713516,"uuid":"104053407","full_name":"acrolinx/clj-queue-by","owner":"acrolinx","description":"A queue which schedules fairly by key","archived":false,"fork":false,"pushed_at":"2022-02-22T15:25:54.000Z","size":61,"stargazers_count":16,"open_issues_count":0,"forks_count":1,"subscribers_count":18,"default_branch":"main","last_synced_at":"2024-04-28T07:51:04.318Z","etag":null,"topics":["clojure","queue","scheduled-jobs"],"latest_commit_sha":null,"homepage":"","language":"Clojure","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/acrolinx.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-09-19T09:14:51.000Z","updated_at":"2023-07-07T06:43:32.000Z","dependencies_parsed_at":"2022-09-18T06:34:29.659Z","dependency_job_id":null,"html_url":"https://github.com/acrolinx/clj-queue-by","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/acrolinx%2Fclj-queue-by","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/acrolinx%2Fclj-queue-by/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/acrolinx%2Fclj-queue-by/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/acrolinx%2Fclj-queue-by/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/acrolinx","download_url":"https://codeload.github.com/acrolinx/clj-queue-by/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://git
hub.com","kind":"github","repositories_count":241465091,"owners_count":19967243,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clojure","queue","scheduled-jobs"],"created_at":"2025-03-02T10:18:42.846Z","updated_at":"2025-03-02T10:18:43.495Z","avatar_url":"https://github.com/acrolinx.png","language":"Clojure","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build Status](https://travis-ci.org/acrolinx/clj-queue-by.svg?branch=master)](https://travis-ci.org/acrolinx/clj-queue-by)\n[![Clojars Project](https://img.shields.io/clojars/v/com.acrolinx/clj-queue-by.svg)](https://clojars.org/com.acrolinx/clj-queue-by)\n\n# Queue-by\n\nA queue which schedules fairly by key.\n\n## Motivation\n\nWe developed this library with a program in mind that requires a\ncentral in-memory queue. 
The queue must allow the program to serve\nactive users in a timely manner while still ensuring that users with\nmassive traffic get their jobs done eventually.\n\nWe considered other options like\nClojure's [core.async](https://github.com/clojure/core.async),\n`clojure.lang.PersistentQueue`, and\nJava's\n[java.util.PriorityQueue](https://docs.oracle.com/javase/8/docs/api/java/util/PriorityQueue.html) but\nnone met the requirements.\n\nSo we wrote `queue-by` which does what we need and has a nice name,\ntoo.\n\n## Usage\n\nTo use this library in your project, add the following to your\n`:dependencies` in your `project.clj` or `build.boot`:\n\n    [com.acrolinx/clj-queue-by \"0.1.1\"]\n\nIf you declare your dependencies in `deps.edn`, add the following\nto its `:deps` map:\n\n    com.acrolinx/clj-queue-by {:mvn/version \"0.1.1\"}\n\nTo create a queue, `require` the `com.acrolinx.clj-queue-by` namespace\nand call `queue-by`:\n\n    (ns test-the-queue.core\n       (:require [com.acrolinx.clj-queue-by :as q]))\n    \n    (def queue (q/queue-by :name))\n\nHere we create the queue with a `key-fn` `:name`, so items will get a\ndedicated queue per `:name`. You can store the queue in an atom,\ntoo. Alternatively, create it as a local variable and pass it\naround.\n\nFor a detailed description of the scheduling algorithm per key,\nsee [The Scheduling Mechanism](#the-scheduling-mechanism) below.\n\nThe queue can have a maximum size. In the first example, we stick to\nthe default maximum of 128:\n\n    (def queue (q/queue-by :name))\n\nIf you want to have a different limit, call it with a second argument:\n\n    (def queue (q/queue-by :name 1000))\n\nIf the second argument is an explicit `nil`, the queue is unbounded and\nno size checks are performed. 
Use at your own risk.\n\n    (def queue (q/queue-by :name nil))\n\nAdd an item to the queue by calling it with the item as the argument:\n\n    (queue {:name \"alice\" :a 1})\n\nWhen you try to push more items than the limit, an exception is\nthrown.\n\n    (def queue (q/queue-by :id 1))\n    (queue {:id 1})\n    (try (queue {:id 2})\n         (catch clojure.lang.ExceptionInfo e\n           (ex-data e)))\n    ;=\u003e {:item {:id 2}, :current-size 1}\n\nCalling the queue like a function works because it implements the\n`IFn` interface, which is used when calling functions in Clojure.\n\nHow many items are in the queue?\n\n    (count queue)\n\nThe queue implements `Counted`, the interface behind the `count`\nfunction. Performance guarantees are inherited from Clojure's hash map\nand `clojure.lang.PersistentQueue`, which are used under the hood.\n    \nWhat's inside the queue?\n\n    (deref queue)\n    @queue\n\nDereferencing the queue returns a two-element vector: first the\ncurrent snapshot queue (a `clojure.lang.PersistentQueue`), second a\nhash-map with the per-key queues (each again a\n`clojure.lang.PersistentQueue`). This works because the queue\nimplements `IDeref`. The dereferenced information can be used for\nmonitoring.\n    \nFinally, read an item from the queue by calling it without an\nargument.\n\n    (queue)\n\nReading from the queue returns `nil` when no item is in the queue.\n\n## Nil\n\nThe queue allows you to add `nil` items but you won't be able to\ndistinguish at the receiving end if `nil` was in the queue or the\nqueue was empty.\n\nAlso, `nil` gets its own queue when the `key-fn` returns `nil` just as\nany other value.\n\n## Comparison with core.async\n\n* The `core.async` library is much more sophisticated and much more\n  powerful. At the same time, it is also harder to use.\n* `clj-queue-by` doesn't support transducers while `core.async` does.\n* The buffers in `core.async` which back the channels are surprisingly\n  opaque. 
You can't look into them or log when things are being\n  dropped. Also, channels do not support derefing and\n  counting. Probably for good reasons, but the use-case that triggered\n  the development of `clj-queue-by` required more introspection and\n  transparency. In `core.async`, you can overcome all these limitations\n  if you implement your own buffer to back a channel. In fact, we did\n  this previously.\n* `core.async` is battle-tested and has shown that it runs well in\n  production. `clj-queue-by` is just beginning to show it.\n* Channels in `core.async` are meant to be used a lot. You can easily\n  create tens or hundreds of them. In contrast, `clj-queue-by` was\n  developed to be the central in-memory queue for a program.\n\n## The Scheduling Mechanism\n\nOn the sending and the receiving end, the queue behaves just like any\nother queue. Internally though, items are put into separate queues\ndetermined by the `key-fn` you define.\n\nThe scheduling mechanism is somewhat related to\nthe\n[Completely Fair Scheduling](https://en.wikipedia.org/wiki/Completely_Fair_Scheduler) algorithm\nused in the Linux kernel and\nthe\n[Fair Scheduler](https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/FairScheduler.html) used\nin Apache Hadoop.\n\nImagine hash-map items that each have a `:name` key:\n\n    {:name \"alice\"\n     :data 1}\n\nIf you push several items with different names into the queue, it\ncreates separate queues per key.\n\n    \"alice\": item1, item2\n    \"bob\":   item3\n\nNow, if Alice puts many items into a common FIFO queue and the\nconsumption of the items takes a while, Bob would have to wait a long\ntime for his item to be processed.\n\nTo solve this problem, this queue implementation always takes the\nleading item off each per-key queue and delivers those items first.\n\n1. Push `{:name \"alice\" :data 1}`\n2. Push `{:name \"alice\" :data 2}`\n3. Push `{:name \"alice\" :data 3}`\n4. 
Push `{:name \"bob\"   :data \"x\"}`\n\nAfter these pushes, the queue looks like this internally:\n\n    alice: {:name \"alice\" :data 1},\n           {:name \"alice\" :data 2},\n           {:name \"alice\" :data 3}\n    bob:   {:name \"bob\"   :data \"x\"}\n    \nWhen you start pulling, it will deliver the items in this order:\n\n1. Pull `{:name \"alice\" :data 1}`\n2. Pull `{:name \"bob\"   :data \"x\"}`\n3. Pull `{:name \"alice\" :data 2}`\n4. Pull `{:name \"alice\" :data 3}`\n\nNote how Bob's item got delivered before Alice's second item.\n\nThink of it as taking a snapshot of the heads of the per-key queues\nwhen polling and then delivering this snapshot. When it's empty, a new\nsnapshot is taken. Actually, this is exactly what happens behind the\nscenes.\n\nAnother example:\n\n1. Push `{:name \"alice\" :data 1}`\n2. Push `{:name \"bob\"   :data \"x\"}`\n3. Pull `{:name \"alice\" :data 1}`. This takes a snapshot and delivers\n   the oldest item. The item from Bob is now the head of the snapshot.\n4. Push `{:name \"alice\" :data 2}`. This adds the new item after the\n   snapshot.\n5. Push `{:name \"alice\" :data 3}`\n6. Pull `{:name \"bob\"   :data \"x\"}`\n7. Pull `{:name \"alice\" :data 2}`\n8. Pull `{:name \"alice\" :data 3}`\n\nA last example:\n\n1. Push `{:name \"alice\" :data 1}`\n2. Push `{:name \"bob\"   :data \"x\"}`\n3. Push `{:name \"alice\" :data 2}`\n4. Pull `{:name \"alice\" :data 1}`. This takes a snapshot and delivers\n   the oldest item. The item from Bob is now the head of the\n   snapshot. Alice's second item stays on her dedicated queue.\n5. Push `{:name \"alice\" :data 3}`\n6. Push `{:name \"alice\" :data 4}`\n7. Pull `{:name \"bob\"   :data \"x\"}`. Was head of the snapshot.\n8. Push `{:name \"bob\"   :data \"y\"}`\n9. Pull `{:name \"alice\" :data 2}`. A new snapshot is created with the\n   head items of both Alice and Bob added. Thus, Bob's item overtakes\n   Alice's larger queue of items.\n10. Pull `{:name \"bob\"   :data \"y\"}`\n11. 
Pull `{:name \"alice\" :data 3}`\n12. Pull `{:name \"alice\" :data 4}`\n\nIf we reduce the items to just the `:data`, the queue in this example\nwent through the following internal states (empty queues suppressed):\n\n    1.\n    alice: data 1\n    \n    2.\n    alice: data 1\n    bob:   data x\n    \n    3.\n    alice: data 1, data 2\n    bob:   data x\n    \n    4.\n    SNAPSHOT:    bob/data x\n    alice:       data 2\n    -\u003e Returned: alice/data 1\n\n    5.\n    SNAPSHOT:    bob/data x\n    alice:       data 2, data 3\n\n    6.\n    SNAPSHOT:    bob/data x\n    alice:       data 2, data 3, data 4\n\n    7.\n    alice:       data 2, data 3, data 4\n    -\u003e Returned: bob/data x\n    \n    8.\n    alice:       data 2, data 3, data 4\n    bob:         data y\n    \n    9.\n    SNAPSHOT:    bob/data y\n    alice:       data 3, data 4\n    -\u003e Returned: alice/data 2\n    \n    10.\n    alice:       data 3, data 4\n    -\u003e Returned: bob/data y\n    \n    11.\n    alice:       data 4\n    -\u003e Returned: alice/data 3\n\n    12.\n    -\u003e Returned: alice/data 4\n\n## Contributing\n\nThis project accepts pull requests from registered authors. We ask you\nto sign our Contributor License Agreement first. Please reach out to\nus at https://www.acrolinx.com/contact/ so that we can prepare\neverything.\n\nDon't panic. 
It is just some easy legalese that helps us maintain the\navailability of this software for everyone.\n\n## Contributors\n\n* @replrep :computer: :mag:\n* @nblumoe :computer: :mag:\n\nType of contribution:\n\n* :mag: Feedback, review\n* :computer: Code, pull request\n\n## License\n\nCopyright © 2017-2019 Acrolinx GmbH\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Facrolinx%2Fclj-queue-by","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Facrolinx%2Fclj-queue-by","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Facrolinx%2Fclj-queue-by/lists"}