{"id":16660301,"url":"https://github.com/kingmob/truegrit","last_synced_at":"2025-05-16T17:07:57.056Z","repository":{"id":62433823,"uuid":"452194530","full_name":"KingMob/TrueGrit","owner":"KingMob","description":"A data-driven, functionally-oriented, idiomatic Clojure library for circuit breakers, bulkheads, retries, rate limiters, timeouts, etc.","archived":false,"fork":false,"pushed_at":"2025-01-03T12:29:27.000Z","size":111,"stargazers_count":127,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-05-16T17:07:20.749Z","etag":null,"topics":["bulkhead","circuit-breaker","clojure","clojure-library","resilience","resilience4j"],"latest_commit_sha":null,"homepage":"","language":"Clojure","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KingMob.png","metadata":{"files":{"readme":"README.adoc","changelog":"CHANGELOG.adoc","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":"KingMob"}},"created_at":"2022-01-26T08:22:45.000Z","updated_at":"2025-05-12T10:09:11.000Z","dependencies_parsed_at":"2024-12-24T04:18:19.940Z","dependency_job_id":"c76003ab-8e30-4994-949d-732f2a461889","html_url":"https://github.com/KingMob/TrueGrit","commit_stats":{"total_commits":27,"total_committers":1,"mean_commits":27.0,"dds":0.0,"last_synced_commit":"d463d608d6bdcfecac77238acfb46cb1f246e1e5"},"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KingMob%2FTrueGrit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kin
gMob%2FTrueGrit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KingMob%2FTrueGrit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KingMob%2FTrueGrit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/KingMob","download_url":"https://codeload.github.com/KingMob/TrueGrit/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254573589,"owners_count":22093731,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bulkhead","circuit-breaker","clojure","clojure-library","resilience","resilience4j"],"created_at":"2024-10-12T10:28:40.290Z","updated_at":"2025-05-16T17:07:57.034Z","avatar_url":"https://github.com/KingMob.png","language":"Clojure","funding_links":["https://github.com/sponsors/KingMob"],"categories":[],"sub_categories":[],"readme":"= README\n\nimage:https://img.shields.io/clojars/v/net.modulolotus/truegrit.svg[clojars badge, link=https://clojars.org/net.modulolotus/truegrit] image:https://cljdoc.org/badge/net.modulolotus/truegrit[cljdoc badge, link=https://cljdoc.org/d/net.modulolotus/truegrit]\n\n== True Grit\n\nFor when you need a function that won't give up at the first sign of failure.\n\nimage::./true-grit-bridges.jpg[True Grit,float=\"right\"]\n\n=== About\n\nTrue Grit enables flexible responses to function failure. 
True Grit is a \ndata-driven, functionally-oriented, idiomatic wrapper library for using\nResilience4j, one of the top resilience Java libraries.\n\nIt offers:\n\n* *timeouts* - throw an exception if a function takes too long\n* *rate limiters* - limit the number of calls to a function per time period\n* *retries* - retry a function if it fails\n* *circuit breakers* - track function failure rates, and block calls to the fn if it fails too often\n* *bulkheads* - to reserve capacity and partition usage\n\nThe most common way to use True Grit is to wrap functions that call out to an underlying\nnetwork service. With True Grit, you can do things like say \"Give this network \ncall 5s before it's considered to time out, retry it up to 3 times in case of \nintermittent network failure, track how many calls fail overall, and if more than \n50% do, temporarily halt the function so we don't hammer the downstream service.\"\n\nStart with the `net.modulolotus.truegrit` namespace, and see the individual \npolicy namespaces if you have more advanced needs. It contains all-in-one \nfunctions that take a function and a config map, and return a wrapped \nfunction with the resilience policy attached. All wrapped functions return\nthe same results as the original (except for thread-pool-based bulkheads, \nwhich return Futures).\n\nDocs are available https://cljdoc.org/d/net.modulolotus/truegrit[here].\n\nTrue Grit is stable and largely done. The lack of recent commits does _not_\nmean it is abandoned. I am still actively maintaining it, but the only planned \nfuture work is for bug fixes and Resilience4j updates. 
Suggestions\nfor feature requests welcome, but no promises.\n\n=== Installation\n\nAdd the following to your Leiningen `project.clj`:\n\n----\n[net.modulolotus/truegrit \"2.3.35\"]\n----\n\nAdd the following to your `deps.edn`:\n\n----\nnet.modulolotus/truegrit {:mvn/version \"2.3.35\"}\n----\n\n\n=== Why should you use it?\n\nWrapper libraries are frequently more hassle than doing interop directly. But in\nResilience4j's case, while it's an _excellent_ library, it requires coordinating \ndozens of Java classes to get anything done. So, if you don't feel like wrangling \na bunch of Config/Builder/Registry/etc classes, True Grit is for you. \n\n=== Dependencies\n\nAs of Resilience4j 2.x and thus, True Grit 2.x, the minimum Java version is 17.\nIf you need support for earlier Java versions, stick with the 1.x versions of\nTrue Grit. (It has the exact same API and functionality.)\n\n=== Documentation\n\nBefore using circuit breakers and bulkheads, be sure to understand how they \noperate. I highly recommend _Release It!_ to understand the ways distributed \nsystems can fail and how to compensate for them.\n\nSee:\n\n* True Grit docs - https://cljdoc.org/d/net.modulolotus/truegrit\n* Resilience4j docs - https://resilience4j.readme.io/\n* Circuit breaker pattern - https://www.martinfowler.com/bliki/CircuitBreaker.html\n* _Release It!_ book - https://pragprog.com/titles/mnee2/release-it-second-edition/\n* Hystrix circuit breaker wiki - https://github.com/Netflix/Hystrix/wiki\n\n=== Examples\n\n===== Basic usage\n[source,clojure]\n----\n(require '[net.modulolotus.truegrit :as tg])\n\n(def resilient-fn\n  (-\u003e flaky-fn\n      ;; Give each individual call up to 10s to complete\n      (tg/with-time-limiter {:timeout-duration 10000})\n\n      ;; Try up to 5 times, waiting 1s between failures\n      ;; Will retry if an exception is thrown by default, but will also retry if\n      ;; the return value is nil\n      (tg/with-retry {:name            \"my-retry\"\n              
        :max-attempts    5\n                      :wait-duration   1000\n                      :retry-on-result nil?})\n\n      ;; If it still fails after 5 tries, record it as a failure in the CB\n      ;; CB will go into OPEN status if 20% of calls end up failures\n      ;; CB will wait for at least 40 calls before considering a change in status,\n      ;; giving it time to warm up.\n      ;; Ignores UserCanceledExceptions, since if the user hit \"Cancel\", it's not a\n      ;; problem in the underlying service\n      (tg/with-circuit-breaker {:name                    \"my-circuit-breaker\"\n                                :failure-rate-threshold  20\n                                :minimum-number-of-calls 40\n                                :ignore-exceptions       [UserCanceledException]})))\n----\n\n===== Use a shared circuit breaker to track an underlying service called by many fns\n[source,clojure]\n----\n(require '[net.modulolotus.truegrit.circuit-breaker :as cb])\n\n(def rest-service-cb (cb/circuit-breaker \"shared-rest-service\"\n                                         {:failure-rate-threshold 30\n                                          :minimum-number-of-calls 10}))\n\n(def resil-get (cb/wrap flaky-get rest-service-cb))\n(def resil-post (cb/wrap flaky-post rest-service-cb))\n(def resil-put (cb/wrap flaky-put rest-service-cb))\n(def resil-patch (cb/wrap flaky-patch rest-service-cb))\n(def resil-delete (cb/wrap flaky-delete rest-service-cb))\n----\n\n===== Check circuit breaker to choose an alternative method if status is OPEN\n[source,clojure]\n----\n(require '[net.modulolotus.truegrit.circuit-breaker :as cb])\n\n(if (-\u003e resilient-fn\n        (cb/retrieve)           ; retrieve associated CircuitBreaker\n        (cb/call-allowed?))     ; is a call allowed right now?\n  (resilient-fn)                ; if so, make the call\n  (some-fallback-fn))           ; if not, we can't wait, try a fallback\n----\n\n\n===== Use semaphore-based bulkheads to limit 
database access, keep 20% capacity in reserve, and log reserved metrics\n[source,clojure]\n----\n(require '[net.modulolotus.truegrit :as tg]\n         '[net.modulolotus.truegrit.bulkhead :as bh])\n\n(defn database-query-fn\n  \"Some database fn that we've determined can only handle 100 simultaneous queries\"\n  [user]\n  ;; do some db stuff\n  )\n\n;; Make a default version that can use up to 80% of the database's capacity\n(def default-database-query (tg/with-bulkhead database-query-fn\n                                              {:name \"default-db-bulkhead\"\n                                               :max-concurrent-calls 80}))\n\n;; Make a version that reserves 20% for special needs\n(def reserved-database-query (tg/with-bulkhead database-query-fn\n                                               {:name \"reserved-db-bulkhead\"\n                                                :max-concurrent-calls 20}))\n\n;; Usage\n(defn some-handler-fn\n  [user]\n  (if (user-is-special-somehow user)   ; Is the user a VIP, sysadmin, etc?\n    (reserved-database-query user)     ; Make reserved call - the default bulkhead being full has no impact here\n    (default-database-query user)))    ; Make standard call, blocking if unavailable\n\n;; Log reserved bulkhead metrics every 10s\n(future\n  (loop []\n    (-\u003e reserved-database-query\n        (bh/retrieve)\n        (bh/metrics)\n        (log/debug))\n    (Thread/sleep 10000)\n    (recur)))\n----\n\n=== Guidelines and Notes\n\n[cols=\"s,a\"]\n|===\n\n|Circuit breaker status shorthand\n|CLOSED is good, OPEN is bad. Think of electricity flowing.\n\n|Read up on bulkheads and circuit breakers before using them\n|Seriously.\n\n|Circuit breakers should _never_ be created on-demand\n|Circuit breakers work by collecting data about a function's success/failure rate over time. If you create a CB on the fly (like for an anonymous fn), but you only call that particular fn one time, the CB is useless. 
If you need to construct fns on the fly, but still track their overall success, you should create a CB ahead of time, and share it with all the anonymous fns by using `cb/wrap`.\n\n|Retries only make sense if there's a reasonable expectation the fn will succeed within an acceptable time frame\n|They're better-suited for temporary glitches in the matrix, not a service being down all day. If the fn doesn't succeed in time, retries can make things _worse_, by adding to the downstream load, which is why pairing them with circuit breakers works well.\n\n|Be mindful of interactions at different levels of the system\n|E.g., wrapping a high-level fn with a retry policy of 3 attempts that calls an\nAWS client lower down that _also_ has its own internal retry policy of 3 attempts\ncan result in up to 3x3=9 calls under failure modes, exacerbating\nthings.\n\nAnother common example is having multiple timeouts; it's confusing and pointless,\nsince the shortest timeout will trigger first.\n\n|You still need to handle errors\n|No amount of resilience policies can ensure a function will always succeed.\n\n|_Order of wrapping matters_\n|E.g.:\n\n[source,clojure]\n----\n(-\u003e my-fn\n    (with-retry some-retry-config)\n    (with-time-limiter some-timeout-config))\n----\n\nwill retry several times, but if the time limit is up before the tries\nsucceed, it will fail. This is probably not what you want. On the other\nhand:\n\n[source,clojure]\n----\n(-\u003e my-fn\n    (with-time-limiter some-timeout-config)\n    (with-retry some-retry-config))\n----\n\n\nwill make calls with a certain time limit, and only if they return\nfailure or exceed their time limit, will it attempt a retry. If you want\na canonical \"good\" ordering, see the `robustify` example fn in the source.\n|===\n\n==== Non-goals\n\nThe r4j cache module is currently unsupported, since many Clojure/Java\ncaching libraries already exist. However, it could be included, if people\nare interested. 
Let me know if you want it, or better still, submit a \npatch.\n\nSupporting all the Java frameworks that r4j interoperates with is also a\nnon-goal for now.\n\n==== Future directions\n\nThe r4j registries add virtually nothing over standard Clojure mutable\ncontainers, but the code I wrote for them still exists, so I could add\nthem back if people really need them.\n\nMetric module support may be added, if anyone expresses a need for it.\n\n'''\n\n© 2025 Matthew Davidson\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkingmob%2Ftruegrit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkingmob%2Ftruegrit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkingmob%2Ftruegrit/lists"}