{"id":16541514,"url":"https://github.com/itzg/try-etcd-work-allocator","last_synced_at":"2026-04-24T11:34:11.928Z","repository":{"id":66879764,"uuid":"158108844","full_name":"itzg/try-etcd-work-allocator","owner":"itzg","description":"Experiment with an algorithm to dynamically allocate work via etcd","archived":false,"fork":false,"pushed_at":"2018-12-07T21:03:47.000Z","size":110,"stargazers_count":0,"open_issues_count":1,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-01-20T13:44:23.206Z","etag":null,"topics":["algorithm","distributed-computing","etcdv3","spring-boot"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/itzg.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-11-18T17:27:43.000Z","updated_at":"2018-12-07T21:03:46.000Z","dependencies_parsed_at":"2023-02-22T19:45:10.912Z","dependency_job_id":null,"html_url":"https://github.com/itzg/try-etcd-work-allocator","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/itzg/try-etcd-work-allocator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itzg%2Ftry-etcd-work-allocator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itzg%2Ftry-etcd-work-allocator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itzg%2Ftry-etcd-work-allocator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itzg%2Ftry-etcd-w
ork-allocator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/itzg","download_url":"https://codeload.github.com/itzg/try-etcd-work-allocator/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itzg%2Ftry-etcd-work-allocator/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32221576,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-24T10:26:35.452Z","status":"ssl_error","status_checked_at":"2026-04-24T10:25:27.643Z","response_time":64,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithm","distributed-computing","etcdv3","spring-boot"],"created_at":"2024-10-11T18:55:13.594Z","updated_at":"2026-04-24T11:34:11.915Z","avatar_url":"https://github.com/itzg.png","language":"Java","readme":"The code in this module allows processes to add/update/delete tasks to a distributed workq and \nallows the workers processing the q to be aware of each other and increase/reduce their share of \nthe total work, as workers come and go.\n\n## Overview\n\n### Data structures\nThe workq is implemented with a set of etcd kv's prefixed with the name \"registry\". Each new task \nin the q is a new kv pair with that prefix.\n\nAs workers pick up tasks from the q, they create an analogous entry in etcd with the prefix \"active\".  
\nThat way the other workers know the task is in progress.\n\nFinally, there is a set of kv pairs with the prefix \"workers\".  This allows the workers to know of \nthe existence of the others so they can shed work if the load becomes unbalanced.\n\n### Threads\nThis module manages the work/worker q's by creating three threads (each of which watches one of \nthe prefixes listed above):\n\nThe registry thread watches for incoming work in the \"registry\" kv pairs and \"grabs\" it in an etcd \ntransaction to ensure no two workers grab the same task.\n\nThe worker thread checks for incoming workers; as new workers arrive, it determines what should \nbe its new share of the total load and sheds tasks when it determines it has too many.\n\nFinally, the active thread checks for tasks shed by other processes and attempts to grab them if \nthe current process is underloaded.\n\n## Getting Started\n\n### Running etcd3\n\nEnsure you have an etcd3 instance running locally. If not, you can start one with Docker using\n\n```bash\ndocker run -it --rm -p 2379:2379 quay.io/coreos/etcd\n```\n\n### Start a work allocator\n\nStart a work allocator instance using one of the \n[various Spring Boot application approaches](https://docs.spring.io/spring-boot/docs/current/reference/html/getting-started-first-application.html#getting-started-first-application-run). \n\nFor example,\n```bash\nmvn spring-boot:run\n```\n\nYou will need to find the HTTP port assigned to the instance by locating a log line like\n\n```text\nTomcat started on port(s): 62930 (http)\n```\n\nThe examples below will assume you assign a shell variable called `port` with that value, such as\n```bash\nport=62930\n```\n\n### Add some work\n\nPost some new work definitions to the system using a \"POST\" such as the following with curl. 
We'll\nadd four work definitions to make things interesting:\n\n```bash\ncurl -XPOST -d \"testing=one\" localhost:$port/work\ncurl -XPOST -d \"testing=two\" localhost:$port/work\ncurl -XPOST -d \"testing=three\" localhost:$port/work\ncurl -XPOST -d \"testing=four\" localhost:$port/work\n```\n\nWith those you will see the one allocator instance picked up all the work:\n```text\nm.i.t.services.WorkAllocator             : Observed readyWork=7bbfeb10-e644-417e-acd2-7edbfca26d89 cause=NEW rev=2024 allocator=41f6235a-0b82-4567-8a47-c7885387adda\nm.i.t.services.WorkAllocator             : Least loaded, so trying to grab work=7bbfeb10-e644-417e-acd2-7edbfca26d89, allocator=41f6235a-0b82-4567-8a47-c7885387adda\nm.i.t.services.WorkAllocator             : Successfully grabbed work=7bbfeb10-e644-417e-acd2-7edbfca26d89, allocator=41f6235a-0b82-4567-8a47-c7885387adda\nm.i.t.services.DefaultWorkProcessor      : Starting work on id=7bbfeb10-e644-417e-acd2-7edbfca26d89, content=testing=one\nm.i.t.services.WorkAllocator             : Observed readyWork=fcd8a04d-5698-40f2-b4d9-a6714b15aa5b cause=NEW rev=2026 allocator=41f6235a-0b82-4567-8a47-c7885387adda\nm.i.t.services.WorkAllocator             : Least loaded, so trying to grab work=fcd8a04d-5698-40f2-b4d9-a6714b15aa5b, allocator=41f6235a-0b82-4567-8a47-c7885387adda\nm.i.t.services.WorkAllocator             : Successfully grabbed work=fcd8a04d-5698-40f2-b4d9-a6714b15aa5b, allocator=41f6235a-0b82-4567-8a47-c7885387adda\nm.i.t.services.DefaultWorkProcessor      : Starting work on id=fcd8a04d-5698-40f2-b4d9-a6714b15aa5b, content=testing=two\nm.i.t.services.WorkAllocator             : Observed readyWork=2678cda4-5443-48c2-9b8e-6d682ec4959a cause=NEW rev=2028 allocator=41f6235a-0b82-4567-8a47-c7885387adda\nm.i.t.services.WorkAllocator             : Least loaded, so trying to grab work=2678cda4-5443-48c2-9b8e-6d682ec4959a, allocator=41f6235a-0b82-4567-8a47-c7885387adda\nm.i.t.services.WorkAllocator             : Successfully grabbed 
work=2678cda4-5443-48c2-9b8e-6d682ec4959a, allocator=41f6235a-0b82-4567-8a47-c7885387adda\nm.i.t.services.DefaultWorkProcessor      : Starting work on id=2678cda4-5443-48c2-9b8e-6d682ec4959a, content=testing=three\nm.i.t.services.WorkAllocator             : Observed readyWork=fb555078-6648-4468-afe9-8401c4df1ba7 cause=NEW rev=2030 allocator=41f6235a-0b82-4567-8a47-c7885387adda\nm.i.t.services.WorkAllocator             : Least loaded, so trying to grab work=fb555078-6648-4468-afe9-8401c4df1ba7, allocator=41f6235a-0b82-4567-8a47-c7885387adda\nm.i.t.services.WorkAllocator             : Successfully grabbed work=fb555078-6648-4468-afe9-8401c4df1ba7, allocator=41f6235a-0b82-4567-8a47-c7885387adda\nm.i.t.services.DefaultWorkProcessor      : Starting work on id=fb555078-6648-4468-afe9-8401c4df1ba7, content=testing=four\n```\n\nUsing `etcdctl get --endpoints http://localhost:2379 --prefix /work/` we can also confirm the \nstate of etcd after the work allocations (blank lines added for clarity):\n```text\n/work/active/2678cda4-5443-48c2-9b8e-6d682ec4959a\n41f6235a-0b82-4567-8a47-c7885387adda\n/work/active/7bbfeb10-e644-417e-acd2-7edbfca26d89\n41f6235a-0b82-4567-8a47-c7885387adda\n/work/active/fb555078-6648-4468-afe9-8401c4df1ba7\n41f6235a-0b82-4567-8a47-c7885387adda\n/work/active/fcd8a04d-5698-40f2-b4d9-a6714b15aa5b\n41f6235a-0b82-4567-8a47-c7885387adda\n\n/work/registry/2678cda4-5443-48c2-9b8e-6d682ec4959a\ntesting=three\n/work/registry/7bbfeb10-e644-417e-acd2-7edbfca26d89\ntesting=one\n/work/registry/fb555078-6648-4468-afe9-8401c4df1ba7\ntesting=four\n/work/registry/fcd8a04d-5698-40f2-b4d9-a6714b15aa5b\ntesting=two\n\n/work/workers/41f6235a-0b82-4567-8a47-c7885387adda\n0000000004\n```\n\nNotice under the default prefix of `/work/` there are three keysets that the allocator uses for\ntracking and coordinating amongst the allocator instances:\n\n- **workers**\n  - One for each allocator/worker\n  - Value contains the current work load of that worker\n  - Each key is tied to 
the worker's lease and will be auto-removed when the worker leaves the system\n- **registry**\n  - One for each work item that needs to be worked\n  - Value contains the content given when created/updated\n- **active**\n  - One for each work item that is actively assigned\n  - Value contains the ID of the worker assigned\n  - Each key is tied to the worker's lease and will be auto-removed when the worker leaves the system\n\n### Start some more work allocators\n\nIf you start two more work allocator instances, you can see that the first one sheds some of its\nwork load to ensure the second and then third allocators/workers have their fair share.\n\nLooking at the first allocator's logs:\n```text\n2018-11-20 15:19:18.095  INFO 87001 --- [     watchers-3] m.i.t.services.WorkAllocator             : Saw new worker=22b2ff48-8a34-4d3d-ab31-c05c9eb121fe\n2018-11-20 15:19:19.105  INFO 87001 --- [pool-1-thread-5] m.i.t.services.WorkAllocator             : Rebalancing workLoad=4 to target=2\n2018-11-20 15:19:19.105  INFO 87001 --- [pool-1-thread-5] m.i.t.services.WorkAllocator             : Shedding work to rebalance count=2\n2018-11-20 15:19:19.105  INFO 87001 --- [pool-1-thread-5] m.i.t.services.WorkAllocator             : Releasing work=fcd8a04d-5698-40f2-b4d9-a6714b15aa5b\n2018-11-20 15:19:19.113  INFO 87001 --- [pool-1-thread-5] m.i.t.services.WorkAllocator             : Releasing work=fb555078-6648-4468-afe9-8401c4df1ba7\n2018-11-20 15:19:19.113  INFO 87001 --- [     watchers-2] m.i.t.services.WorkAllocator             : Observed readyWork=fcd8a04d-5698-40f2-b4d9-a6714b15aa5b cause=RELEASED rev=2041 allocator=d8298d82-b624-4e16-8caf-3208c3ab5193\n2018-11-20 15:19:19.122  INFO 87001 --- [pool-1-thread-2] m.i.t.services.DefaultWorkProcessor      : Stopping work on id=fcd8a04d-5698-40f2-b4d9-a6714b15aa5b, content=testing=two\n2018-11-20 15:19:19.123  INFO 87001 --- [     watchers-2] m.i.t.services.WorkAllocator             : Observed 
readyWork=fb555078-6648-4468-afe9-8401c4df1ba7 cause=RELEASED rev=2042 allocator=d8298d82-b624-4e16-8caf-3208c3ab5193\n2018-11-20 15:19:19.127  INFO 87001 --- [pool-1-thread-2] m.i.t.services.DefaultWorkProcessor      : Stopping work on id=fb555078-6648-4468-afe9-8401c4df1ba7, content=testing=four\n```\n\nKeep in mind there was some churn as the third allocator entered the system.\n\nUsing the same `etcdctl` command, we can see the work load is evenly balanced across the three workers:\n```text\n/work/active/13de78a9-a1e4-4e9f-9d71-49cb368240fe\nfa69632a-023b-44db-93a8-173994fe936a\n/work/active/9994249a-0ecb-4735-b078-19ce5c4ee20c\n81ffc456-79e4-4273-a684-3d3dc473f139\n/work/active/c6574ad8-7a3f-48e2-8c33-7fdedef6d20e\nfa69632a-023b-44db-93a8-173994fe936a\n/work/active/dff30574-fca6-45a0-a0dd-db36142b1e8e\n8b1da48a-7088-498a-9ae6-6245cdc870b1\n\n/work/registry/13de78a9-a1e4-4e9f-9d71-49cb368240fe\ntesting=two\n/work/registry/9994249a-0ecb-4735-b078-19ce5c4ee20c\ntesting=one\n/work/registry/c6574ad8-7a3f-48e2-8c33-7fdedef6d20e\ntesting=three\n/work/registry/dff30574-fca6-45a0-a0dd-db36142b1e8e\ntesting=four\n\n/work/workers/81ffc456-79e4-4273-a684-3d3dc473f139\n0000000001\n/work/workers/8b1da48a-7088-498a-9ae6-6245cdc870b1\n0000000001\n/work/workers/fa69632a-023b-44db-93a8-173994fe936a\n0000000002\n```\n\n### Stop an allocator\n\nStop one of the allocators, ideally one with only one work item to keep the example interesting.\nYou can locate the ID of an allocator from the start of its logs, such as\n\n```text\nWe are worker=81ffc456-79e4-4273-a684-3d3dc473f139\n```\n\nLooking at the other allocator with one work item, you'll see it correctly picked up the released\nwork item since it is the least loaded allocator:\n\n```text\nm.i.t.services.WorkAllocator             : Handling potential readyWork=9994249a-0ecb-4735-b078-19ce5c4ee20c at transition=RELEASED\nm.i.t.services.WorkAllocator             : I am leastLoaded=true out of 
workerCount=2\nm.i.t.services.WorkAllocator             : I am least loaded, so I'll try to grab work=9994249a-0ecb-4735-b078-19ce5c4ee20c\nm.i.t.services.WorkAllocator             : Successfully grabbed work=9994249a-0ecb-4735-b078-19ce5c4ee20c\nm.i.t.services.WorkAllocator             : Stored workLoad=2 update\nm.i.t.services.DefaultWorkProcessor      : Starting work on id=9994249a-0ecb-4735-b078-19ce5c4ee20c, content=testing=one\n```\n\nLooking again with `etcdctl` we can see the work is spread across the now two workers:\n\n```text\n/work/active/13de78a9-a1e4-4e9f-9d71-49cb368240fe\nfa69632a-023b-44db-93a8-173994fe936a\n/work/active/9994249a-0ecb-4735-b078-19ce5c4ee20c\n8b1da48a-7088-498a-9ae6-6245cdc870b1\n/work/active/c6574ad8-7a3f-48e2-8c33-7fdedef6d20e\nfa69632a-023b-44db-93a8-173994fe936a\n/work/active/dff30574-fca6-45a0-a0dd-db36142b1e8e\n8b1da48a-7088-498a-9ae6-6245cdc870b1\n\n/work/registry/13de78a9-a1e4-4e9f-9d71-49cb368240fe\ntesting=two\n/work/registry/9994249a-0ecb-4735-b078-19ce5c4ee20c\ntesting=one\n/work/registry/c6574ad8-7a3f-48e2-8c33-7fdedef6d20e\ntesting=three\n/work/registry/dff30574-fca6-45a0-a0dd-db36142b1e8e\ntesting=four\n\n/work/workers/8b1da48a-7088-498a-9ae6-6245cdc870b1\n0000000002\n/work/workers/fa69632a-023b-44db-93a8-173994fe936a\n0000000002\n```\n\n### Update a work item\n\nNot so interesting, but still important in a real system, is the ability to update an existing \nwork item.\n\nAssuming the shell variable `id` has been set to the UUID of a work item in the registry\nthe following `PUT` will update that work item's content. 
**NOTE** you might need to\nupdate `port`, if the original instance was stopped.\n\n```bash\ncurl -H \"Content-Type: text/plain\" -X PUT -d \"testing=twotwo\" localhost:$port/work/$id\n```\n\nThe previously assigned worker shows the update was observed and applied:\n```text\nm.i.t.services.WorkAllocator             : Updated our work=13de78a9-a1e4-4e9f-9d71-49cb368240fe\nm.i.t.services.DefaultWorkProcessor      : Updating work on id=13de78a9-a1e4-4e9f-9d71-49cb368240fe, content=testing=twotwo\n```\n\n### Delete some work\n\nWith the shell variable `id` set to one of the work items in the registry, we can start deleting\nwork using:\n\n```bash\ncurl -X DELETE localhost:$port/work/$id\n```\n\nThe allocator assigned that work item processes the deletion, but also coordinates indirectly with\nthe collective workers to rebalance:\n\n```text\nm.i.t.services.WorkAllocator             : Stopping our work=13de78a9-a1e4-4e9f-9d71-49cb368240fe\nm.i.t.services.DefaultWorkProcessor      : Stopping work on id=13de78a9-a1e4-4e9f-9d71-49cb368240fe, content=testing=twotwo\nm.i.t.services.WorkAllocator             : Handling potential readyWork=9994249a-0ecb-4735-b078-19ce5c4ee20c at transition=RELEASED\nm.i.t.services.WorkAllocator             : I am leastLoaded=false out of workerCount=2\nm.i.t.services.WorkAllocator             : Handling potential readyWork=13de78a9-a1e4-4e9f-9d71-49cb368240fe at transition=RELEASED\nm.i.t.services.WorkAllocator             : Removed active work=13de78a9-a1e4-4e9f-9d71-49cb368240fe key\nm.i.t.services.WorkAllocator             : I am leastLoaded=false out of workerCount=2\nm.i.t.services.WorkAllocator             : Stored workLoad=1 update\n```\n\nWe can confirm the deletion with 
`etcdctl`:\n```text\n/work/active/9994249a-0ecb-4735-b078-19ce5c4ee20c\n8b1da48a-7088-498a-9ae6-6245cdc870b1\n/work/active/c6574ad8-7a3f-48e2-8c33-7fdedef6d20e\nfa69632a-023b-44db-93a8-173994fe936a\n/work/active/dff30574-fca6-45a0-a0dd-db36142b1e8e\n8b1da48a-7088-498a-9ae6-6245cdc870b1\n\n/work/registry/9994249a-0ecb-4735-b078-19ce5c4ee20c\ntesting=one\n/work/registry/c6574ad8-7a3f-48e2-8c33-7fdedef6d20e\ntesting=three\n/work/registry/dff30574-fca6-45a0-a0dd-db36142b1e8e\ntesting=four\n\n/work/workers/8b1da48a-7088-498a-9ae6-6245cdc870b1\n0000000002\n/work/workers/fa69632a-023b-44db-93a8-173994fe936a\n0000000001\n```\n\nFor fun, let's delete the one remaining work item from worker `fa69632a-023b-44db-93a8-173994fe936a`,\nwhich is work item `c6574ad8-7a3f-48e2-8c33-7fdedef6d20e`, looking at the `active` keys.\n\n```bash\nid=c6574ad8-7a3f-48e2-8c33-7fdedef6d20e\ncurl -X DELETE localhost:$port/work/$id\n```\n\nThat allocator released the deleted work, but because the other allocator initiated a\nrebalance, it picked up one of the remaining two items to keep the allocations in balance:\n\n```text\nm.i.t.services.WorkAllocator             : Stopping our work=c6574ad8-7a3f-48e2-8c33-7fdedef6d20e\nm.i.t.services.DefaultWorkProcessor      : Stopping work on id=c6574ad8-7a3f-48e2-8c33-7fdedef6d20e, content=testing=three\nm.i.t.services.WorkAllocator             : Removed active work=c6574ad8-7a3f-48e2-8c33-7fdedef6d20e key\nm.i.t.services.WorkAllocator             : Handling potential readyWork=c6574ad8-7a3f-48e2-8c33-7fdedef6d20e at transition=RELEASED\nm.i.t.services.WorkAllocator             : I am leastLoaded=true out of workerCount=2\nm.i.t.services.WorkAllocator             : I am least loaded, so I'll try to grab work=c6574ad8-7a3f-48e2-8c33-7fdedef6d20e\nm.i.t.services.WorkAllocator             : Stored workLoad=0 update\nm.i.t.services.WorkAllocator             : Handling potential readyWork=9994249a-0ecb-4735-b078-19ce5c4ee20c at 
transition=RELEASED\nm.i.t.services.WorkAllocator             : I am leastLoaded=true out of workerCount=2\nm.i.t.services.WorkAllocator             : I am least loaded, so I'll try to grab work=9994249a-0ecb-4735-b078-19ce5c4ee20c\nm.i.t.services.WorkAllocator             : Successfully grabbed work=9994249a-0ecb-4735-b078-19ce5c4ee20c\nm.i.t.services.WorkAllocator             : Stored workLoad=1 update\nm.i.t.services.DefaultWorkProcessor      : Starting work on id=9994249a-0ecb-4735-b078-19ce5c4ee20c, content=testing=one\n```\n\nFinally, with `etcdctl` we can confirm each allocator has one of the two remaining work items:\n```text\n/work/active/9994249a-0ecb-4735-b078-19ce5c4ee20c\nfa69632a-023b-44db-93a8-173994fe936a\n/work/active/dff30574-fca6-45a0-a0dd-db36142b1e8e\n8b1da48a-7088-498a-9ae6-6245cdc870b1\n\n/work/registry/9994249a-0ecb-4735-b078-19ce5c4ee20c\ntesting=one\n/work/registry/dff30574-fca6-45a0-a0dd-db36142b1e8e\ntesting=four\n\n/work/workers/8b1da48a-7088-498a-9ae6-6245cdc870b1\n0000000001\n/work/workers/fa69632a-023b-44db-93a8-173994fe936a\n0000000001\n```","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fitzg%2Ftry-etcd-work-allocator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fitzg%2Ftry-etcd-work-allocator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fitzg%2Ftry-etcd-work-allocator/lists"}