{"id":40790341,"url":"https://github.com/x-cellent/go-dags","last_synced_at":"2026-01-21T20:03:31.226Z","repository":{"id":142309841,"uuid":"374700871","full_name":"x-cellent/go-dags","owner":"x-cellent","description":"Example for using DAGs to build a reconciliation-flow in go","archived":false,"fork":false,"pushed_at":"2021-06-07T16:03:25.000Z","size":41,"stargazers_count":7,"open_issues_count":0,"forks_count":3,"subscribers_count":9,"default_branch":"main","last_synced_at":"2024-06-20T10:18:23.478Z","etag":null,"topics":["dag","example","go","golang"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/x-cellent.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-06-07T14:45:12.000Z","updated_at":"2024-06-10T07:58:48.000Z","dependencies_parsed_at":"2023-06-09T05:30:22.682Z","dependency_job_id":null,"html_url":"https://github.com/x-cellent/go-dags","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/x-cellent/go-dags","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/x-cellent%2Fgo-dags","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/x-cellent%2Fgo-dags/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/x-cellent%2Fgo-dags/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/x-cellent%2Fgo-dags/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/x-cellent","download_url":"https://code
load.github.com/x-cellent/go-dags/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/x-cellent%2Fgo-dags/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28641293,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-21T18:04:35.752Z","status":"ssl_error","status_checked_at":"2026-01-21T18:03:55.054Z","response_time":86,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dag","example","go","golang"],"created_at":"2026-01-21T20:03:30.525Z","updated_at":"2026-01-21T20:03:31.211Z","avatar_url":"https://github.com/x-cellent.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Reconciling object state using DAG\n\nIn distributed, asynchronous systems, state management has its challenges,\nbecause as you might know: \"Everything Fails All the Time\" (Werner Vogels).\nTo ensure a consistent async state update in a distributed system, a component must\nreconcile the desired state, i.e. check if the outcome of its previous state-manipulating\nrequest to another component corresponds to the desired state. If this is not the\ncase, the component repeats the update request and checks again.\n\nKubernetes represents an asynchronous, distributed state manager. 
You declare the\ndesired state for a component instance, and Kubernetes tries to create and maintain\nthat state should anything change.\n\n\"Reconcile\" is the basic pattern in Kubernetes where controllers ensure that,\nfor any given object, the actual state of the world (both the cluster state,\nand potentially external state, e.g. load balancers for a cloud provider)\nmatches the desired state declared in the object.\n\nIf the desired state is \"simple\", like create one load balancer with this config (spec),\none can easily imagine how to reconcile that: we try to create the load balancer\nwith a call to an external API and, if this fails, we queue this task to try again later.\n\nBut what if you have to reconcile desired state with an external API that\nconsists of a bunch of objects that have to be created in a specific order,\ni.e. that have dependencies among each other?\n\nIn our case we had to deal with a legacy Identity and Access Management System\nwhose API must be handled with care. Our flow would first create two groups,\nnest them, then create a new role and finally assign the groups to the new role.\nThere are fine-grained SOAP services for all of the tasks, but each of\nthe services can be temporarily unavailable or fail individually.\nFor deletion, the whole process must be executed in reverse.\n\nOf course, we could create the objects arbitrarily with a \"brute force\" approach\nand retry creation until all errors have vanished. This simple approach is not always\nappropriate, and in certain cases unacceptable. 
If you have to call APIs\nof shaky legacy subsystems, for example, this careless strategy might cause severe\nproblems, even if it is only the legacy system's ops guy that comes along\nasking why his logs are full of business errors and what the heck you are doing.\nSo sometimes, we need to take a more focused route.\n\n## DAGs to the rescue\n\nTo model the dependencies between the required tasks we need a data structure\nand an algorithm to calculate the right order in which to execute the tasks.\n\nThis is when DAGs come into play. A directed acyclic graph (DAG) is a directed graph\nwith no directed cycles, which allows us to model tasks (as vertices) and dependencies\n(as directed edges) in a \"natural way\".\n\nThe simple example DAG below models the following dependencies (\"V\" = vertex):\n* V1 depends on V2, V3, V4 and V5\n* V2 has no dependencies\n* V3 depends on V2 and V5\n* V4 depends on V5\n* V5 depends on V2\n\n![DAG](dag_u.png)\n\nEvery DAG has at least one \"topological ordering\", i.e. a sequence of the vertices such that\nevery edge is directed from earlier to later in the sequence.\n\n![Topological ordering](dag-topo_u.png)\n\nValid topological orderings of our DAG are \"V1 V3 V4 V5 V2\" and \"V1 V4 V3 V5 V2\". The sequence\nshows that task \"V1\" heavily depends on other tasks, whereas \"V2\" has no dependencies.\n\nThe correct sequence of tasks to resolve the dependencies is the reverse topological\nordering: \"V2 V5 V3 V4 V1\" or \"V2 V5 V4 V3 V1\". We start with task \"V2\", which has\nno outgoing dependencies, and then execute the remaining tasks in the sequence of the topological sort\nin \"ascending direction\", i.e. reversing the sort order.\n\nIn our code we can solve this reversing of tasks by one of the following strategies:\n1. we calculate the topological order of our dependency graph and reverse the order\n1. 
we reverse the edge direction when adding an edge to the graph so that the edge-direction\n   models a \"must come before\" relation instead of \"depends on\"\n\nTo get rid of ambiguous equivalent topological task-sequences (e.g. for unit-tests) we can\nstabilize the ordering by adding a second sort criterion like lexical ordering by vertex id.\n\nDAGs are featured in many projects that have to solve the same problem of executing\na series of tasks in the right order: [Terraform](https://github.com/hashicorp/terraform/tree/main/internal/dag), [GitLab CI](https://docs.gitlab.com/ee/ci/directed_acyclic_graph/), [Gardener](https://github.com/gardener/gardener/tree/master/pkg/utils/flow).\nThese systems use internal packages, so we have to bring up something lightweight on our own.\n\n## DAGs with Go\n\nNow that we know what we need as data-structure and algorithm, we need to figure out how to\nimplement this in Go.\nWe can use the [gonum-library](https://github.com/gonum/gonum) that has a vast set of numeric\nalgorithms and also features a [graph package](https://pkg.go.dev/gonum.org/v1/gonum/graph).\nThis library has much more than we need, of course, but for the sake of only showing the\napplication of DAGs to the problem we want to solve (and not implementing a DAG ourselves),\nlet's go with it for now.\n\nThe code that models our example graph in Go is straightforward. 
We simply create a directed graph\nand some nodes and add them with directed edges to the graph.\nAfter that we are able to calculate a stabilized topological sort (second param 'nil' defaults\nto lexical sort):\n\n```\n// ex1_graph\npackage main\n\nimport (\n\t\"fmt\"\n\t\"log\"\n\n\t\"gonum.org/v1/gonum/graph/simple\"\n\t\"gonum.org/v1/gonum/graph/topo\"\n)\n\nfunc main() {\n\tg := simple.NewDirectedGraph()\n\tn1 := simple.Node(1)\n\tn2 := simple.Node(2)\n\tn3 := simple.Node(3)\n\tn4 := simple.Node(4)\n\tn5 := simple.Node(5)\n\t// an edge from a to b models \"a depends on b\"; SetEdge adds missing nodes to the graph\n\tg.SetEdge(g.NewEdge(n1, n2))\n\tg.SetEdge(g.NewEdge(n1, n3))\n\tg.SetEdge(g.NewEdge(n1, n4))\n\tg.SetEdge(g.NewEdge(n1, n5))\n\tg.SetEdge(g.NewEdge(n3, n2))\n\tg.SetEdge(g.NewEdge(n3, n5))\n\tg.SetEdge(g.NewEdge(n4, n5))\n\tg.SetEdge(g.NewEdge(n5, n2))\n\n\tnodes, err := topo.SortStabilized(g, nil)\n\tif err != nil {\n\t\t// a cyclic dependency means there is no topological ordering\n\t\tlog.Fatal(err)\n\t}\n\n\t// reverse result so that dependencies come before their dependents\n\tfor i, j := 0, len(nodes)-1; i \u003c j; i, j = i+1, j-1 {\n\t\tnodes[i], nodes[j] = nodes[j], nodes[i]\n\t}\n\tfmt.Printf(\"%v\\n\", nodes)\n}\n```\n\nNow we are all set to use the DAG to model the dependencies as a graph and determine\nthe required execution order of tasks taking the dependencies into account.\n\n## Reconcile object state\n\nThe following fragments of the flow-package (/pkg/flow) illustrate how we can build\na reconciliation workflow on top of a DAG.\n\nThe nodes of the graph represent the tasks that have to be executed, the edges are dependencies\nbetween those tasks. 
In order to be able to reconcile the desired state that is modeled by our graph,\nwe need to create a companion data structure and a contract for the reconcile-tasks.\n\n```\n// Fn is where the task's logic must be placed.\n// If the task is successful, it must return nil.\n// If the task returns a FatalError, it indicates that it cannot be retried.\n// By returning any other error, the task indicates that it can be retried.\ntype Fn func(ctx context.Context, task *Task) error\n\ntype Task struct {\n\tid          int64\n\tdesc        string\n\treconcileFn Fn\n}\n```\n\nThe Workflow consists of a DAG that models the dependencies and associated Tasks for each node of the graph.\n\n```\ntype Workflow struct {\n\t// DAG\n\tgraph *simple.DirectedGraph\n\t// associated Tasks, key is nodeID\n\ttasks map[int64]*Task\n}\n```\n\nWe need some methods to add Tasks to the Workflow and define dependencies between Tasks. Errors can occur\nwhen the dependencies violate the DAG-property of the graph, i.e. if cyclic dependencies are added.\nIf we add a dependency from V5 to V1 to our example graph, a call to topo.SortStabilized will result in\n\"topo: no topological ordering: cyclic components: [[1 3 4 5]]\", where \"[1 3 4 5]\" shows the members\nof the cyclic component.\n\nEach reconcile traverses the stabilized topological task-sequence and calls the Tasks, so that each task\ncan reconcile its state with the legacy-system and make adjustments if necessary.\n\n```\n// Reconcile executes the workflow tasks in order and returns nil, if all tasks completed successfully.\n// If a FatalError is returned, the workflow failed and cannot be retried.\nfunc (w *Workflow) Reconcile(ctx context.Context) error {\n\ttasks, err := w.GetOrderedTasks()\n\tif err != nil {\n\t\treturn FatalError{err: err}\n\t}\n\n\tfor _, task := range tasks {\n\t\tif cancelErr := ctx.Err(); cancelErr == nil {\n\t\t\terr := task.reconcileFn(ctx, task)\n\t\t\t// the workflow runs unless some task returns an 
error\n\t\t\tif err != nil {\n\t\t\t\treturn err\n\t\t\t}\n\t\t} else {\n\t\t\t// the context was canceled: report this instead of signalling success\n\t\t\treturn cancelErr\n\t\t}\n\t}\n\treturn nil\n}\n```\n\nIt is the duty of the caller of reconcile to determine if the workflow ended successfully or with\na fatal or retryable error and act accordingly.\n\nIn the example, the reconcile-loop repeats the reconciliation after 5 seconds as long as the workflow\ndoes not complete successfully or end with a fatal error.\n\n```\n\tctx := context.Background()\n\terr := w.Reconcile(ctx)\n\tfor err != nil {\n\t\tvar fatalErr flow.FatalError\n\t\tif errors.As(err, \u0026fatalErr) {\n\t\t\t// a fatal error cannot be retried, so we give up\n\t\t\tlog.Fatal(fatalErr)\n\t\t} else {\n\t\t\tlog.Println(err)\n\t\t}\n\t\t// retry after some time\n\t\ttime.Sleep(5 * time.Second)\n\t\terr = w.Reconcile(ctx)\n\t}\n```\n\nIn the flow-example, each task will fail a configurable number of times before it signals success.\nWe can watch how the workflow reconciles each task in the right order, in multiple reconciliation\nruns until the desired state is reached.\n\n```\n2021/05/21 18:51:48 task 2 (create V2) \u003e\u003e task 5 (create V5) \u003e\u003e task 3 (create V3) \u003e\u003e task 4 (create V4) \u003e\u003e task 1 (create V1)\n2021/05/21 18:51:48 --- reconcile run 1 ---\n2021/05/21 18:51:48 reconcile task 2 (create V2) success          \u003c-- task 2 completed successfully\n2021/05/21 18:51:48 reconce task 5 (create V5) error, 1 remain    \u003c-- task 5 failed in this run, workflow is not complete\n2021/05/21 18:51:50 --- reconcile run 2 ---                       \u003c-- after two seconds, the next run starts\n2021/05/21 18:51:50 reconcile task 2 (create V2) ok               \u003c-- task 2 is still ok\n2021/05/21 18:51:50 reconcile task 5 (create V5) success          \u003c-- task 5 completed successfully\n2021/05/21 18:51:50 reconcile task 3 (create V3) success          \u003c-- task 3 completed successfully\n2021/05/21 18:51:50 reconce task 4 (create V4) error, 1 remain    \u003c-- task 4 failed in this run, workflow is not 
complete\n2021/05/21 18:51:52 --- reconcile run 3 ---\n2021/05/21 18:51:52 reconcile task 2 (create V2) ok\n2021/05/21 18:51:52 reconcile task 5 (create V5) ok\n2021/05/21 18:51:52 reconcile task 3 (create V3) ok\n2021/05/21 18:51:52 reconcile task 4 (create V4) success          \u003c-- task 4 completed successfully this time\n2021/05/21 18:51:52 reconce task 1 (create V1) error, 2 remain    \u003c-- task 1 failed, workflow not complete\n2021/05/21 18:51:54 --- reconcile run 4 ---\n2021/05/21 18:51:54 reconcile task 2 (create V2) ok\n2021/05/21 18:51:54 reconcile task 5 (create V5) ok\n2021/05/21 18:51:54 reconcile task 3 (create V3) ok\n2021/05/21 18:51:54 reconcile task 4 (create V4) ok\n2021/05/21 18:51:54 reconce task 1 (create V1) error, 1 remain    \u003c-- task 1 failed again, workflow not complete\n2021/05/21 18:51:56 --- reconcile run 5 ---\n2021/05/21 18:51:56 reconcile task 2 (create V2) ok\n2021/05/21 18:51:56 reconcile task 5 (create V5) ok\n2021/05/21 18:51:56 reconcile task 3 (create V3) ok\n2021/05/21 18:51:56 reconcile task 4 (create V4) ok\n2021/05/21 18:51:56 reconcile task 1 (create V1) success          \u003c-- task 1 completed successfully, the workflow is complete\n```\n\n## Final thoughts\n\nWe have explored how DAGs can help us to model dependencies between tasks and determine a correct\nexecution order for the tasks. 
Using the gonum-library, we can create and use DAGs in Go.\nWith some additional code we have built a tiny workflow package that allows us to execute\ninterdependent tasks in the right order.\n\nIn conjunction with the task contract and a reconciliation loop we can now reconcile complex\nstate in a caring and focused manner and be polite to our legacy-systems.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fx-cellent%2Fgo-dags","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fx-cellent%2Fgo-dags","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fx-cellent%2Fgo-dags/lists"}