{"id":13839374,"url":"https://github.com/keikoproj/governor","last_synced_at":"2026-01-11T22:51:23.176Z","repository":{"id":35172228,"uuid":"201136151","full_name":"keikoproj/governor","owner":"keikoproj","description":"A collection of cluster reliability tools for Kubernetes","archived":false,"fork":false,"pushed_at":"2024-08-24T06:09:45.000Z","size":5639,"stargazers_count":118,"open_issues_count":14,"forks_count":26,"subscribers_count":17,"default_branch":"master","last_synced_at":"2024-08-24T07:22:34.312Z","etag":null,"topics":["auto-healing","aws","eks","eks-cluster","kubernetes","kubernetes-cluster","kubernetes-node","kubernetes-pod","kubernetes-tools","self-healing"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/keikoproj.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-08-07T22:23:38.000Z","updated_at":"2024-08-24T06:09:47.000Z","dependencies_parsed_at":"2024-01-20T21:47:01.719Z","dependency_job_id":"86a7916f-efe5-450f-aa30-f12005416b94","html_url":"https://github.com/keikoproj/governor","commit_stats":null,"previous_names":[],"tags_count":15,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keikoproj%2Fgovernor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keikoproj%2Fgovernor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keikoproj%2Fgovernor/releases","manif
ests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keikoproj%2Fgovernor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/keikoproj","download_url":"https://codeload.github.com/keikoproj/governor/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225675073,"owners_count":17506273,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["auto-healing","aws","eks","eks-cluster","kubernetes","kubernetes-cluster","kubernetes-node","kubernetes-pod","kubernetes-tools","self-healing"],"created_at":"2024-08-04T17:00:20.918Z","updated_at":"2026-01-11T22:51:23.171Z","avatar_url":"https://github.com/keikoproj.png","language":"Go","funding_links":[],"categories":["OPS"],"sub_categories":[],"readme":"# governor\n\n[![Master Branch](https://github.com/keikoproj/governor/actions/workflows/push.yaml/badge.svg)](https://github.com/keikoproj/governor/actions/workflows/push.yaml)\n[![Last PR](https://github.com/keikoproj/governor/actions/workflows/unit-test.yaml/badge.svg)](https://github.com/keikoproj/governor/actions/workflows/unit-test.yaml)\n\n[![codecov](https://codecov.io/gh/keikoproj/governor/branch/master/graph/badge.svg)](https://codecov.io/gh/keikoproj/governor)\n[![Go Report Card](https://goreportcard.com/badge/github.com/keikoproj/governor)](https://goreportcard.com/report/github.com/keikoproj/governor)\n\n\u003e A collection of cluster reliability tools built for Kubernetes\n\nGovernor is a collection of tools for improving the stability of the large Kubernetes clusters as a single 
Docker image.\n\nFour common problems observed in large Kubernetes clusters are:\n\n1. Node failure due to underlying cloud provider issues.\n2. Pods being stuck in \"Terminating\" state and unable to be cleaned up.\n3. Pod Disruption Budgets (PDBs) blocking node rotation/drain.\n4. AWS data paths being blocked due to AZ failure.\n\n**node-reaper** provides the capability for worker nodes to be force terminated so that replacement ones come up.\n**pod-reaper** does a force termination of pods stuck in Terminating state for a certain amount of time.\n\nIn some cases, on multi-tenant platforms, users can own PDBs which block node rotation/drain if they are misconfigured, if pods are in crashloop backoff, or if multiple PDBs are targeting the same pods.\n\n**pdb-reaper** provides the capability for detecting PDBs in these conditions and deleting them. This is especially useful for pre-production environments where pods might be left around in crashloop, or where misconfigured PDBs may exist and block node drains.\n\nWhen pdb-reaper deletes PDBs, it does **NOT** recreate them; this is useful when GitOps is used. 
We recommend using pdb-reaper in non-production environments, or using the `--dry-run` flag to publish events without deleting PDBs.\n\nThere are many corner cases where deleting PDBs might be dangerous; please consider such cases when using pdb-reaper.\n\n**cordon** provides capabilities around cordoning specific data paths in AWS, for example excluding a specific NAT Gateway in case of AZ failure.\n\n## Usage\n\nAssuming a running, AWS-hosted Kubernetes cluster:\n\n```sh\nkubectl create namespace governor\n\n# Using a CronJob\nkubectl apply -n governor -f https://raw.githubusercontent.com/keikoproj/governor/master/examples/node-reaper.yaml\n\nkubectl apply -n governor -f https://raw.githubusercontent.com/keikoproj/governor/master/examples/pod-reaper.yaml\n\nkubectl apply -n governor -f https://raw.githubusercontent.com/keikoproj/governor/master/examples/pdb-reaper.yaml\n```\n\n### Available Packages\n\n| Package     | Description                         | Docs                                            |\n| :---------- | :---------------------------------- | :---------------------------------------------: |\n| node-reaper | terminates nodes in scaling groups  | [node-reaper](pkg/reaper/README.md#node-reaper) |\n| pod-reaper  | force terminates stuck pods         | [pod-reaper](pkg/reaper/README.md#pod-reaper)   |\n| pdb-reaper  | deletes blocking PDBs               | [pdb-reaper](pkg/reaper/README.md#pdb-reaper)   |\n| cordon      | helps with cordoning AWS data paths | [cordon](pkg/cordon/README.md#cordon)           |\n\n## Release History\n\nPlease see [CHANGELOG.md](.github/CHANGELOG.md).\n\n## ❤ Contributing ❤\n\nPlease see [CONTRIBUTING.md](.github/CONTRIBUTING.md).\n\n## Developer Guide\n\nPlease see 
[DEVELOPER.md](.github/DEVELOPER.md).\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkeikoproj%2Fgovernor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkeikoproj%2Fgovernor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkeikoproj%2Fgovernor/lists"}