Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/keikoproj/governor
A collection of cluster reliability tools for Kubernetes
https://github.com/keikoproj/governor
auto-healing aws eks eks-cluster kubernetes kubernetes-cluster kubernetes-node kubernetes-pod kubernetes-tools self-healing
Last synced: 2 months ago
JSON representation
A collection of cluster reliability tools for Kubernetes
- Host: GitHub
- URL: https://github.com/keikoproj/governor
- Owner: keikoproj
- License: apache-2.0
- Created: 2019-08-07T22:23:38.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2024-08-24T06:09:45.000Z (5 months ago)
- Last Synced: 2024-08-24T07:22:34.312Z (5 months ago)
- Topics: auto-healing, aws, eks, eks-cluster, kubernetes, kubernetes-cluster, kubernetes-node, kubernetes-pod, kubernetes-tools, self-healing
- Language: Go
- Homepage:
- Size: 5.38 MB
- Stars: 118
- Watchers: 17
- Forks: 26
- Open Issues: 14
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Code of conduct: .github/CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
Awesome Lists containing this project
- awesome-cloud-native - governor - A collection of cluster reliability tools for Kubernetes. (OPS)
README
# governor
[![Master Branch](https://github.com/keikoproj/governor/actions/workflows/push.yaml/badge.svg)](https://github.com/keikoproj/governor/actions/workflows/push.yaml)
[![Last PR](https://github.com/keikoproj/governor/actions/workflows/unit-test.yaml/badge.svg)](https://github.com/keikoproj/governor/actions/workflows/unit-test.yaml)[![codecov](https://codecov.io/gh/keikoproj/governor/branch/master/graph/badge.svg)](https://codecov.io/gh/keikoproj/governor)
[![Go Report Card](https://goreportcard.com/badge/github.com/keikoproj/governor)](https://goreportcard.com/report/github.com/keikoproj/governor)> A collection of cluster reliability tools built for Kubernetes
Governor is a collection of tools for improving the stability of the large Kubernetes clusters as a single Docker image.
Three common problems observed in large Kubernetes clusters are:
1. Node failure due to underlying cloud provider issues.
2. Pods being stuck in "Terminating" state and unable to be cleaned up.
3. Pod Disruption Budgets (PDBs) blocking node rotation/drain.
4. AWS data paths being blocked due to AZ failure.**node-reaper** provides the capability for worker nodes to be force terminated so that replacement ones come up.
**pod-reaper** does a force termination of pods stuck in Terminating state for a certain amount of time.In some cases, on multi-tenant platforms, users can own PDBs which block node rotation/drain if they are misconfigured, pods are in crashloop backoff, or if multiple PDBs are targeting the same pods.
**pdb-reaper** provides the capability for detecting PDBs in these conditions, and deleting them. This is especially useful for pre-production environments where pods might be left around in crashloop, or misconfigured PDBs may exist blocking node drains.
When pdb-reaper deletes PDBs, it does **NOT** recreate them, this is useful when GitOps is used. We recommend using pdb-reaper in non-production environments or use the `--dry-run` flag to have event publishing without deletion of PDBs.
There are many corner-cases where deleting PDBs might be dangerous, please consider such cases when using pdb-reaper.
**cordon** provides capabilities around cordoning specific data paths in AWS, for example excluding a specific NAT Gatway in case of AZ failure.
## Usage
Assuming an AWS-hosted running Kubernetes cluster:
```sh
kubectl create namespace governor# Using a CronJob
kubectl apply -n governor -f https://raw.githubusercontent.com/keikoproj/governor/master/examples/node-reaper.yamlkubectl apply -n governor -f https://raw.githubusercontent.com/keikoproj/governor/master/examples/pod-reaper.yaml
kubectl apply -n governor -f https://raw.githubusercontent.com/keikoproj/governor/master/examples/pdb-reaper.yaml
```### Available Packages
| Package | Description | Docs |
| :---------- | :---------------------------------- | :---------------------------------------------: |
| node-reaper | terminates nodes in scaling groups | [node-reaper](pkg/reaper/README.md#node-reaper) |
| pod-reaper | force terminates stuck pods | [pod-reaper](pkg/reaper/README.md#pod-reaper) |
| pdb-reaper | deletes blocking PDBs | [pdb-reaper](pkg/reaper/README.md#pdb-reaper) |
| cordon | helps with cordoning AWS data paths | [cordon](pkg/cordon/README.md#cordon) |## Release History
Please see [CHANGELOG.md](.github/CHANGELOG.md).
## ❤ Contributing ❤
Please see [CONTRIBUTING.md](.github/CONTRIBUTING.md).
## Developer Guide
Please see [DEVELOPER.md](.github/DEVELOPER.md).