https://github.com/vkuznet/podmanager
PodManager service to manage k8s pods based on alerts provided by AlertManager
- Host: GitHub
- URL: https://github.com/vkuznet/podmanager
- Owner: vkuznet
- License: mit
- Created: 2021-08-24T12:14:59.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-03-10T12:48:36.000Z (about 2 years ago)
- Last Synced: 2025-03-30T08:28:41.882Z (about 2 months ago)
- Language: Go
- Size: 21.5 KB
- Stars: 1
- Watchers: 4
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# PodManager
PodManager service to manage k8s pods based on alert information provided by AlertManager.
The PodManager workflow is as follows:
- it fetches all active alerts from AlertManager
- it matches its rules against the alerts
- if a matching alert is found, it performs the action defined by the rule on
  the given pod within its namespace, e.g. deleting and restarting the pod

The rules are defined as follows:
- `name` is used to match alert name
- `namespace` defines k8s namespace
- `pod` names the alert attribute that holds the pod name
- `env` defines k8s environment
- `action` defines which action to apply to the given pod

Here is an example configuration:
```
{
    "alert_manager": "http://alert-manager.url",
    "interval": 10,
    "rules": [
        {"name": "service is down", "namespace": "xxx", "pod": "apod", "action": "restart", "env": "k8s-prod"},
        {"name": "number of workflows is high", "namespace": "xyz", "pod": "apod", "action": "print"}
    ],
    "verbose": 1
}
```
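As a rough illustration, a configuration of this shape could be decoded in Go along the following lines. This is only a sketch: the `Config` and `Rule` types and the `ParseConfig` helper are hypothetical names chosen for this example and are not taken from the PodManager source; only the JSON keys come from the configuration shown above.

```go
package main

import "encoding/json"

// Rule mirrors one entry of the "rules" array.
// The struct itself is hypothetical; only the JSON keys come from the README.
type Rule struct {
	Name      string `json:"name"`      // alert name to match
	Namespace string `json:"namespace"` // k8s namespace of the pod
	Pod       string `json:"pod"`       // alert attribute that holds the pod name
	Env       string `json:"env"`       // k8s environment (optional in the example)
	Action    string `json:"action"`    // action to apply, e.g. "restart" or "print"
}

// Config mirrors the top-level configuration object (hypothetical type).
type Config struct {
	AlertManager string `json:"alert_manager"`
	Interval     int    `json:"interval"`
	Rules        []Rule `json:"rules"`
	Verbose      int    `json:"verbose"`
}

// ParseConfig decodes a JSON configuration document into a Config value.
func ParseConfig(data []byte) (Config, error) {
	var cfg Config
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}
```

Calling `ParseConfig` on the bytes of the example file would yield a `Config` with two rules; the second rule has an empty `Env` because the example omits that key.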
The `interval` defines how often the service polls the given AlertManager.
Each rule specifies the alert name, namespace, env, the alert attribute that
holds the pod name, and the action to apply. The first rule watches for an
alert named `service is down` in the `k8s-prod` env; if one is found, it reads
the pod name from the alert's `apod` attribute and applies the `restart` action
to that pod within the `xxx` namespace.
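The rule-matching step described above might look roughly like the following sketch. It makes several assumptions that the README does not confirm: alerts are reduced to a flat map of labels, the alert name is read from the conventional `alertname` label, the environment is read from a label called `env`, and a rule without `env` matches any environment. The `Alert`, `Rule`, `Match` types and the `MatchAlerts` function are hypothetical names, not PodManager's own.

```go
package main

// Alert is a simplified, hypothetical view of an AlertManager alert:
// only its labels are kept.
type Alert struct {
	Labels map[string]string
}

// Rule holds the fields of one configuration rule (hypothetical type).
type Rule struct {
	Name      string
	Namespace string
	Pod       string // name of the alert attribute that carries the pod name
	Env       string
	Action    string
}

// Match pairs a rule with the concrete pod name taken from a matching alert.
type Match struct {
	Rule Rule
	Pod  string
}

// MatchAlerts returns one Match per (alert, rule) pair where the alert name
// equals the rule name, the env agrees (assumption: an empty rule env matches
// everything), and the alert carries a non-empty pod attribute.
func MatchAlerts(rules []Rule, alerts []Alert) []Match {
	var matches []Match
	for _, a := range alerts {
		for _, r := range rules {
			if a.Labels["alertname"] != r.Name {
				continue
			}
			if r.Env != "" && a.Labels["env"] != r.Env {
				continue
			}
			pod := a.Labels[r.Pod]
			if pod == "" {
				continue // alert has no usable pod attribute
			}
			matches = append(matches, Match{Rule: r, Pod: pod})
		}
	}
	return matches
}
```

Each returned `Match` carries the namespace, action, and pod name, which is the information a caller would need to apply the action through the Kubernetes API.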