https://github.com/juliocesarscheidt/kube-pod-metrics-collector



# Kube Pod Metrics Collector


We use the Kubernetes API to retrieve all pods, then iterate over them to check their statuses. A pod counts as crashed when it has failed, or when it has been pending for more than X minutes (X is an option passed through a variable). For each crashed pod we increment a per-namespace counter, which is later sent to CloudWatch as a custom metric, where the data can be analysed and used to create alerts.
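The counting rule above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `count_crashed_pods` and the dict keys `namespace`, `phase` and `created_at` are hypothetical, and the sketch assumes the pending time is measured from the pod's creation timestamp.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def count_crashed_pods(pods, pending_mins_to_be_crashed, now=None):
    """Count pods considered crashed, grouped by namespace.

    `pods` is an iterable of dicts with the hypothetical keys
    "namespace", "phase" and "created_at" (a timezone-aware datetime).
    """
    now = now or datetime.now(timezone.utc)
    threshold = timedelta(minutes=pending_mins_to_be_crashed)
    crashed = Counter()
    for pod in pods:
        if pod["phase"] == "Failed":
            crashed[pod["namespace"]] += 1
        elif pod["phase"] == "Pending":
            # A pending pod only counts once it has waited longer than the threshold.
            if now - pod["created_at"] > threshold:
                crashed[pod["namespace"]] += 1
    return dict(crashed)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
pods = [
    {"namespace": "default", "phase": "Failed", "created_at": now - timedelta(hours=1)},
    {"namespace": "default", "phase": "Pending", "created_at": now - timedelta(minutes=5)},
    {"namespace": "kube-system", "phase": "Pending", "created_at": now - timedelta(seconds=30)},
    {"namespace": "kube-system", "phase": "Running", "created_at": now - timedelta(days=1)},
]
print(count_crashed_pods(pods, pending_mins_to_be_crashed=1, now=now))  # {'default': 2}
```

Passing `now` explicitly keeps the function deterministic, which makes the threshold logic easy to check in isolation.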

When running as a pod inside a Kubernetes cluster, we use a service account, which provides a bearer token for calling the Kubernetes API transparently.

When running as a standalone container, a kubeconfig file must be passed in to interact with a cluster.
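The two authentication modes could be selected roughly like this with the official `kubernetes` Python client. This is a sketch under assumptions: the helper names `kube_client_settings` and `build_core_v1_api` are hypothetical, and it assumes the `RUNNING_IN_KUBERNETES`, `KUBECONFIG` and `KUBECONTEXT` variables from the instructions below drive the choice.

```python
import os


def kube_client_settings(env):
    """Decide how to authenticate, based on environment-style settings."""
    if env.get("RUNNING_IN_KUBERNETES", "0") == "1":
        return {"mode": "in_cluster"}
    return {
        "mode": "kubeconfig",
        "config_file": env.get("KUBECONFIG", os.path.expanduser("~/.kube/config")),
        "context": env.get("KUBECONTEXT") or None,
    }


def build_core_v1_api(env=os.environ):
    """Return a CoreV1Api client configured for the detected mode."""
    # Imported lazily so the settings helper works without the package installed.
    from kubernetes import client, config

    settings = kube_client_settings(env)
    if settings["mode"] == "in_cluster":
        # Uses the service account token and CA certificate mounted into the pod.
        config.load_incluster_config()
    else:
        config.load_kube_config(
            config_file=settings["config_file"], context=settings["context"]
        )
    return client.CoreV1Api()
```

With a client in hand, `build_core_v1_api().list_pod_for_all_namespaces()` would return the pods to inspect.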

## Prerequisites

- The Kubernetes API must be reachable from wherever this pod/container is running.

- Sending metrics to CloudWatch requires a user with credentials for that. Instructions for creating this user are here: [Create CloudWatch User](./cloudwatch-user.md)
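For context on what those credentials are used for, here is a hedged sketch of publishing per-namespace counts with boto3's `put_metric_data`, which requires the `cloudwatch:PutMetricData` permission. The metric name `CrashedPods`, the CloudWatch namespace `KubePodMetrics` and both function names are assumptions for illustration, not values taken from the project.

```python
from datetime import datetime, timezone


def build_metric_data(crashed_by_namespace, timestamp=None):
    """Build the MetricData payload: one datum per Kubernetes namespace."""
    timestamp = timestamp or datetime.now(timezone.utc)
    return [
        {
            "MetricName": "CrashedPods",  # hypothetical metric name
            "Dimensions": [{"Name": "Namespace", "Value": namespace}],
            "Timestamp": timestamp,
            "Value": float(count),
            "Unit": "Count",
        }
        for namespace, count in sorted(crashed_by_namespace.items())
    ]


def send_to_cloudwatch(crashed_by_namespace, cw_namespace="KubePodMetrics"):
    """Publish the counts; credentials and region come from the environment."""
    import boto3  # imported lazily so the payload builder has no dependencies

    metric_data = build_metric_data(crashed_by_namespace)
    if metric_data:
        boto3.client("cloudwatch").put_metric_data(
            Namespace=cw_namespace, MetricData=metric_data
        )
```

Keeping the payload construction separate from the API call makes a dry run straightforward: build and log the payload, and skip `send_to_cloudwatch`.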

## Instructions

> Running as a container

```bash
# build image
docker image build -t docker.io/juliocesarmidia/kube-pod-metrics-collector:v1.0.0 ./src

# or pull from docker hub
docker image pull docker.io/juliocesarmidia/kube-pod-metrics-collector:v1.0.0

# run without sending metrics to CloudWatch - dry run
docker container run --rm -d \
--name pod-metrics \
--restart 'no' \
--network host \
-e RUNNING_IN_KUBERNETES='0' \
-e SCHEDULE_SECONDS_INTERVAL='60' \
-e PENDING_MINS_TO_BE_CRASHED='1' \
-e IGNORE_NAMESPACES='kube-public,kube-node-lease' \
-e SEND_TO_CLOUDWATCH='0' \
-e KUBECONFIG='/root/.kube/config' \
-e KUBECONTEXT="$(kubectl config current-context)" \
-v "$HOME/.kube/config:/root/.kube/config" \
docker.io/juliocesarmidia/kube-pod-metrics-collector:v1.0.0

# logs and stats
docker container logs -f pod-metrics
docker stats pod-metrics

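# run sending metrics to CloudWatch (requires AWS credentials)
# if the dry-run container is still running, remove it first (see "clean up"), since the name is reused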
docker container run --rm -d \
--name pod-metrics \
--restart 'no' \
--network host \
-e RUNNING_IN_KUBERNETES='0' \
-e SCHEDULE_SECONDS_INTERVAL='60' \
-e PENDING_MINS_TO_BE_CRASHED='1' \
-e IGNORE_NAMESPACES='kube-public,kube-node-lease' \
-e SEND_TO_CLOUDWATCH='1' \
-e AWS_ACCESS_KEY_ID \
-e AWS_SECRET_ACCESS_KEY \
-e AWS_DEFAULT_REGION \
-e KUBECONFIG='/root/.kube/config' \
-e KUBECONTEXT="$(kubectl config current-context)" \
-v "$HOME/.kube/config:/root/.kube/config" \
docker.io/juliocesarmidia/kube-pod-metrics-collector:v1.0.0

# clean up
docker container rm -f pod-metrics
```
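The environment variables passed in the commands above might be interpreted along these lines. This is only a sketch: the `Settings` class, the `load_settings` function and the default values are assumptions, and the real collector may parse them differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    running_in_kubernetes: bool
    schedule_seconds_interval: int
    pending_mins_to_be_crashed: int
    ignore_namespaces: frozenset
    send_to_cloudwatch: bool


def load_settings(env=os.environ):
    """Parse the collector's options from environment-style variables."""
    ignored = env.get("IGNORE_NAMESPACES", "")
    return Settings(
        # Flags are passed as the strings '0' or '1'.
        running_in_kubernetes=env.get("RUNNING_IN_KUBERNETES", "0") == "1",
        schedule_seconds_interval=int(env.get("SCHEDULE_SECONDS_INTERVAL", "60")),
        pending_mins_to_be_crashed=int(env.get("PENDING_MINS_TO_BE_CRASHED", "1")),
        # Comma-separated list; blank entries are dropped.
        ignore_namespaces=frozenset(
            name.strip() for name in ignored.split(",") if name.strip()
        ),
        send_to_cloudwatch=env.get("SEND_TO_CLOUDWATCH", "0") == "1",
    )
```

For example, `load_settings({"IGNORE_NAMESPACES": "kube-public,kube-node-lease"})` yields a `Settings` whose `ignore_namespaces` contains both names and whose `send_to_cloudwatch` is `False`, matching the dry-run command.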

> Running inside Kubernetes as a pod

```bash
# create a secret for CloudWatch sdk usage, with the AWS credentials
kubectl apply -f - <