{"id":17132197,"url":"https://github.com/bitsofinfo/kubernetes-helm-healthcheck-hook","last_synced_at":"2025-04-13T07:55:48.885Z","repository":{"id":43336029,"uuid":"188306607","full_name":"bitsofinfo/kubernetes-helm-healthcheck-hook","owner":"bitsofinfo","description":"Healthcheck script w/ Slack alerts, useful as a post-upgrade/install Helm Hook as a Kubernetes Job","archived":false,"fork":false,"pushed_at":"2022-03-07T15:11:27.000Z","size":592,"stargazers_count":7,"open_issues_count":0,"forks_count":1,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-13T07:55:44.867Z","etag":null,"topics":["continuous-delivery","continuous-integration","healthcheck","helm","kubernetes"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bitsofinfo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-05-23T21:06:16.000Z","updated_at":"2024-03-12T07:24:06.000Z","dependencies_parsed_at":"2022-09-10T08:00:16.285Z","dependency_job_id":null,"html_url":"https://github.com/bitsofinfo/kubernetes-helm-healthcheck-hook","commit_stats":null,"previous_names":[],"tags_count":20,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bitsofinfo%2Fkubernetes-helm-healthcheck-hook","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bitsofinfo%2Fkubernetes-helm-healthcheck-hook/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bitsofinfo%2Fkubernetes-helm-healthcheck-hook/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bitsofinfo
%2Fkubernetes-helm-healthcheck-hook/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bitsofinfo","download_url":"https://codeload.github.com/bitsofinfo/kubernetes-helm-healthcheck-hook/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248681491,"owners_count":21144700,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["continuous-delivery","continuous-integration","healthcheck","helm","kubernetes"],"created_at":"2024-10-14T19:26:20.544Z","updated_at":"2025-04-13T07:55:48.859Z","avatar_url":"https://github.com/bitsofinfo.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# kubernetes-helm-healthcheck-hook\n\nThis project provides a simple health checking script that can interrogate multiple\nURI paths for a given target FQDN, evaluate the responses, optionally send one\nor more alerts via Slack and then `exit` with an exit code of your choice if any\nof the checks fail.\n\n![diag](/doc/diag1.png \"Diagram1\")\n\n* [Background](#background)\n* [Install/Setup](#req)\n* [How it works](#how)\n* [Configuration](#config)\n* [Usage](#usage)\n* [Example: simple](#simple)\n* [Example: k8s helm hook](#hook)\n\n## \u003ca id=\"background\"\u003e\u003c/a\u003eBackground\n\nThis project came to be out of a need to cause Helm chart installs \u0026 upgrades against\nKubernetes to fail and appropriately alert DevOps when the installed/upgraded Chart\nyields a \"non-healthy\" result.\n\nIt can be easily adapted to existing Helm charts via a combination of a 
`ConfigMap` and\na `Job` annotated as a [Helm Hook](https://github.com/helm/helm/blob/master/docs/charts_hooks.md)\n`helm.sh/hook` of type `post-install` or `post-upgrade` (or both).\n\nYou can also use it independently of Kubernetes / Helm as a standalone utility.\n\n## \u003ca id=\"req\"\u003e\u003c/a\u003eInstall/Setup\n\nRun via Docker:\nhttps://hub.docker.com/r/bitsofinfo/kubernetes-helm-healthcheck-hook\n\nOtherwise:\n\n**Python 3.6+**\n\nDependencies: See [Dockerfile](Dockerfile)\n\n## \u003ca id=\"how\"\u003e\u003c/a\u003eHow it works\n\nThe main script is `checker.py`, which requires you to specify a `--target-root-url`\nplus at least one check defined in a *checks db YAML file* (example: [example/checksdb.yaml](example/checksdb.yaml)).\nEach check definition's `path` is appended to the target root URL and invoked.\n\nOptionally, if you specify `--slack-config-filename`, each `alert` you define in the\nslack YAML file will be executed per `checker.py` run (example: [example/slackconfig.yaml](example/slackconfig.yaml)).\nEach alert you configure is passed a Jinja2 context object containing the full details of the `checker.py` invocation,\nwhich you can use to render just about any message content you'd like. (To dump the context object to STDOUT, provide the `--debug-slack-jinja2-context` flag.)\n\nThis can be run many ways, such as:\n* a direct Python script invocation on your local machine (e.g. `./checker.py -h`)\n* via `docker run` using the [bitsofinfo/kubernetes-helm-healthcheck-hook](https://cloud.docker.com/repository/docker/bitsofinfo/kubernetes-helm-healthcheck-hook) image\n* as a Kubernetes `Job` configured as a Helm post upgrade/install [Hook](https://github.com/helm/helm/blob/master/docs/charts_hooks.md)\n* ... 
or any other way you wish!\n\n## \u003ca id=\"config\"\u003e\u003c/a\u003eConfiguration\n\nThe `checker.py` script takes various [command line arguments](#usage) combined\nwith YAML configuration containing the \"checks\" to execute and, optionally,\na slack alert config for the \"alerts\" to send.\n\nConfiguration documentation is provided inline in the examples:\n* [example/checksdb.yaml](example/checksdb.yaml) How you configure your \"checks\"\n* [example/slackconfig.yaml](example/slackconfig.yaml) How you configure your \"alerts\"\n\n## \u003ca id=\"simple\"\u003e\u003c/a\u003eSimple Example\n\nThe examples below use the sample config files located under [example](example/)\n\n```\ngit clone https://github.com/bitsofinfo/kubernetes-helm-healthcheck-hook.git\n\ncd kubernetes-helm-healthcheck-hook\n```\n\nLet's process all the *checks* defined in [example/checksdb.yaml](example/checksdb.yaml).\nThis will exit with a `1` because ONE of the checks fails (i.e. GET to `/status/500`).\nIt also sends 2 alerts as defined in the [example/slackconfig.yaml](example/slackconfig.yaml)\nhttps://bitsofinfo.slack.com/messages/CE46Z3TJA/ to the `#bitsofinfo-dev` channel ([self signup to channel](https://join.slack.com/t/bitsofinfo/shared_invite/enQtNjY1ODIzNTkyMDMyLTEzZGUwNzExOWYyMmZmMTQyYWZiYzJjYTJkNGI3MWMzNzQ3MTE2NzVhM2Q1ZjE4OGViYjA1NGY4MzdiZDg3ZWI)). 
We also\nspecify the `--extra-slack-context-props` argument, which provides those values to our slack\nalert jinja2 template in [example/slackconfig.yaml](example/slackconfig.yaml).\n```\ndocker run -v `pwd`/example:/configs \\\n  bitsofinfo/kubernetes-helm-healthcheck-hook:latest checker.py \\\n  --target-root-url https://postman-echo.com \\\n  --any-check-fail-exit-code 1 \\\n  --checksdb-filename /configs/checksdb.yaml \\\n  --slack-config-filename /configs/slackconfig.yaml \\\n  --extra-slack-context-props key1=val,key2=val2,key3=x\n\necho \"Exit code was: $?\"\n```\n\nNow let's process all the *checks* defined in [example/checksdb.yaml](example/checksdb.yaml)\nEXCEPT those tagged with `fail`. This will exit with a `0` because none of the evaluated checks\nfailed. It will also only send 1 alert (success only), because the 2nd alert configured in\n[example/slackconfig.yaml](example/slackconfig.yaml) only fires when a check is in a failed state.\n```\ndocker run -v `pwd`/example:/configs \\\n  bitsofinfo/kubernetes-helm-healthcheck-hook:latest checker.py \\\n  --target-root-url https://postman-echo.com \\\n  --any-check-fail-exit-code 1 \\\n  --checksdb-filename /configs/checksdb.yaml \\\n  --slack-config-filename /configs/slackconfig.yaml \\\n  --tags-disqualifier fail \\\n  --extra-slack-context-props key1=val,key2=val2,key3=x -v\n\necho \"Exit code was: $?\"\n```\n\nLet's process them all again with much more verbose debug output printed to STDOUT\nto let you start customizing your slack alert config and/or refine your checks.\n```\ndocker run -v `pwd`/example:/configs \\\n  bitsofinfo/kubernetes-helm-healthcheck-hook:latest checker.py \\\n  --target-root-url https://postman-echo.com \\\n  --any-check-fail-exit-code 1 \\\n  --checksdb-filename /configs/checksdb.yaml \\\n  --slack-config-filename /configs/slackconfig.yaml \\\n  --verbose-output \\\n  --debug-slack-jinja2-context \\\n  --extra-slack-context-props key1=val,key2=val2,key3=x\n\necho \"Exit code was: 
$?\"\n```\n\n## \u003ca id=\"hook\"\u003e\u003c/a\u003eKubernetes Helm Hook Example\n\nLet's say you deploy some custom app of yours with Helm and you'd like to follow\nit up with an immediate check to validate whether it's working and alert on the result. Well,\nlet's just use this project for that (see [Helm Hooks docs](https://github.com/helm/helm/blob/master/docs/charts_hooks.md)).\n\n1. Modify your app's Helm chart to generate an appropriate `ConfigMap` and `Job`\nproperly annotated as a Helm `helm.sh/hook` of type `post-install` or `post-upgrade` (or both).\n\n2. Now when you upgrade/install an app with your chart, your Helm status will properly\nreflect success or failure based on the exit code of the `checker.py` Job, and any\nconfigured alerts will be sent.\n\n3. When you delete your Helm release the ConfigMap will be purged, but the Job may or\nmay not remain depending on how you configure the `helm.sh/hook-delete-policy` annotation.\n\n4. Remember that since your Hook is a Kubernetes Job, it relies solely on the exit code to determine\nsuccess or failure: zero (0) is success and any non-zero code (e.g. 1) is failure. You also have full access to all k8s Job configuration options to\ntailor your retry behavior if desired (on top of the retry behavior you can configure in `checker.py`).\n\n5. Want to check more than just ONE `--target-root-url`? Then just generate multiple k8s Jobs.\n\nExample: your chart could now generate something like the below that interrogates anything you\nwish that points to the app your chart just deployed (i.e. 
you might check an `Ingress` pointing\nto the app your chart created.)\n\n```\n...\n---\ngenerate your app's Deployment...\n---\ngenerate your app's Service...\n---\ngenerate your app's version specific Ingress...\n\n# Here is our ConfigMap for our \"check\" + \"alert\" YAML configs\n---\napiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: {{ my-app-name }}-healthcheck-config\n  namespace: \"{{ my-namespace }}\"\ndata:\n  healthchecks.config.yaml: |\n    - path: \"/health\"\n      method: \"GET\"\n      timeout: 5\n      retries: 3\n  slackalerts.config.yaml: |\n    - name: \"Deployment Result\"\n      webhook_url: https://hooks.slack.com/services/xxxxxxxxx\n      template: \u003e\n        {\n          \"text\":\"*Deployment result: {{ target_root_url }}* {{ overall_result }}\"\n        }\n\n# Here is our Job that invokes checker.py using the ConfigMap above.\n---\napiVersion: batch/v1\nkind: Job\nmetadata:\n  name: \"{{ my-app-name }}--healthcheck\"\n  namespace: \"{{ my-namespace }}\"\n  annotations:\n    \"helm.sh/hook\": post-install,post-upgrade\n    \"helm.sh/hook-weight\": \"-5\"\n    \"helm.sh/hook-delete-policy\": before-hook-creation\nspec:\n  backoffLimit: 0 # retry 0 times, leverage checker.py built in retries\n  activeDeadlineSeconds: 600 # max run for 10 minutes (i.e. 
inclusive of retries)\n  template:\n    metadata:\n      name: \"{{ my-namespace }}-healthcheck\"\n    spec:\n      restartPolicy: Never # pod-level field; a Job requires Never or OnFailure\n      volumes:\n        - name: hc-config-volume\n          configMap:\n            name: {{ my-app-name }}-healthcheck-config # must match the ConfigMap name above\n      containers:\n        - name: {{ my-namespace }}-healthcheck\n          image: \"bitsofinfo/kubernetes-helm-healthcheck-hook:0.1.1\"\n          volumeMounts:\n            - name: hc-config-volume\n              mountPath: /etc/checker\n          command:\n            - \"checker.py\"\n          args:\n            - \"--target-root-url\"\n            - \"https://{{ my-app-ingress-fqdn }}\"\n            - \"--max-retries\"\n            - \"30\"\n            - \"--sleep-seconds\"\n            - \"10\"\n            - \"--any-check-fail-exit-code\"\n            - \"1\"\n            - \"--checksdb-filename\"\n            - \"/etc/checker/healthchecks.config.yaml\"\n            - \"--slack-config-filename\"\n            - \"/etc/checker/slackalerts.config.yaml\"\n\n```\n\n## \u003ca id=\"usage\"\u003e\u003c/a\u003eUsage\n\nFor the formats of the YAML `--checksdb-filename` and `--slack-config-filename` config files see:\n* [example/checksdb.yaml](example/checksdb.yaml) How you configure your \"checks\"\n* [example/slackconfig.yaml](example/slackconfig.yaml) How you configure your \"alerts\"\n\n```\n$\u003e./checker.py -h\n\nusage: checker.py [-h] [-u TARGET_ROOT_URL] [-i CHECKSDB_FILENAME]\n                  [-a SLACK_CONFIG_FILENAME] [-o OUTPUT_FILENAME]\n                  [-f OUTPUT_FORMAT] [-v] [-q TAGS_QUALIFIER]\n                  [-d TAGS_DISQUALIFIER] [-r MAX_RETRIES] [-n CHECK_NAME]\n                  [-t THREADS] [-s SLEEP_SECONDS] [-l LOG_LEVEL] [-b LOG_FILE]\n                  [-z] [-x ANY_CHECK_FAIL_EXIT_CODE] [-D]\n                  [-e EXTRA_SLACK_CONTEXT_PROPS]\n\noptional arguments:\n  -h, --help            show this help message and exit\n  -u TARGET_ROOT_URL, --target-root-url 
TARGET_ROOT_URL\n                        Required Target root URL (i.e. http[s]://whatever.com)\n                        where all checks defined in --checksdb-filename will\n                        execute against. Each check 'path' defined in\n                        --checksdb-filename will be APPENDED to this value.\n  -i CHECKSDB_FILENAME, --checksdb-filename CHECKSDB_FILENAME\n                        Required: Filename (YAML) of checks database that will\n                        be executed against the --target-root-url, default:\n                        'checksdb.yaml'\n  -a SLACK_CONFIG_FILENAME, --slack-config-filename SLACK_CONFIG_FILENAME\n                        Optional: Filename (YAML) containing the slack alert\n                        configuration. default: None\n  -o OUTPUT_FILENAME, --output-filename OUTPUT_FILENAME\n                        Optional: The result of the checks will be written to\n                        this output filename, default: None\n  -f OUTPUT_FORMAT, --output-format OUTPUT_FORMAT\n                        Output format: json or yaml, default 'json'\n  -v, --verbose-output  The result output will be in verbose mode, containing\n                        much more detail helpful in debugging. Default OFF\n  -q TAGS_QUALIFIER, --tags-qualifier TAGS_QUALIFIER\n                        Optional, only include 'checks' loaded in --checksdb-\n                        filename whos 'tags' attribute contains ONE or MORE\n                        values this comma delimited list of tags\n  -d TAGS_DISQUALIFIER, --tags-disqualifier TAGS_DISQUALIFIER\n                        Inverse of --tags-qualifier. 
Exclude 'checks' loaded\n                        in --checksdb-filename whos 'tags' attribute contains\n                        ONE or MORE values this comma delimited list of tags\n  -r MAX_RETRIES, --max-retries MAX_RETRIES\n                        Maximum retries per check, overrides those provided in\n                        --checksdb-filename, default 100\n  -n CHECK_NAME, --check-name CHECK_NAME\n                        Optional descriptive name for this invocation, default\n                        'no --job-name specified'\n  -t THREADS, --threads THREADS\n                        max threads for processing checks listed in\n                        --checksdb-filename, default 1, higher = faster\n                        completion, adjust as necessary to avoid DOSing...\n  -s SLEEP_SECONDS, --sleep-seconds SLEEP_SECONDS\n                        The MAX amount of time to sleep between all attempts\n                        for each service check; if \u003e 0, the actual sleep will\n                        be a RANDOM time from 0 to this value. Default 0\n  -l LOG_LEVEL, --log-level LOG_LEVEL\n                        log level, default DEBUG\n  -b LOG_FILE, --log-file LOG_FILE\n                        Path to log file, default None, STDOUT\n  -z, --stdout-result   Print check results to STDOUT in addition to --output-\n                        filename on disk (if specified)\n  -x ANY_CHECK_FAIL_EXIT_CODE, --any-check-fail-exit-code ANY_CHECK_FAIL_EXIT_CODE\n                        If ANY single check defined in --checksdb-filename\n                        fails or a general program error occurs, force a\n                        sys.exit(your-provided-exit-code). If all checks are\n                        successful the exit code will be 0. 
Default 1\n  -D, --debug-slack-jinja2-context\n                        Dumps a JSON debug output of the jinja2 object passed\n                        to the Slack jinja2 template\n  -e EXTRA_SLACK_CONTEXT_PROPS, --extra-slack-context-props EXTRA_SLACK_CONTEXT_PROPS\n                        Optional comma delimited of key=value,key2=value pairs\n                        that will be added to the 'context' object passed to\n                        the Slack Alert jinja2 templates under the key\n                        'extra_props'\n  -R, --verbose-debug-requests\n                        Verbosely debugs every request/response made to --log-\n                        file\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbitsofinfo%2Fkubernetes-helm-healthcheck-hook","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbitsofinfo%2Fkubernetes-helm-healthcheck-hook","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbitsofinfo%2Fkubernetes-helm-healthcheck-hook/lists"}