https://github.com/zephinzer/healthcheckr

Simple but effective healthcheck tool with alerting
Host: GitHub
URL: https://github.com/zephinzer/healthcheckr
Owner: zephinzer
License: mit
Created: 2024-03-29T09:47:01.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-06-05T15:31:45.000Z (12 months ago)
Last Synced: 2025-02-01T19:13:55.134Z (4 months ago)
Language: Go
Size: 44.9 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Healthcheckr

- [Healthcheckr](#healthcheckr)
- [Usage](#usage)
  - [Job mode](#job-mode)
  - [Worker mode](#worker-mode)
    - [Worker configuration documentation](#worker-configuration-documentation)
      - [Configuration properties](#configuration-properties)
      - [Channel type configuration](#channel-type-configuration)
  - [Debug mode](#debug-mode)
    - [Telegram chat ID retrieval](#telegram-chat-id-retrieval)
    - [Utility HTTP server](#utility-http-server)
- [Development](#development)
  - [Getting started](#getting-started)
  - [Executing via `go`](#executing-via-go)
- [License](#license)

# Usage

## Job mode

To run `healthcheckr` in job mode:

```sh
# with defaults
healthcheckr verify http;

# with all available flags
healthcheckr verify http \
  --expect-body-regex 'google' \
  --expect-response-time-ms 1000 \
  --expect-status-code 200 \
  --log-level 4 \
  --use-hostname yahoo.com \
  --use-method get \
  --use-path / \
  --use-query abc=def \
  --use-query ghi=jkl \
  --use-scheme https \
  --use-user-agent healthcheckr/example;

# get help on available flags
healthcheckr verify http --help;
```
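To make the flags above more concrete, here is a minimal Go sketch of what a single HTTP check could look like: it assembles the URL from its components, sends the request, and compares the response against the expectations. This is not taken from the `healthcheckr` source; the `httpCheck` type and its `targetURL`, `verify`, and `run` methods are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"regexp"
	"strings"
	"time"
)

// httpCheck loosely mirrors the CLI flags; the field names are illustrative.
type httpCheck struct {
	Scheme, Hostname, Path, Method, UserAgent string
	Queries                                   []string // "key=value" pairs
	ExpectStatusCode                          int
	ExpectBodyRegex                           string
	ExpectResponseMs                          int64
}

// targetURL assembles the request URL from scheme, hostname, path and queries.
func (c httpCheck) targetURL() string {
	q := url.Values{}
	for _, kv := range c.Queries {
		k, v, _ := strings.Cut(kv, "=")
		q.Add(k, v)
	}
	u := url.URL{Scheme: c.Scheme, Host: c.Hostname, Path: c.Path, RawQuery: q.Encode()}
	return u.String()
}

// verify compares an observed response with the expectations and returns
// one message per failed expectation (an empty slice means healthy).
func (c httpCheck) verify(status int, body string, elapsed time.Duration) []string {
	var failures []string
	if status != c.ExpectStatusCode {
		failures = append(failures, fmt.Sprintf("status %d, expected %d", status, c.ExpectStatusCode))
	}
	if c.ExpectBodyRegex != "" {
		matched, err := regexp.MatchString(c.ExpectBodyRegex, body)
		if err != nil {
			failures = append(failures, "invalid body regex: "+err.Error())
		} else if !matched {
			failures = append(failures, "body does not match "+c.ExpectBodyRegex)
		}
	}
	if elapsed.Milliseconds() > c.ExpectResponseMs {
		failures = append(failures, fmt.Sprintf("took %dms, expected at most %dms", elapsed.Milliseconds(), c.ExpectResponseMs))
	}
	return failures
}

// run performs the request and verifies the live response.
func (c httpCheck) run(client *http.Client) ([]string, error) {
	req, err := http.NewRequest(strings.ToUpper(c.Method), c.targetURL(), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", c.UserAgent)
	start := time.Now()
	res, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer res.Body.Close()
	body, err := io.ReadAll(res.Body)
	if err != nil {
		return nil, err
	}
	return c.verify(res.StatusCode, string(body), time.Since(start)), nil
}

func main() {
	check := httpCheck{
		Scheme: "https", Hostname: "yahoo.com", Path: "/", Method: "get",
		UserAgent:        "healthcheckr/example",
		Queries:          []string{"abc=def", "ghi=jkl"},
		ExpectStatusCode: 200, ExpectBodyRegex: "google", ExpectResponseMs: 1000,
	}
	fmt.Println(check.targetURL())
	// Verify a canned response offline; a live check would call
	// check.run(&http.Client{Timeout: 5 * time.Second}) instead.
	fmt.Println(check.verify(200, "no match here", 1500*time.Millisecond))
}
```

Keeping `verify` separate from `run` lets the expectation logic be exercised without network access.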

## Worker mode

`healthcheckr` can also run as a long-running worker process with multiple healthchecks configured.

```sh
# with defaults
healthcheckr start worker;

# with all available flags
healthcheckr start worker \
  --config-path /path/to/config/file.yaml \
  --server-addr 0.0.0.0 \
  --server-port 8080;

# get help on flags
healthcheckr start worker --help;
```

Configuration is done via YAML:

```yaml
http:
  - scheme: https
    hostname: google.com
    path: /
    queries:
      - aaa=bbb
      - bbb=ccc
    method: get
    userAgent: healthcheckr/1.0/example
    timeoutMs: 3000
    expectStatusCode: 200
    expectBodyRegexes:
      - Google
    intervalMs: 5000
    failureThreshold: 3
    channels:
      - telegram-default
channels:
  - name: telegram-default
    type: telegram
    apiKey:
      fromEnv: TELEGRAM_BOT_TOKEN
    chatId:
      value: "123456789"
  - name: slack-default
    type: slack
    url:
      fromEnv: SLACK_WEBHOOK_URL
```

An example is available at [`examples/config.yaml`](examples/config.yaml).

### Worker configuration documentation

#### Configuration properties

| Property | Type | Description |
| --- | --- | --- |
| `http[]` | `list(object)` | `http` is the root-level property that defines a list of HTTP-based checks. |
| `http[].scheme` | `string` | This defines the scheme component of the URL. Only `http` and `https` are supported. |
| `http[].hostname` | `string` | This defines the hostname component of the URL. |
| `http[].path` | `string` | This defines the path component of the URL. Defaults to `"/"`. |
| `http[].queries` | `list(string)` | This defines a list of `key=value` strings which are used as the query string of the request. |
| `http[].method` | `string` | This defines the method to use for the request. Defaults to a `"GET"` request. |
| `http[].userAgent` | `string` | This defines a custom User-Agent string to use for the request. Defaults to `"healthcheckr/1.0"`. |
| `http[].timeoutMs` | `integer` | This defines a timeout for the request in milliseconds. Defaults to `5000` milliseconds. |
| `http[].expectStatusCode` | `integer` | This defines the expected integer status code of the response. Defaults to `200`. |
| `http[].expectBodyRegexes` | `list(string)` | This defines a list of regular expressions to match against the response body. |
| `http[].intervalMs` | `integer` | This defines the interval between checks in milliseconds. Defaults to `5000` milliseconds. |
| `http[].failureThreshold` | `integer` | This defines the maximum number of check failures before a notification is sent to the check's channels. |
| `http[].alertMinimumIntervalS` | `integer` | This defines the minimum duration between alerts in seconds. With a value of `60`, failure notifications are sent at most once every 60 seconds, even if failures continue beyond the `failureThreshold`. |
| `http[].channels` | `list(string)` | This defines a list of channel names which this HTTP check should notify upon failure/resolution. Each name must match a `channels[].name` value, otherwise an error will be thrown. |
| `channels[]` | `list(object)` | `channels` is a root-level property defining a list of channels to which checks can send notifications. |
| `channels[].name` | `string` | This defines the name of the channel and must be unique across all channels. |
| `channels[].type` | `string` | This defines the type of channel, which affects how the `apiKey`, `chatId`, and `url` properties are consumed. Currently only `"telegram"` and `"slack"` are supported. See "Channel type configuration" below for the fields each type requires. |
| `channels[].apiKey` | `ChannelValue` | This defines the API key to use for this channel where applicable. |
| `channels[].apiKey.fromEnv` | `string` | This defines the environment variable from which to retrieve the value of the API key. |
| `channels[].apiKey.value` | `string` | This defines the literal value of the API key. Takes precedence over the `.fromEnv` property. |
| `channels[].chatId` | `ChannelValue` | This defines the chat ID to use for this channel where applicable. |
| `channels[].chatId.fromEnv` | `string` | This defines the environment variable from which to retrieve the value of the chat ID. |
| `channels[].chatId.value` | `string` | This defines the literal value of the chat ID. Takes precedence over the `.fromEnv` property. |
| `channels[].url` | `ChannelValue` | This defines the URL to which notifications are sent, where applicable. |
| `channels[].url.fromEnv` | `string` | This defines the environment variable from which to retrieve the value of the URL. |
| `channels[].url.value` | `string` | This defines the literal value of the URL. Takes precedence over the `.fromEnv` property. |
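The interaction between `failureThreshold` and `alertMinimumIntervalS` can be sketched in Go as follows. This is an illustration under stated assumptions, not the tool's actual logic: it assumes failures are counted consecutively, that a successful check resets the counter, and that the first alert fires once the count reaches the threshold. The `alertGate` type and `observe` method are hypothetical names, and resolution notifications are left out for brevity.

```go
package main

import "fmt"

// alertGate decides when a failing check should notify its channels.
// The type and its fields are illustrative, not part of healthcheckr.
type alertGate struct {
	FailureThreshold int   // corresponds to failureThreshold
	MinIntervalS     int64 // corresponds to alertMinimumIntervalS

	failures    int   // assumed: consecutive failures seen so far
	alerted     bool  // whether any alert has been sent yet
	lastAlertAt int64 // time of the last alert, in seconds
}

// observe records one check result at time nowS (in seconds) and reports
// whether a failure alert should be sent for it.
func (g *alertGate) observe(healthy bool, nowS int64) bool {
	if healthy {
		g.failures = 0 // assumed: a success resets the failure count
		return false
	}
	g.failures++
	if g.failures < g.FailureThreshold {
		return false // not enough failures yet
	}
	if g.alerted && nowS-g.lastAlertAt < g.MinIntervalS {
		return false // throttled by the minimum alert interval
	}
	g.alerted = true
	g.lastAlertAt = nowS
	return true
}

func main() {
	gate := &alertGate{FailureThreshold: 3, MinIntervalS: 60}
	// Simulate a check that fails every 5 seconds for 90 seconds.
	for t := int64(0); t <= 90; t += 5 {
		if gate.observe(false, t) {
			fmt.Printf("t=%ds: send failure alert\n", t)
		}
	}
}
```

Passing the current time in as a parameter keeps the gate deterministic, which makes the throttling behaviour easy to reason about and test.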

#### Channel type configuration

When `"telegram"` is used for `channels[].type`:

- Set the `apiKey` to the bot token issued by @BotFather

- Set the `chatId` to the ID of the chat to which notifications should be sent

When `"slack"` is used:

- Set the `url` property to the webhook URL
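As a rough illustration of how these fields could be consumed, the Go sketch below resolves a `ChannelValue` (literal `value` first, then `fromEnv`) and builds the outgoing request for each channel type. It is not taken from the `healthcheckr` source: the `channelValue`, `channel`, and `request` types and the `resolve` and `notification` functions are hypothetical. The Telegram branch targets the Bot API `sendMessage` method and the Slack branch posts a JSON `text` payload to the webhook URL; whether the tool formats its messages this way is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"os"
)

// channelValue mirrors the ChannelValue shape from the configuration table.
type channelValue struct {
	FromEnv string
	Value   string
}

// resolve returns the literal value when set, otherwise looks up the
// named environment variable through getenv (os.Getenv in production).
func (v channelValue) resolve(getenv func(string) string) string {
	if v.Value != "" {
		return v.Value
	}
	if v.FromEnv != "" {
		return getenv(v.FromEnv)
	}
	return ""
}

// channel is an illustrative stand-in for one entry under `channels`.
type channel struct {
	Name, Type          string
	APIKey, ChatID, URL channelValue
}

// request describes the HTTP POST that would deliver a notification.
type request struct {
	Endpoint, ContentType, Body string
}

// notification builds the request for the channel's type.
func (c channel) notification(text string, getenv func(string) string) (request, error) {
	switch c.Type {
	case "telegram":
		form := url.Values{"chat_id": {c.ChatID.resolve(getenv)}, "text": {text}}
		return request{
			Endpoint:    "https://api.telegram.org/bot" + c.APIKey.resolve(getenv) + "/sendMessage",
			ContentType: "application/x-www-form-urlencoded",
			Body:        form.Encode(),
		}, nil
	case "slack":
		payload, err := json.Marshal(map[string]string{"text": text})
		if err != nil {
			return request{}, err
		}
		return request{Endpoint: c.URL.resolve(getenv), ContentType: "application/json", Body: string(payload)}, nil
	default:
		return request{}, fmt.Errorf("unsupported channel type %q", c.Type)
	}
}

func main() {
	channels := []channel{
		{Name: "telegram-default", Type: "telegram",
			APIKey: channelValue{FromEnv: "TELEGRAM_BOT_TOKEN"}, ChatID: channelValue{Value: "123456789"}},
		{Name: "slack-default", Type: "slack", URL: channelValue{FromEnv: "SLACK_WEBHOOK_URL"}},
	}
	for _, c := range channels {
		req, err := c.notification("check failed", os.Getenv)
		if err != nil {
			fmt.Println(c.Name, "error:", err)
			continue
		}
		// The endpoint is not printed because it may contain a secret.
		fmt.Println(c.Name, req.ContentType, req.Body)
	}
}
```

Injecting the environment lookup as a function keeps `resolve` free of global state, so the precedence rule can be checked without touching the real environment.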

## Debug mode

### Telegram chat ID retrieval

To use a Telegram channel for alerting:

1. Create a new bot with [@BotFather](https://t.me/BotFather) to receive a Telegram bot token (referred to as `${BOT_TOKEN}` from here on). Export it in your terminal with `export BOT_TOKEN=<your bot token>`, or define it in a local `.envrc` file. For container deployments, supply it through the deployment manifest using your platform's recommended mechanism for secrets.

2. Start `healthcheckr` in Telegram debug mode while specifying the `${BOT_TOKEN}`:

    ```sh
    healthcheckr debug telegram --bot-token "${BOT_TOKEN}";
    ```

3. Add the Telegram bot to a chat

4. Use `/info` in that chat to trigger a response that will indicate the chat ID (referred to as `${CHAT_ID}` from here on)

5. Add the Telegram bot token and chat ID as a channel in the configuration file, then start `healthcheckr` with it:

    ```yaml
    # ... other properties ...
    channels:
      # ... other channels ...
      - name: telegram-01
        type: telegram
        apiKey:
          fromEnv: BOT_TOKEN
        chatId:
          value: "${CHAT_ID}"
    # ... other properties ...
    ```

6. Use the channel by specifying its name in the check configuration:

    ```yaml
    # ... other properties ...
    http:
      - scheme: https
        hostname: google.com
        # ... other properties ...
        channels:
          - telegram-01
    # ... other properties ...
    ```

### Utility HTTP server

A debug/utility HTTP server is included as part of the tool. To execute it, run:

```sh
healthcheckr debug http;
```

A server should start on port `8080`. It responds on `/` with either a `200` or a `500` status code, depending on whether failure mode is turned on. To toggle the mode, send a request to the `/mode` endpoint with `curl`:

```sh
curl localhost:8080/mode;
```

You should observe one of the following logs:

```sh
INFO[3886] fail mode toggled, / will now return 500
INFO[3931] fail mode toggled, / will now return 200
```

The configuration file at [`./examples/config-local.yaml`](./examples/config-local.yaml) uses this endpoint and can be used to verify the notification workflow.
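For readers curious how such a toggle endpoint works, the following Go sketch shows one way to build it. It is a simplified illustration and not the tool's implementation: the `toggleServer` type is a hypothetical name, and the demo drives the handler in-process with `httptest` instead of listening on port `8080`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// toggleServer answers / with 200 or 500 depending on its fail mode.
type toggleServer struct {
	mu      sync.Mutex
	failing bool
}

// statusFor maps the fail mode to the status code served on /.
func statusFor(failing bool) int {
	if failing {
		return http.StatusInternalServerError
	}
	return http.StatusOK
}

// status returns the code that / currently responds with.
func (s *toggleServer) status() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return statusFor(s.failing)
}

// toggle flips the fail mode and returns the code / will now respond with.
func (s *toggleServer) toggle() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.failing = !s.failing
	return statusFor(s.failing)
}

// handler wires up the two routes: /mode toggles, everything else reports.
func (s *toggleServer) handler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/mode", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "fail mode toggled, / will now return %d\n", s.toggle())
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(s.status())
	})
	return mux
}

func main() {
	h := (&toggleServer{}).handler()
	// Drive the handler without opening a socket. To serve it for real,
	// pass h to http.ListenAndServe(":8080", h).
	for _, path := range []string{"/", "/mode", "/"} {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
		fmt.Println(path, rec.Code)
	}
}
```

The mutex guards the flag because HTTP handlers run concurrently, so a check request may arrive while the mode is being toggled.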

# Development

## Getting started

Run the following to do a smoke test on Google:

```sh
go run . verify http;
```

To test the worker mode locally, you'll need to:

1. Create a Telegram bot (ask @BotFather)

2. Create a Slack webhook (add the app in Slack)

3. Start the utility HTTP server (instructions above)

Then create a `.envrc` file and define the following:

```sh
export TELEGRAM_BOT_TOKEN=xxx
export TELEGRAM_CHAT_ID=xxx
export SLACK_WEBHOOK_URL=xxx
```

And then run:

```sh
go run . start worker -c ./examples/config-local.yaml;
```

You can toggle the mode of the utility HTTP server to observe the notifications workflow:

```sh
curl localhost:8080/mode;
```

## Executing via `go`

To execute locally without compiling, replace `healthcheckr` in the commands above with `go run .`.

# License

This tool is licensed under the permissive MIT license.