https://github.com/jrxFive/consulalerting

Alert groups, teams, individuals by tags and plugins

[![Build Status](https://travis-ci.org/jrxFive/consulalerting.svg?branch=master)](https://travis-ci.org/jrxFive/consulalerting)
[![Coverage Status](https://coveralls.io/repos/jrxFive/consulalerting/badge.svg?branch=master&service=github)](https://coveralls.io/github/jrxFive/consulalerting?branch=master)

# Consul Alerting

[![Join the chat at https://gitter.im/jrxFive/consulalerting](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/jrxFive/consulalerting?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

A set of Python files for Consul checks, watches, and notifications. By using tags on services and checks,
consulalerting notifies the corresponding groups through whichever plugins are also in the tag list. For example,
the redis service below has the "hipchat" plugin enabled and routes notifications via Hipchat to "devops" and "dev".
These routes are defined in the Consul KV under alerting/notify/ and can be set up using ConsulAlertingKVBoostrap.py, the Consul KV, or programmatically.

# High Availability
Consulalerting is not a separate daemon: each time a watch fires, Consul itself invokes the WatchCheckHandler. Install consulalerting on each
Consul server and register the watch there, so that notifications are still sent when the local Consul server instance is down. The first
consulalerting instance to acquire a lock for the md5sum hash of the current catalog processes the corresponding notifications. The lock uses
Consul's session locking feature with a TTL, so if that consulalerting instance fails, that specific hash of the catalog will still be
reported on later. As long as the Consul servers themselves are not in a failed state, consulalerting will continue to notify.
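
The locking flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the function names, the `alerting/locks/` key prefix, the agent address, and the 30s TTL are all hypothetical. Only the Consul HTTP endpoints used (`/v1/session/create` and `/v1/kv/<key>?acquire=<session>`) are standard.

```python
import hashlib
import json
import urllib.request

CONSUL = "http://localhost:8500"  # assumed local agent address


def catalog_hash(checks):
    """Return an md5 hex digest of the check list, independent of dict key order."""
    canonical = json.dumps(checks, sort_keys=True).encode("utf-8")
    return hashlib.md5(canonical).hexdigest()


def _put(path, body):
    """Send a PUT to the Consul HTTP API and decode the JSON reply."""
    req = urllib.request.Request(CONSUL + path, data=body, method="PUT")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def try_lock(checks, ttl="30s"):
    """Try to acquire the lock for this catalog state; True means 'this instance notifies'."""
    session = _put("/v1/session/create", json.dumps({"TTL": ttl}).encode("utf-8"))
    key = "/v1/kv/alerting/locks/" + catalog_hash(checks)  # hypothetical key layout
    return _put(key + "?acquire=" + session["ID"], b"") is True
```

Because every server hashes the same catalog state to the same key, only one `try_lock` call per state can succeed; the TTL lets the session expire if the lock holder dies.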

# Using Tags to notify

```javascript
{
  "service": {
    "name": "redis",
    "tags": ["devops", "master", "hipchat", "dev"],
    "port": 8000,
    "checks": [{
      "script": "/usr/local/bin/check_redis.py",
      "interval": "10s"
    }]
  }
}
```
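
To illustrate how the tags in a service definition like the one above might be turned into routes, here is a small sketch. The function name `route_tags` and the room ids are made up for illustration, and the real lookup in consulalerting may differ. A plugin is considered enabled when its name appears in the tags, and each remaining tag is matched against that plugin's configured targets.

```python
def route_tags(tags, plugin_settings):
    """Map each enabled plugin to the targets whose tag appears in `tags`."""
    routes = {}
    for plugin, targets in plugin_settings.items():
        if plugin not in tags:
            continue  # plugin not enabled for this service or check
        matched = {tag: targets[tag] for tag in tags if tag in targets}
        if matched:
            routes[plugin] = matched
    return routes


tags = ["devops", "master", "hipchat", "dev"]
settings = {
    "hipchat": {"devops": 3, "dev": 5},   # hypothetical room ids
    "slack": {"techops": "#techops"},     # not enabled: "slack" is not a tag
}
print(route_tags(tags, settings))  # {'hipchat': {'devops': 3, 'dev': 5}}
```

Tags such as `master` that match neither a plugin nor a target are simply ignored in this sketch.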

# Install steps

```bash
git clone https://github.com/jrxFive/consulalerting
cp -r consulalerting/consulalerting
# edit ConsulAlertingKVBoostrap.py, or set the values up manually through the Consul KV interface
python consulalerting/ConsulAlertingKVBoostrap.py
```

After installing consulalerting in a directory of your choosing, edit and run ConsulAlertingKVBootstrap.py so that
the scripts can obtain the KV information they need from Consul.

## Example ConsulAlertingKVBoostrap.py
```python
# Nodes, services, and checks that should never trigger notifications
blacklist_nodes = ["fqdn", "other_fqdn"]
blacklist_services = ["redis"]
blacklist_checks = ["service:redis"]

# Tags applied to non-application health checks
health_check_tags = ["devops", "hipchat", "techops"]

notify_hipchat = {"api_token": "",
                  "url": "https://api.hipchat.com/v1/rooms/message",
                  "rooms": {"devops": 3,
                            "techops": 4}
                  }

notify_slack = {"api_token": "",
                "rooms": {"techops": "#techops"}
                }

notify_mailgun = {"api_token": "",
                  "mailgun_domain": "",
                  "from": "[email protected]",
                  "teams": {"devops": ["[email protected]", "[email protected]"],
                            "qa": "[email protected]"}
                  }

notify_email = {"mail_domain_address": "email.domain.com",
                "username": "",
                "password": "",
                "from": "[email protected]",
                "teams": {"devops": ["[email protected]", "[email protected]"],
                          "qa": "[email protected]"}
                }

# Each team maps to its PagerDuty service_key
notify_pagerduty = {"teams": {"devops": ""}
                    }

notify_influxdb = {"url": "http://localhost:8086/write",
                   "series": "test",
                   "databases": {"db": "mydb"}
                   }

notify_elasticsearchlog = {"logpaths": ["/path/to/log1"]}

notify_cachet = {"api_token": "tokenFromCachetUserProfile",
                 "site_url": "http://status.company.com",
                 "notify_subscribers": False
                 }
```

| Variable Name | Type | Description |
| ------------- | ---- | ----------- |
| blacklist_nodes | List | Nodes whose state changes should not trigger notifications, matched by "Node" name in /v1/health/node/ |
| blacklist_services | List | Services that should not trigger notifications, matched by "ServiceName" in /v1/health/node/ |
| blacklist_checks | List | Checks that should not trigger notifications, matched by "CheckID" in /v1/health/node/ |
| health_check_tags | List | Tags that determine whom to alert, and with which type of alert, for non-application checks |
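
For setting the routes up programmatically, a sketch along the following lines may help. It serialises each plugin's settings as JSON and writes them with Consul's standard KV HTTP API. The helper names, the agent address, and the exact key layout beneath `alerting/notify/` are assumptions; check the bootstrap script for the keys consulalerting actually reads.

```python
import json
import urllib.request

CONSUL_KV = "http://localhost:8500/v1/kv/"  # assumed local agent address


def kv_entries(plugins, prefix="alerting/notify/"):
    """Build (key, JSON bytes) pairs, one per plugin settings dict."""
    return [(prefix + name, json.dumps(cfg, sort_keys=True).encode("utf-8"))
            for name, cfg in sorted(plugins.items())]


def push(entries):
    """PUT each entry to Consul's KV HTTP API."""
    for key, value in entries:
        req = urllib.request.Request(CONSUL_KV + key, data=value, method="PUT")
        urllib.request.urlopen(req).close()


entries = kv_entries({"slack": {"api_token": "", "rooms": {"techops": "#techops"}}})
# push(entries)  # uncomment when a Consul agent is reachable
```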

## Wildcard Blacklist
If you wish to disable all notifications for a certain blacklist type, use ["*"] as the blacklist array value.

After the script is run, you can always change these values within the Consul UI.
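
The blacklist and wildcard behaviour can be pictured with a short sketch. The function `is_blacklisted` is hypothetical and not taken from the project; it only assumes that each check carries the "Node", "ServiceName", and "CheckID" fields returned by /v1/health/node/.

```python
def is_blacklisted(check, nodes=(), services=(), checks=()):
    """Return True if the check matches any blacklist; "*" matches everything."""
    def hit(value, blacklist):
        return "*" in blacklist or value in blacklist

    return (hit(check.get("Node"), nodes)
            or hit(check.get("ServiceName"), services)
            or hit(check.get("CheckID"), checks))


check = {"Node": "web1", "ServiceName": "redis", "CheckID": "service:redis"}
print(is_blacklisted(check, services=["redis"]))  # True
print(is_blacklisted(check, nodes=["*"]))         # True
print(is_blacklisted(check, nodes=["db1"]))       # False
```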

## Consul Watch Check Handler Setup
```javascript
{
  "watches": [
    {
      "type": "checks",
      "handler": "/consulalerting/WatchCheckHandler.py >> "
    }
  ]
}
```
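
For context, Consul passes the watch result to the handler as JSON on standard input; for a `checks` watch this is a list of check objects. The sketch below shows one way a handler could read it. It is not how WatchCheckHandler.py is written (the TODO list notes that reading STDIN is still planned), and the function names are hypothetical.

```python
import json
import sys


def failing_checks(payload):
    """Parse a `checks` watch payload and keep the non-passing entries."""
    return [c for c in json.loads(payload) if c.get("Status") != "passing"]


def main(stream=sys.stdin):
    """Print one line per non-passing check read from `stream`."""
    for check in failing_checks(stream.read()):
        print(check.get("Node"), check.get("CheckID"), check.get("Status"))
```

A handler script built this way would call `main()` as its last line.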

# Plugins

### Hipchat

| Keyname | Type | Description |
| ------- | ---- | ----------- |
| api_token | string | Hipchat requires an auth_token |
| url | string | URL address of API access for corresponding token |
| rooms | dict | Create dictionaries within 'rooms' for tags corresponding to hipchat rooms |

### Slack

| Keyname | Type | Description |
| ------- | ---- | ----------- |
| api_token | string | Slack requires an auth_token |
| rooms | dict | Create dictionaries within 'rooms' for tags corresponding to slack channels |

### Mailgun

| Keyname | Type | Description |
| ------- | ---- | ----------- |
| api_token | string | Mailgun requires an auth_token |
| mailgun_domain | string | Mailgun domain address |
| from | string | From address when receiving an email |
| teams | dict | Create dictionaries within 'teams' for tags corresponding to teams or individuals |

### Email

| Keyname | Type | Description |
| ------- | ---- | ----------- |
| mail_domain_address | string | Email SMTP server to route alert |
| username | string | If the email SMTP server requires authentication |
| password | string | If the email SMTP server requires authentication |
| from | string | From address when receiving an email |
| teams | dict | Create dictionaries within 'teams' for tags corresponding to teams or individuals |
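
Because a value in 'teams' can be either a single address or a list of addresses, a plugin has to normalise it before sending. The following sketch, using only the Python standard library, shows one way to do that. The helper names and the example.com addresses are hypothetical.

```python
from email.message import EmailMessage


def recipients(teams, tags):
    """Collect addresses for matching tags; a team value may be a str or a list."""
    found = []
    for tag in tags:
        value = teams.get(tag, [])
        found.extend([value] if isinstance(value, str) else value)
    return found


def build_alert(sender, teams, tags, subject, body):
    """Build one email addressed to every recipient routed by `tags`."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients(teams, tags))
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


teams = {"devops": ["a@example.com", "b@example.com"], "qa": "qa@example.com"}
msg = build_alert("consul@example.com", teams, ["qa", "devops", "email"],
                  "redis is critical", "service:redis failed on web1")
```

The resulting message could then be handed to `smtplib.SMTP(...).send_message(msg)` with the server from `mail_domain_address`.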

### Pagerduty

| Keyname | Type | Description |
| ------- | ---- | ----------- |
| teams | dict | Create dictionaries within 'teams' for tags corresponding to pagerduty teams, value is service_key |
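
As a rough illustration of how a team's service_key might be used, the sketch below builds a "trigger" event in the shape of PagerDuty's Events API v1. Which API version the plugin targets is an assumption here, and the function name and the wording of the description are hypothetical.

```python
import json


def pagerduty_event(service_key, check):
    """Build a PagerDuty Events API v1 'trigger' payload for one check."""
    return json.dumps({
        "service_key": service_key,
        "event_type": "trigger",
        # A stable incident key lets repeated alerts for one check be grouped
        "incident_key": "{0}/{1}".format(check["Node"], check["CheckID"]),
        "description": "{0} is {1} on {2}".format(
            check["CheckID"], check["Status"], check["Node"]),
    })


event = pagerduty_event(
    "hypothetical-key",
    {"Node": "web1", "CheckID": "service:redis", "Status": "critical"})
```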

### InfluxDB 0.9

| Keyname | Type | Description |
| ------- | ---- | ----------- |
| url | string | URL address of API access for the database |
| series | string | Name of the database series that will contain the data |
| databases | dict | Create dictionaries within 'databases' for tags corresponding to influxdb databases |
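
To make the 'series' setting concrete, here is a sketch that formats a check as a point in InfluxDB's line protocol, the text format accepted by the `/write` endpoint. The tag names `node` and `check` and the field name `status` are assumptions chosen for illustration; the plugin may write different tags and fields.

```python
def _escape(value, chars):
    """Backslash-escape each character in `chars` for the line protocol."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def influx_line(series, check):
    """Format one check as a line-protocol point with a string field."""
    tags = ",".join(
        "{0}={1}".format(key, _escape(check[field], ", ="))
        for key, field in (("node", "Node"), ("check", "CheckID")))
    status = check["Status"].replace('"', '\\"')
    return '{0},{1} status="{2}"'.format(_escape(series, ", "), tags, status)


line = influx_line(
    "test", {"Node": "web 1", "CheckID": "service:redis", "Status": "critical"})
print(line)  # test,node=web\ 1,check=service:redis status="critical"
```

The line would be sent as the body of a POST to the configured `url`, with the target database passed as the `db` query parameter.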

### Elasticsearch Log

| Keyname | Type | Description |
| ------- | ---- | ----------- |
| logpath | array of strings | Absolute path(s) of logfile to write in elasticsearch format |

### Cachet

| Keyname | Type | Description |
| ------- | ---- | ----------- |
| api_token | string | The API token provided by Cachet via a user profile page |
| site_url | string | The url of the Cachet instance |
| notify_subscribers | boolean | Whether or not subscribers should be notified of the incident |

__NOTE:__ In order for this plugin to report Cachet incidents to specific components, you are expected to provide a "service nice name" as a tag in addition to the `cachet` tag. For example, if your component in Cachet is called "Data Import Service", you would provide that same string as a tag in your service definition. If no matching tag is found, incidents are reported with a generic name of "Consul State Change."
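
The matching rule in the note above amounts to something like the following sketch. The function name is hypothetical, and `component_names` stands in for whatever list of component names the plugin obtains from Cachet.

```python
GENERIC_NAME = "Consul State Change"


def incident_name(tags, component_names):
    """Return the first tag naming a Cachet component, else the generic name."""
    for tag in tags:
        if tag in component_names:
            return tag
    return GENERIC_NAME


components = ["Data Import Service", "Website"]
print(incident_name(["cachet", "Data Import Service"], components))  # Data Import Service
print(incident_name(["cachet", "devops"], components))               # Consul State Change
```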

# TODO
* ~~HA, install per leader, using locks and md5sums of state~~
* ~~Plugin Separation~~
* ~~Settings as an import instead of inherited~~
* Cleanup method documentation
* Influxdb 0.8/~~0.9~~ and logstash protocol plugins
* ~~Wildcard blacklist~~
* Improve KVBootstrap.py
* Integration tests
* Improve code coverage
* Use STDIN of catalog instead of lookup