Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/zulily/aws_monitor

Lambda function which creates/refreshes instance-based monitors and dashboards
https://github.com/zulily/aws_monitor

Last synced: 3 months ago
JSON representation

Lambda function which creates/refreshes instance-based monitors and dashboards

Host: GitHub
URL: https://github.com/zulily/aws_monitor
Owner: zulily
License: apache-2.0
Created: 2017-10-03T19:25:26.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2019-08-30T21:46:10.000Z (over 5 years ago)
Last Synced: 2024-08-04T08:05:49.771Z (7 months ago)
Language: Python
Size: 28.3 KB
Stars: 3
Watchers: 6
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

        # Dynamic Monitoring of AWS

Keeping track of your instances in AWS is quite a task.  This tool (zumoco, for zulily monitoring collector), run routinely from AWS Lambda, will detect your instances, add CloudWatch alerting for each, and create CloudWatch dashboards based on your preferences.

## Setup

There are a few things you need to do to set this up for yourself (see details below):  

 

  - Create the AWS Simple Notification Service (SNS) topics/subscriptions for notifying teams based on alert severity.

  - Create the AWS S3 bucket to store the instance history, per service.

  - Customize the JSON templates in the `monitordefs` directory to match the services you want to monitor.

  - Package and deploy the lambda function to AWS.

### Create AWS SNS topics/subscriptions

AWS uses SNS to handle notifications of AWS CloudWatch alerts. zumoco uses the Amazon Resource Name (ARN) for a given SNS topic (connected to a notification endpoint) when creating a CloudWatch alert.  Various SNS topic/subscriptions can be created as follows:

 - Create a PagerDuty integration for SNS following the process here: [link](https://www.pagerduty.com/docs/guides/aws-cloudwatch-integration-guide/)  (ARN is available following Step 4. of the "AWS SNS Console" section.)

 - Create a Slack integration for SNS by creating another AWS Lambda function, which pushes SNS events to the Slack Chat server. The description of the AWS Lambda template is found here: [link](https://aws.amazon.com/blogs/aws/new-slack-integration-blueprints-for-aws-lambda/).

 - Create a hipchat integration by creating another AWS Lambda function, as provided here: [link](https://github.com/zulily/aws_monitor/tree/master/sns_integrations).  Note: the `tests/zumoco_test.py` test file requires an SNS topic to create alarms; replace the appropriate `AlarmDestinations` values with your SNS topic in order to run the tests.

 - Create an email integration, using the AWS process here: [link](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/US_SetupSNS.html)

### Create AWS S3 Bucket

This lambda function uses an S3 bucket to store the history of instances, in order to only add/delete alarms for a specific instance. Create an S3 bucket with prefix named to match the value used in the `s3_access.json` file.

### Customize the JSON templates

* Edit `team.json` to modify:

 - `Team` : Set to your team's name.

 - `CreateTeamDashboard` : Set to false if you don't want a dashboard set with all metric alarms.

 - `Bucket` : Set to your S3 bucket name.

 - `MonitorDefs` list : Change to reference only the services' files on which you plan to alert.

* Copy each service you want to monitor (e.g., `ec2_TeamFoo.json`) to a new filename (referencing it in the `team.json` `MonitorDefs` list.)  In the new file, modify:

 - `S3Suffix` : Make this unique per service file (so if you have two ec2 service files, make sure this value is changed).

 - `ReportARN` : Set this to the SNS ARN used for receiving the service report, generated for each service on each run giving total/added/deleted service instances.

 - `InstanceFilters` : If your instances have names, add `tag:Name` key's values as you likely want to restrict monitoring (and dashboard generation) to a subset of instances (less than 1k total, per AWS API docs).  (If you do this, you also should to change the `AlarmPrefix` to keep dashboards, etc., separate.) Example:

	 - `"Filters=[{'Name':'tag:Name', 'Values':['hadoop*']},{'Name':'instance-state-name', 'Values':['running']}]"`

 - `AlarmDestinations` : Modify to include all SNS topic/subscription alarm destinations you created in the previous section.

 - `CreateServiceDashboard` : Set to false if you don't want a dashboard set with all metric alarms for the given service.

 - `AlarmPrefix` : Use this string to name all (filtered) instance alerts of the given service.

 - `Alarms` section:  (Add/remove/change [metrics](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CW_Support_For_AWS.html) in this dictionary as necessary).

	 - `AlarmAction` : Set this to the appropriate `AlarmDestination` name.

	 - `send_ok` : Set this boolean to `false` if you don't want `OKAction` messages sent to the same `AlarmDestination` as the `AlarmAction`.

	 -  `Period` : Adjust based on basic/detailed monitoring, etc.

	 -  `Threshold` : Adjust based on preferences.

	- `Charts` section:  (Add/remove/change [charts](http://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/CloudWatch-Dashboard-Body-Structure.html) in this dictionary as necessary).

		- `ch_type` : "Metric" is currently supported for auto-generation.

		- `is_alarm` : Boolean determining whether chart is an alarm chart (requiring ARN from an alarm) or a metrics chart without alarm values shown.

		- `avail` : String used to select availability zone in instance JSON (varies by service instance).

		- `metric_list` : List of metrics or alarm to be charted, following [charts](http://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/CloudWatch-Dashboard-Body-Structure.html) definition.

		- `view` : Note: `singleValue` charts are half the width of `timeSeries` charts.

	- `TagsKey` section makes charts/alarms easier to read by adding/substituting tags for instance id:

		- `FriendlyName` : Set to tag key used as name for the given instance, or `null` to only use instance id.

		- `EnsureUniqueName` : Set to `true` to append instance id to FriendlyName to ensure uniqueness. This is useful for EC2 when you are using autoscaling groups that have the same `Name` value, etc.

The packaging step will deploy everything in the monitordefs directory to the Lambda zip file (so you may wish to remove templates/files you don't use).

	

	

### Package and deploy the lambda function

1. Edit `vars.sh` to set the rate for the lambda function to run inside your VPC.

1. Run `./deploy_lambda_function.sh`.  This will:

* Package up the script with its dependencies into the zip format that AWS Lambda expects (as defined in `package.sh`).

* Interact with the AWS API to set up the lambda function with the things it needs (as defined in `deployscripts/setup_lambda.py`):

  * Creates an IAM role for the lambda function to use.  Review the json files in the `deployscripts` directory to see the permissions 

  required.

  * Uploads the zip file from the previous step to create a Lambda function (possibly publishing a new version if the function 

  already exists).