https://github.com/jbarratt/prometheus_sitemon
Example of using Prometheus to monitor website uptime
- Host: GitHub
- URL: https://github.com/jbarratt/prometheus_sitemon
- Owner: jbarratt
- License: mit
- Created: 2016-11-19T00:58:11.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2018-02-14T17:25:33.000Z (over 6 years ago)
- Last Synced: 2024-07-14T12:38:30.123Z (4 months ago)
- Language: Go
- Size: 85.9 KB
- Stars: 77
- Watchers: 3
- Forks: 18
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project
- awesome-starred - jbarratt/prometheus_sitemon - Example of using Prometheus to monitor website uptime (others)
README
# Prometheus for Website monitoring
A simple example of using Prometheus to track website uptime.
[Prometheus](https://prometheus.io) is built in a modular, "microservice"-like way.
This example runs some small Docker containers, using docker-compose to wire them together. First, the "real" parts of the stack:
* The **Prometheus engine** itself: manages the state of all monitorables (in this case, the list of domains we care about monitoring)
* A process called **blackbox-exporter**, which Prometheus polls to actually execute the health checks
* An **Alertmanager**, which handles sending and managing state for alerts.

Then there are 3 small app containers that provide a simulation framework:
* **alertlogger**: Handles webhook-based alerts from Alertmanager and logs them to a file (`data/alertlogger/alerts.log`)
* **flakyhost.com**: A web server configured to intermittently fail then come back, so we can see the down/up alerting
* **reliablehost.com**: A web server which tries to always be reliable

To play with this, if you also want to probe some real sites, you can edit the `config/blackbox_target.yml` file and add actual domains as well.
Then, make sure you have [docker-compose](https://docs.docker.com/compose/) (and docker) installed and run:

    >>> This builds the containers for the simulation framework
    $ docker-compose build

    >>> start all the containers. Run without the `-d` if you want to see container logs.
    $ docker-compose up -d

    >>> keep an eye on the logs coming out of the alertmanager
    $ tail -f data/alertlogger/alerts.log

Then go to http://localhost:9090/alerts in your browser to see which hosts, if any, are alerting. Entries in the alert log look like this:

    2016/11/19 15:30:57 Request from 172.18.0.6:54166: POST /
    2016/11/19 15:30:57 {"receiver":"default-receiver","status":"resolved","alerts":[{"status":"resolved","labels":{"alertname":"SiteDown","instance":"flakyhost.com","job":"blackbox"},"annotations":{"description":"site down: flakyhost.com","summary":"site down: flakyhost.com"},"startsAt":"2016-11-19T15:28:27.818Z","endsAt":"2016-11-19T15:29:27.818Z","generatorURL":"http://b873f429a190:9090/graph?g0.expr=probe_success+%3C+1\u0026g0.tab=0"}],"groupLabels":{"alertname":"SiteDown"},"commonLabels":{"alertname":"SiteDown","instance":"flakyhost.com","job":"blackbox"},"commonAnnotations":{"description":"site down: flakyhost.com","summary":"site down: flakyhost.com"},"externalURL":"http://438350b8d0ba:9093","version":"3","groupKey":15335440397915075285}
    2016/11/19 15:30:57 site down: flakyhost.com
    2016/11/19 15:30:57 Status: resolved

    2016/11/19 15:31:57 Request from 172.18.0.6:54216: POST /
    2016/11/19 15:31:57 {"receiver":"default-receiver","status":"firing","alerts":[{"status":"firing","labels":{"alertname":"SiteDown","instance":"flakyhost.com","job":"blackbox"},"annotations":{"description":"site down: flakyhost.com","summary":"site down: flakyhost.com"},"startsAt":"2016-11-19T15:31:27.818Z","endsAt":"0001-01-01T00:00:00Z","generatorURL":"http://b873f429a190:9090/graph?g0.expr=probe_success+%3C+1\u0026g0.tab=0"}],"groupLabels":{"alertname":"SiteDown"},"commonLabels":{"alertname":"SiteDown","instance":"flakyhost.com","job":"blackbox"},"commonAnnotations":{"description":"site down: flakyhost.com","summary":"site down: flakyhost.com"},"externalURL":"http://438350b8d0ba:9093","version":"3","groupKey":15335440397915075285}
    2016/11/19 15:31:57 site down: flakyhost.com
    2016/11/19 15:31:57 Status: firing

You can also see the other metrics that are tracked.
* Go to http://localhost:9090/graph
* Type `probe_` followed by the rest of a metric name (`probe_duration_seconds` is an interesting one for seeing performance over time).

![response_time_graph](PrometheusGraph.png)
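
The same series can also be read outside the browser through Prometheus's HTTP API (`/api/v1/query`). The following is a minimal Python sketch, not part of this repo: the helper names are made up for illustration, the base URL is assumed to be the same `localhost:9090` used above, and `SAMPLE_BODY` is a hand-written response with invented values so the parsing can be tried without a running stack.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Prometheus server from this example, as exposed on the host.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(expr, base_url=PROMETHEUS_URL):
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


def parse_instant_vector(body):
    """Map each series' `instance` label to its current sample value."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    values = {}
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # the sample value arrives as a string
        values[series["metric"].get("instance", "")] = float(value)
    return values


def fetch(expr):
    """Run an instant query against a live server (needs the stack to be up)."""
    with urlopen(build_query_url(expr), timeout=5) as response:
        return parse_instant_vector(response.read())


# Hypothetical response body, written by hand for illustration only.
SAMPLE_BODY = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"instance": "flakyhost.com", "job": "blackbox"},
             "value": [1479569457.0, "0.25"]},
            {"metric": {"instance": "reliablehost.com", "job": "blackbox"},
             "value": [1479569457.0, "0.125"]},
        ],
    },
})

if __name__ == "__main__":
    print(build_query_url("probe_duration_seconds"))
    print(parse_instant_vector(SAMPLE_BODY))
```

With the containers running, `fetch("probe_success < 1")` should list whichever instances are currently failing their probe, which is the same expression that appears in the `generatorURL` of the alert payloads.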
These metrics could easily be added to a Grafana dashboard, as it has excellent Prometheus support.
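
Going back to the alert side for a moment: the JSON lines in the alert log are Alertmanager webhook payloads, and pulling the interesting fields out of one takes very little code. This is a rough Python sketch of that step only, not the actual alertlogger implementation; `summarize_alerts` is a made-up name, and `SAMPLE` is a trimmed copy of the "firing" payload from the log excerpt.

```python
import json


def summarize_alerts(payload_text):
    """Return one (summary, status) pair per alert in a webhook payload."""
    payload = json.loads(payload_text)
    return [
        (alert.get("annotations", {}).get("summary", ""), alert.get("status", ""))
        for alert in payload.get("alerts", [])
    ]


# Trimmed copy of the "firing" payload from the log excerpt (some fields dropped).
SAMPLE = """
{"receiver": "default-receiver",
 "status": "firing",
 "alerts": [{"status": "firing",
             "labels": {"alertname": "SiteDown",
                        "instance": "flakyhost.com",
                        "job": "blackbox"},
             "annotations": {"description": "site down: flakyhost.com",
                             "summary": "site down: flakyhost.com"},
             "startsAt": "2016-11-19T15:31:27.818Z",
             "endsAt": "0001-01-01T00:00:00Z"}],
 "version": "3"}
"""

if __name__ == "__main__":
    for summary, status in summarize_alerts(SAMPLE):
        print(summary)
        print(f"Status: {status}")
```

A real receiver would wrap this in an HTTP handler that accepts the `POST /` requests seen in the log; that part is left out here to keep the sketch short.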
For production use:
* Prometheus and the blackbox exporter can be run on multiple hosts (and/or in multiple data centers)
* Alertmanager can be run in a highly available setup (the instances communicate with each other over a mesh protocol to block duplicate alerts)
* You can run Grafana or other dashboards and see other information (like response time, etc.)
* Instead of a static `config/blackbox_targets.yml`, a second container could be run to programmatically fetch those lists from an external source, such as a database or external API, and update the file. (The contents are dynamically reloaded within 30 seconds as needed.)
* Other types of probes (beyond HTTP) can be configured; the blackbox_exporter is hugely versatile.

For full documentation see
* [Prometheus](https://prometheus.io/)
* [Blackbox Exporter](https://github.com/prometheus/blackbox_exporter)
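
As a footnote to the production idea of generating the target list programmatically, here is a minimal Python sketch of the "update the file" half. It assumes the targets file uses Prometheus's file-based service discovery layout (a YAML list of target groups), which is an assumption about this repo's config rather than something shown above. `render_targets` and `write_targets` are hypothetical names, and where the domain list comes from (database, API) is left to the caller.

```python
import json
import os
import tempfile


def render_targets(domains):
    """Render domains as a single file_sd-style target group in YAML."""
    unique = sorted(set(domains))
    if not unique:
        return "- targets: []\n"
    lines = ["- targets:"]
    for domain in unique:
        # A JSON string is also a valid double-quoted YAML scalar.
        lines.append(f"  - {json.dumps(domain)}")
    return "\n".join(lines) + "\n"


def write_targets(path, domains):
    """Write the targets file via a temp file plus rename, so a reader
    never sees a half-written list."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(render_targets(domains))
        # mkstemp creates the file as 0600; open it up so another user can read it.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    # Demo in a scratch directory; a real updater would point at the
    # targets file under config/ instead.
    with tempfile.TemporaryDirectory() as scratch:
        target_file = os.path.join(scratch, "blackbox_targets.yml")
        write_targets(target_file, ["flakyhost.com", "reliablehost.com"])
        with open(target_file) as handle:
            print(handle.read(), end="")
```

Sorting and de-duplicating keeps the output stable between runs, so the file only changes when the set of domains actually changes.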