https://github.com/awcodify/awesome-monitoring

This repository is a curated collection of valuable monitoring tools, resources, and best practices for developers, sysadmins, and DevOps professionals. It covers various aspects of monitoring, including infrastructure, applications, logs, networks, cloud, and Kubernetes.
List: awesome-monitoring

alerting devops infrastructure logging logs metrics monitoring sre sysadmin

# Awesome Monitoring

Welcome to the Awesome Monitoring repository! This is a curated list of valuable monitoring tools, resources, and best practices for developers, sysadmins, and DevOps professionals.

## Categories

- [Infrastructure Monitoring](#infrastructure-monitoring)
- [Application Monitoring](#application-monitoring)
- [Log Management](#log-management)
- [Network Monitoring](#network-monitoring)
- [Cloud Monitoring](#cloud-monitoring)
- [Kubernetes Monitoring](#kubernetes-monitoring)

## Infrastructure Monitoring

Infrastructure monitoring tools track the health and performance of servers, networks, and other critical components of an IT infrastructure. They help identify issues, analyze resource utilization, and keep the infrastructure stable and reliable.

### Tools

- [Prometheus](https://prometheus.io/): An open-source monitoring and alerting toolkit designed for reliability and scalability. It provides time-series data collection, querying, and alerting capabilities, making it a popular choice for monitoring distributed systems.

- [Grafana](https://grafana.com/): A leading open-source analytics and monitoring platform. Grafana allows users to visualize and analyze data from various sources, including Prometheus, Elasticsearch, InfluxDB, and more.
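
To make the "time-series data collection" part more concrete: Prometheus scrapes targets that expose metrics as plain text. The sketch below renders one metric family in that text exposition format using only the Python standard library. The function names and the sample metric are illustrative assumptions, not part of any Prometheus client library, and a real exporter would normally use an official client instead.

```python
def escape_label_value(value):
    """Escape backslashes, double quotes, and newlines in a label value."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, help_text, metric_type, samples):
    """Render one metric family as Prometheus-style exposition text.

    `samples` is a list of (labels_dict, value) pairs. This helper is a
    hypothetical illustration, not an official client API.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            # Labels are written as key="value" pairs inside braces.
            pairs = [f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())]
            lines.append(f"{name}{{{','.join(pairs)}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metric(
        "http_requests_total",
        "Total HTTP requests handled.",
        "counter",
        [({"method": "get", "code": "200"}, 1027), ({"method": "post", "code": "500"}, 3)],
    ))
```

Serving this text over HTTP on a path such as `/metrics` is what lets a Prometheus server scrape it, and Grafana can then chart the stored series.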

### Resources

- [Prometheus Documentation](https://prometheus.io/docs/): The official documentation for Prometheus, providing in-depth guides on how to set up, configure, and use Prometheus for monitoring.

- [Grafana Tutorials](https://grafana.com/tutorials/): A collection of tutorials and guides for using Grafana effectively, covering various aspects of data visualization and analysis.

## Application Monitoring

Application monitoring tools focus on tracking the performance and health of applications and services. They help identify bottlenecks, errors, and potential improvements to ensure the smooth functioning of applications.

### Tools

- [Datadog](https://www.datadog.com/): A cloud monitoring platform that offers full-stack observability. Datadog provides comprehensive monitoring, metrics, traces, and logs for infrastructure, applications, and services.

- [New Relic](https://newrelic.com/): A monitoring and observability platform that provides real-time insights into the performance and health of applications, servers, and infrastructure.

### Resources

- [New Relic University](https://learn.newrelic.com/): An online learning platform with courses, tutorials, and documentation on using New Relic for application monitoring and performance management.

## Log Management

Log management tools centralize, analyze, and visualize logs generated by various applications and services. They help monitor system activity, troubleshoot issues, and maintain compliance.

### Tools

- [ELK Stack](https://www.elastic.co/what-is/elk-stack): A combination of Elasticsearch, Logstash, and Kibana used for centralized log management and analysis. Elasticsearch stores and indexes logs, Logstash processes and ships logs, and Kibana provides a web interface for visualization and analysis.

- [Splunk](https://www.splunk.com/): A powerful log management and analysis platform that helps organizations gain insights from their machine data.
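
As a rough illustration of the "process, then index" flow described for the ELK Stack, the sketch below parses raw log lines into structured documents and builds a newline-delimited JSON body in the shape used by the Elasticsearch bulk API. The log line layout, the `app-logs` index name, and the function names are assumptions made for this example. It is not how Logstash is implemented, and it does not send anything over the network.

```python
import json
import re

# Assumed (hypothetical) log layout: "<ISO timestamp> <LEVEL> <message>".
LOG_PATTERN = re.compile(r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$")


def parse_line(line):
    """Turn one raw log line into a structured dict, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def to_bulk_body(lines, index_name="app-logs"):
    """Build a bulk-style NDJSON body: one action line followed by one document line."""
    out = []
    for line in lines:
        doc = parse_line(line)
        if doc is None:
            continue  # Skip lines that do not fit the assumed layout.
        out.append(json.dumps({"index": {"_index": index_name}}))
        out.append(json.dumps(doc))
    # A bulk body is newline-delimited and ends with a trailing newline.
    return "\n".join(out) + "\n" if out else ""


if __name__ == "__main__":
    sample = [
        "2024-05-01T12:00:00Z ERROR disk full on /var",
        "2024-05-01T12:00:05Z INFO cleanup finished",
    ]
    print(to_bulk_body(sample), end="")
```

Structuring logs before indexing is what makes fields such as `level` searchable and chartable in a tool like Kibana.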

### Resources

- [ELK Stack Getting Started Guide](https://www.elastic.co/guide/en/elastic-stack-get-started/current/get-started-elastic-stack.html): A step-by-step guide to getting started with the ELK Stack for log management.

- [Splunk Documentation](https://docs.splunk.com/Documentation/Splunk/latest/): The official documentation for Splunk, providing comprehensive guides and reference materials.

## Network Monitoring

Network monitoring tools track network devices, traffic, and performance to identify and resolve network-related issues. They help maintain network health and optimize performance.

### Tools

- [Nagios](https://www.nagios.org/): A widely used open-source monitoring system that offers comprehensive monitoring and alerting capabilities for servers, network devices, and applications.

- [PRTG Network Monitor](https://www.paessler.com/prtg): A powerful network monitoring tool with auto-discovery features, customizable dashboards, and extensive alerting options.
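
Alerting in Nagios-style systems is driven by small check programs that print one status line and exit with a conventional code: 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN. The following is a minimal sketch of such a check for disk usage. The thresholds, the message wording, and the function names are assumptions chosen for illustration, not a plugin shipped with Nagios.

```python
import shutil
import sys

# Conventional plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to an (exit_code, status_line) pair."""
    if used_percent >= crit:
        return CRITICAL, f"DISK CRITICAL - {used_percent:.1f}% used"
    if used_percent >= warn:
        return WARNING, f"DISK WARNING - {used_percent:.1f}% used"
    return OK, f"DISK OK - {used_percent:.1f}% used"


def check_disk(path="/"):
    """Measure disk usage for `path` and classify it; report UNKNOWN on failure."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, "DISK UNKNOWN - reported total size is zero"
    return evaluate(100.0 * usage.used / usage.total)


if __name__ == "__main__":
    code, message = check_disk()
    print(message)
    sys.exit(code)
```

Keeping the threshold logic in `evaluate` separate from the measurement makes the check easy to test without touching a real filesystem.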

### Resources

- [Nagios Core Documentation](https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/quickstart.html): The quick start guide for Nagios Core, helping users get started with network monitoring.

- [PRTG Knowledge Base](https://kb.paessler.com/): The PRTG Knowledge Base with articles and guides on network monitoring best practices and troubleshooting.

## Cloud Monitoring

Cloud monitoring tools track the performance, usage, and costs of cloud resources and services. They help optimize cloud infrastructure and keep resource utilization efficient.

### Tools

- [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/): A monitoring service provided by AWS to monitor AWS resources and applications.

- [Google Cloud Monitoring](https://cloud.google.com/monitoring): A monitoring service provided by Google Cloud Platform to monitor the health and performance of applications and infrastructure.

### Resources

- [Amazon CloudWatch Documentation](https://docs.aws.amazon.com/cloudwatch/): The official documentation for Amazon CloudWatch, offering detailed guides and tutorials on using the service effectively.

- [Google Cloud Monitoring Documentation](https://cloud.google.com/monitoring/docs): The official documentation for Google Cloud Monitoring, providing in-depth information on monitoring GCP resources.

## Kubernetes Monitoring

Kubernetes monitoring tools are designed specifically to track the health and performance of Kubernetes clusters, workloads, and infrastructure.

### Tools

- [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator): An operator for Kubernetes that simplifies the deployment and management of Prometheus instances. It allows users to define Prometheus configurations using custom resources.

- [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus): A comprehensive collection of Kubernetes manifests, Grafana dashboards, and Prometheus rules that are bundled together to set up monitoring using Prometheus Operator.

- [Prometheus Adapter](https://github.com/kubernetes-sigs/prometheus-adapter): An implementation of the Kubernetes metrics APIs backed by Prometheus. It exposes custom metrics through Kubernetes APIs, enabling Horizontal Pod Autoscaling based on those metrics.

- [kube-thanos](https://github.com/thanos-io/kube-thanos): Kubernetes-specific configuration for deploying Thanos, which extends Prometheus with long-term storage and global querying capabilities. It helps address the limits of Prometheus's local retention in Kubernetes.
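
To show why custom metrics matter for autoscaling, here is a small sketch of the replica calculation that the Kubernetes documentation describes for the Horizontal Pod Autoscaler: the desired count is the current count scaled by the ratio of the observed metric to its target, rounded up, with no change when the ratio is within a tolerance of 1.0. The function name and parameters are assumptions for this example, and the real controller applies further rules (min and max replicas, stabilization windows, missing metrics) that are left out here.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Approximate the documented HPA rule: ceil(current_replicas * current / target).

    `tolerance` mirrors the documented default of 0.1; treat it as an assumption
    for any specific cluster, since it is configurable.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # Close enough to target: do not scale.
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Hypothetical case: 4 pods each seeing 200 requests/s against a 100 requests/s target.
    print(desired_replicas(4, 200, 100))
```

With Prometheus Adapter in place, `current_value` could come from any metric Prometheus already collects, such as a per-pod request rate.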

### Resources

- [Prometheus Operator Documentation](https://github.com/prometheus-operator/prometheus-operator/tree/main/Documentation): Official documentation for Prometheus Operator, providing in-depth guides on setting up Prometheus in Kubernetes.

- [kube-prometheus Documentation](https://github.com/prometheus-operator/kube-prometheus/tree/main/docs): Documentation for kube-prometheus, offering detailed instructions on deploying monitoring components in Kubernetes.

- [Prometheus Adapter Documentation](https://github.com/kubernetes-sigs/prometheus-adapter/tree/main/docs): Documentation for Prometheus Adapter, helping users set up custom metrics and utilize them for autoscaling in Kubernetes.

- [kube-thanos README](https://github.com/thanos-io/kube-thanos#readme): Readme file for kube-thanos, providing an overview of the project and its usage.

## Contributing

If you have a suggestion for a new monitoring tool or resource, or you want to contribute to the existing list, please read the [contribution guidelines](CONTRIBUTING.md) first. We welcome and appreciate your contributions!

## License

This repository is open source and available under the terms of the [LICENSE](LICENSE) file. Please review the license before using or contributing to this repository.

Let's make monitoring awesome together!