https://github.com/polarsignals/kubezonnet
Monitor cross-zone network traffic in Kubernetes.
- Host: GitHub
- URL: https://github.com/polarsignals/kubezonnet
- Owner: polarsignals
- License: apache-2.0
- Created: 2024-12-06T15:14:41.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2025-01-09T14:03:13.000Z (1 day ago)
- Last Synced: 2025-01-09T14:40:33.931Z (1 day ago)
- Topics: ebpf, gke, kubernetes, metrics, networking, prometheus
- Language: C
- Homepage:
- Size: 601 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# kubezonnet
**KUBE**rnetes cross-**ZON**e **NET**work monitoring with Prometheus for Cilium-based clusters (in Legacy host routing mode).
## Why?
Same-zone traffic is free on cloud providers, but cross-zone traffic is not, and it can easily become a major cost factor when a lot of data moves over the network. Understanding which workloads cause cross-zone traffic is therefore vital.
## Deploy
Kubezonnet is designed to be deployed on Kubernetes, so nothing special is required. Just apply the manifests:
```bash
kubectl apply -f https://raw.githubusercontent.com/polarsignals/kubezonnet/refs/heads/main/deploy/kubezonnet.yaml
```

Container images are published at:
* agent: `ghcr.io/polarsignals/kubezonnet-agent`
* server: `ghcr.io/polarsignals/kubezonnet-server`

## Requirements
* Cilium as the CNI, in Legacy host routing mode; otherwise netfilter won't work correctly. GKE Dataplane V2 clusters use this mode.
* Linux kernel 6.4+ (netfilter eBPF programs were only added in 6.4).

## How does it work?
Kubezonnet is made up of two components:
* kubezonnet-agent: collects statistics about all Pod network traffic, using eBPF with a netfilter postrouting hook. This component is deployed on all nodes. It aggregates the statistics per source and destination IP and sends them to the server every 10 seconds.
* kubezonnet-server: aggregates the statistics sent by the agents, resolves the pod, node, and zone relationships behind them, and exposes the result on a Prometheus metrics endpoint. This component can be deployed either once per cluster or once per zone (again, to save cross-zone traffic).

## How do I use it?
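To help interpret the metrics and logs described in this section, here is a minimal Python sketch of how per-flow statistics could be turned into per-pod cross-zone byte counts. It is an illustration of the idea only, not kubezonnet's actual code: the `Flow` record, the three lookup tables, the function name, and the choice to attribute bytes to the source pod are all assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    """One aggregated record, shaped as an agent might report it (hypothetical)."""
    src_ip: str
    dst_ip: str
    num_bytes: int


def cross_zone_bytes_by_pod(flows, pod_by_ip, node_by_pod, zone_by_node):
    """Sum bytes per source pod for flows whose two endpoints sit in different zones.

    Flows with an IP that does not resolve to a known pod are skipped.
    Attributing the bytes to the source pod is an assumption of this sketch.
    """
    totals = defaultdict(int)
    for flow in flows:
        src_pod = pod_by_ip.get(flow.src_ip)
        dst_pod = pod_by_ip.get(flow.dst_ip)
        if src_pod is None or dst_pod is None:
            continue  # not pod-to-pod traffic we can resolve
        src_zone = zone_by_node[node_by_pod[src_pod]]
        dst_zone = zone_by_node[node_by_pod[dst_pod]]
        if src_zone != dst_zone:
            totals[src_pod] += flow.num_bytes
    return dict(totals)


# Made-up example data: two pods in zone-a, one pod in zone-b.
example_flows = [
    Flow("10.0.0.1", "10.0.1.1", 1500),  # web-0 -> db-0, crosses zones
    Flow("10.0.0.1", "10.0.0.2", 900),   # web-0 -> web-1, same zone
    Flow("10.0.1.1", "10.0.0.2", 300),   # db-0 -> web-1, crosses zones
]
example_pod_by_ip = {"10.0.0.1": "web-0", "10.0.0.2": "web-1", "10.0.1.1": "db-0"}
example_node_by_pod = {"web-0": "node-a1", "web-1": "node-a2", "db-0": "node-b1"}
example_zone_by_node = {"node-a1": "zone-a", "node-a2": "zone-a", "node-b1": "zone-b"}

print(cross_zone_bytes_by_pod(
    example_flows, example_pod_by_ip, example_node_by_pod, example_zone_by_node
))
```

In this sketch the same-zone flow contributes nothing, which is why a pod that only talks to neighbours in its own zone would not appear in the result at all.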
### Metrics
The server portion of kubezonnet exposes a Prometheus metrics endpoint on port 8080, which can be scraped by Prometheus. Once set up, the `pod_cross_zone_network_traffic_bytes_total` counter will be available.

The following query shows the top 20 pods by cross-zone network traffic per second over the last 5 minutes, in megabytes:
```promql
topk(20, rate(pod_cross_zone_network_traffic_bytes_total[5m])) / 1e6
```

When trying to understand a cloud bill, the cumulative amount over a timeframe may be more interesting than the current usage. This query shows the top 20 pods by cross-zone network traffic in the last week, in gigabytes:

```promql
topk(20, increase(pod_cross_zone_network_traffic_bytes_total[1w])) / 1e9
```

### Logs
The server also logs something akin to flow logs, which can be used to understand the network traffic in more detail. Whenever agents send statistics (every 10 seconds), the logs print the source and destination pods along with the associated network traffic.

## Limitations
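The IP header limitation listed in this section can be put into rough numbers. The sketch below assumes that the counted size of a packet is everything after the IP header and that each packet carries a minimal 20-byte IPv4 header without options; the function name and constant are hypothetical and only serve the illustration.

```python
IPV4_MIN_HEADER_BYTES = 20  # IPv4 header without options


def undercount_fraction(counted_bytes, header_bytes=IPV4_MIN_HEADER_BYTES):
    """Fraction of a packet's IP-level bytes that is missed when the header is not counted."""
    if counted_bytes < 0 or header_bytes < 0:
        raise ValueError("byte counts must be non-negative")
    total = counted_bytes + header_bytes
    if total == 0:
        raise ValueError("packet must have at least one byte")
    return header_bytes / total


# Large packets lose little, small packets lose a lot.
for counted in (1480, 500, 40):
    print(counted, round(undercount_fraction(counted), 4))
```

Under these assumptions the gap depends heavily on packet size, which is one reason the numbers are better suited to comparing workloads against each other than to metering.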
* Currently only supports IPv4.
* Traffic statistics are based on IP packet sizes and therefore exclude the IP header. It's recommended to use these statistics to understand ratios of traffic, not for metering purposes or for comparison with lower-level network statistics that include the IP header.

## Roadmap
* Support for IPv6.
* Sum metrics by workload (deployment, statefulset, etc.), since pod granularity is not necessary to get the same insights. When higher granularity is needed, the logs can be used.

## Acknowledgments
Various people have helped put some of the pieces of this project together. In no particular order, that includes, but is not limited to:
* [Dylan Reimerink](https://github.com/dylandreimerink)
* [Casey Callendrello](https://github.com/squeed)
* [Chance Zibolski](https://github.com/chancez)
* [Florian Lehner](https://github.com/florianl)