{"id":27294559,"url":"https://github.com/nerdalert/kube-bandwidth","last_synced_at":"2026-02-04T06:04:28.512Z","repository":{"id":146439261,"uuid":"474203519","full_name":"nerdalert/kube-bandwidth","owner":"nerdalert","description":"Measure and Benchmark Multi-Node Kubernetes Bandwidth","archived":false,"fork":false,"pushed_at":"2022-03-26T01:11:01.000Z","size":3511,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-10T23:10:41.793Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nerdalert.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-03-26T00:41:20.000Z","updated_at":"2022-03-26T00:41:20.000Z","dependencies_parsed_at":"2023-03-30T11:07:04.157Z","dependency_job_id":null,"html_url":"https://github.com/nerdalert/kube-bandwidth","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nerdalert/kube-bandwidth","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nerdalert%2Fkube-bandwidth","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nerdalert%2Fkube-bandwidth/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nerdalert%2Fkube-bandwidth/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nerdalert%2Fkube-bandwidth/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/owners/nerdalert","download_url":"https://codeload.github.com/nerdalert/kube-bandwidth/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nerdalert%2Fkube-bandwidth/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29072481,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-04T03:31:03.593Z","status":"ssl_error","status_checked_at":"2026-02-04T03:29:50.742Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-11T22:53:24.500Z","updated_at":"2026-02-04T06:04:28.499Z","avatar_url":"https://github.com/nerdalert.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"### [WIP] Kubernetes Deployment\n\nIn a Kubernetes environment, one can leverage the K8s control plane to setup\ncollectors and pollers to simplify deployment and managing the lifecycle. 
For the\nmost part, daemonsets are an attractive route to deploy pollers and listeners since\nthe pods get deployed on all nodes by default.\n\n- The base project that this deployment expands upon is [Cloud Bandwidth - Bandwidth Performance Monitoring](https://github.com/nerdalert/cloud-bandwidth)\n\n### Scenario 1 - Aggregate Collector and Distributed Pollers\n\nThere are some drawbacks with daemonsets, particularly with regard to Iperf since it \nis not capable of running concurrent tests, only one at a time. Daemonsets will deploy \nthe same workload at once so that means all pods would try and initiate a bandwidth \ntest at once and only one node would get through while the rest would receive busy \nsignals. This can be juggled, but requires more control plane interactions to\nensure there is a lock on one client running at a time.\n\nNetperf becomes very attractive for this aggregate scenario where we want to collect \nmeasurements at potential choke points, such as gateways and proxies. Netperf supports \nconcurrent bandwidth tests and will multi-thread the multiple streams. Netperf is also very\nuseful for capturing the CPU load associated with the network i/o.\n\n- As shown in the diagram, multiple pollers will connect to the single netperf instance\nand measure the bandwidth to that node. 
Since it is a daemonset, all nodes will run the\ntest concurrently so you will get an aggregate reading in the data visualized in grafana.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"docs/images/aggregate-collector.png\" alt=\"drawing\" width=\"65%\"/\u003e\n\u003c/p\u003e\n\nTo deploy this scenario, simply start a netperf server where you want the aggregate data to\nfunnel into and spin up a tsdb/grafana instance.\n\n- Start the Netserver collector (netserver is the server side and netperf is the client side)\n\n```shell\n$ docker run  -itd --rm --name=netserver -p 12865:12865 networkstatic/netserver -D\n# or use podman and/or quay.io\n$ podman run  -itd --rm --name=netserver -p 12865:12865 quay.io/networkstatic/netserver -D\n# or outside of a container with\n$ netserver -D\n```\n\n- Start the TSDB/Grafana instance. You can use [grafana-template-netperf-aggregate.json](grafana-template-netperf-aggregate.json)\nand import it into Grafana to get started with a demo.\n\n\n```sh\ndocker run -d \\\n   --name graphite-grafana \\\n   --restart=always \\\n   -p 80:80 \\\n   -p 2003-2004:2003-2004 \\\n   quay.io/networkstatic/graphite-grafana\n \n # or using Podman\n sudo sysctl net.ipv4.ip_unprivileged_port_start=80\n podman run -d \\\n    --name graphite-grafana \\\n    --restart=always \\\n    -p 80:80 \\\n    -p 2003-2004:2003-2004 \\\n   quay.io/networkstatic/graphite-grafana\n```\n\nThe default daemonset configuration is set to poll for 5 seconds every 5 minutes.\nKeep in mind this will generate traffic on your network and if it was an extremely large cluster\non an already saturated network, you could create latency during that 5 second traffic burst.\nTLDR; start small in a lab! \n\n- The ENV variables in the daemonset configuration need to be adjusted\nfor your setup, particularly plugging in your IP or DNS addresses for \n`CBANDWIDTH_PERF_SERVERS` and `CBANDWIDTH_GRAFANA_ADDRESS`. 
In the configuration\nbelow, both the netserver instance and grafana instance are on `192.168.122.1`. \nThat will almost certainly be different in your environment. See \n[quickstart demo](https://github.com/nerdalert/cloud-bandwidth#quickstart-demo) for\nmore setup information.\n- Example daemonset file listed below [cloud-bandwidth-netperf-ds.yaml](cloud-bandwidth-netperf-ds.yaml)\n\n```yaml\n# kubectl delete daemonset iperf -n kube-system; kubectl apply -f iperf-ds-exit.yaml\napiVersion: apps/v1\nkind: DaemonSet\nmetadata:\n  name: cloud-bandwidth-netperf-ds\n  namespace: kube-system\nspec:\n  selector:\n    matchLabels:\n      name: netperf-client\n  template:\n    metadata:\n      labels:\n        name: netperf-client\n    spec:\n      hostNetwork: true\n      hostIPC: true\n      hostPID: true\n      containers:\n        - name: cloud-bandwidth-netperf\n          image: quay.io/networkstatic/cloud-bandwidth\n          env:\n            - name: CBANDWIDTH_PERF_SERVERS\n              value: \"192.168.122.1\"\n            - name: CBANDWIDTH_PERF_SERVER_PORT\n              value: \"2003\"\n            - name: CBANDWIDTH_POLL_INTERVAL\n              value: \"20\"\n            - name: CBANDWIDTH_POLL_LENGTH\n              value: \"3\"\n            - name: CBANDWIDTH_DOWNLOAD_PREFIX\n              value: \"bandwidth.netperf\"\n            - name: CBANDWIDTH_GRAFANA_ADDRESS\n              value: \"192.168.122.1\"\n            - name: CBANDWIDTH_GRAFANA_PORT\n              value: \"2003\"\n            - name: NODE_NAME\n              valueFrom:\n                fieldRef:\n                  fieldPath: spec.nodeName\n          securityContext:\n            allowPrivilegeEscalation: true\n            privileged: true\n          command: [\"/bin/sh\",\"-c\"]\n          args:\n          - |\n            while true; do\n              ./cloud-bandwidth -perf-servers $CBANDWIDTH_PERF_SERVERS:$NODE_NAME \\\n                -perf-server-port $CBANDWIDTH_PERF_SERVER_PORT \\\n    
            -test-interval $CBANDWIDTH_POLL_INTERVAL \\\n                -test-length $CBANDWIDTH_POLL_LENGTH \\\n                -tsdb-download-prefix $CBANDWIDTH_DOWNLOAD_PREFIX \\\n                -grafana-address $CBANDWIDTH_GRAFANA_ADDRESS \\\n                -grafana-port $CBANDWIDTH_GRAFANA_PORT \\\n                -netperf \\\n                -nocontainer \\\n                -debug;\n            done\n      terminationGracePeriodSeconds: 60\n```\n\nOnce you have added your IP addresses or any other configuration parameter run the daemonset\n\n```shell\n# daemonsets are always created in the kube-system namespace\nkubectl apply -f \u003cfile-containing-the-above-yaml\u003e.yaml\n\n# to delete the daemonset and associated pods, run:\nkubectl delete daemonset cloud-bandwidth-netperf-ds -n kube-system\n```\n\nViewing the logs on any node should result in output like the following:\n\n```shell\n$ kubectl logs cloud-bandwidth-netperf-ds-2w6c9 -n kube-system\ntime=\"2022-03-15T04:04:30Z\" level=info msg=\"no configuration file found, defaulting to command line arguments\"\ntime=\"2022-03-15T04:04:30Z\" level=debug msg=\"Configuration as follows:\"\ntime=\"2022-03-15T04:04:30Z\" level=debug msg=\"[Config] Grafana Server = 192.168.122.1:2003\"\ntime=\"2022-03-15T04:04:30Z\" level=debug msg=\"[Config] Test Interval = 300sec\"\ntime=\"2022-03-15T04:04:30Z\" level=debug msg=\"[Config] Test Length = 5sec\"\ntime=\"2022-03-15T04:04:30Z\" level=debug msg=\"[Config] TSDB download prefix = bandwidth.netperf\"\ntime=\"2022-03-15T04:04:30Z\" level=debug msg=\"[Config] TSDB upload prefix = bandwidth.upload\"\ntime=\"2022-03-15T04:04:30Z\" level=debug msg=\"[Config] Perf Server = 192.168.122.1:cluster-d\"\ntime=\"2022-03-15T04:04:30Z\" level=debug msg=\"[Config] Perf Binary = netperf\"\ntime=\"2022-03-15T04:04:30Z\" level=debug msg=\"[Config] Perf Server Port = 12865\"\ntime=\"2022-03-15T04:04:30Z\" level=debug msg=\"[CMD] Running Command -\u003e [-c netperf -P 0 -t TCP_STREAM 
-f k -l 3 -p 12865 -H 192.168.122.1 | awk '{print $5}']\"\ntime=\"2022-03-15T04:04:33Z\" level=info msg=\"Download results for endpoint 192.168.122.1 [cluster-d] -\u003e 8594749000 bps\"\ntime=\"2022-03-15T04:04:33Z\" level=info msg=\"Sending the following msg to the tsdb: bandwidth.netperf.cluster-d 8594749000 1647317073\\n\"\ntime=\"2022-03-15T04:04:53Z\" level=debug msg=\"[CMD] Running Command -\u003e [-c netperf -P 0 -t TCP_STREAM -f k -l 3 -p 12865 -H 192.168.122.1 | awk '{print $5}']\"\ntime=\"2022-03-15T04:04:56Z\" level=info msg=\"Download results for endpoint 192.168.122.1 [cluster-d] -\u003e 8482690000 bps\"\ntime=\"2022-03-15T04:04:56Z\" level=info msg=\"Sending the following msg to the tsdb: bandwidth.netperf.cluster-d 8482690000 1647317096\\n\"\ntime=\"2022-03-15T04:05:16Z\" level=debug msg=\"[CMD] Running Command -\u003e [-c netperf -P 0 -t TCP_STREAM -f k -l 3 -p 12865 -H 192.168.122.1 | awk '{print $5}']\"\ntime=\"2022-03-15T04:05:19Z\" level=info msg=\"Download results for endpoint 192.168.122.1 [cluster-d] -\u003e 8075824000 bps\"\n```\n- Drill into the data as described in the quickstart for setting up how to view each node.\nIn the grafana template the node names are `cluster-a`, `cluster-b`, `cluster-c` etc. The output\nin grafana looks like so:\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"docs/images/distributed-collector-grafana.png\" alt=\"drawing\" width=\"65%\"/\u003e\n\u003c/p\u003e\n\n### Scenario 2 - Distributed Collectors\n\nIn this scenario, rather than deploying a number of pollers to all nodes in the cluster, we\ndeploy multiple collectors. 
This avoids the concurrency issue of Iperf if you only deploy one or a few\npollers and limit the odds of a test collision.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"docs/images/distributed-collector.png\" alt=\"drawing\" width=\"65%\"/\u003e\n\u003c/p\u003e\n\n- Iperf server daemonset - Example daemonset yaml file listed below [cloud-bandwidth-iperf-server-ds.yaml](cloud-bandwidth-iperf-server-ds.yaml)\n\n```yaml\n# todo: add description\napiVersion: apps/v1\nkind: DaemonSet\nmetadata:\n  name: cloud-bandwidth-iperf-server-ds\n  namespace: kube-system\nspec:\n  selector:\n    matchLabels:\n      name: iperf-server\n  template:\n    metadata:\n      labels:\n        name: iperf-server\n    spec:\n#      Uncomment the following fields to use host networking.\n#      Host networking will keep more static addresses\n#      but also takes a different network path both\n#      internally to the node and externally inter-node\n#      hostNetwork: true\n#      hostIPC: true\n#      hostPID: true\n      containers:\n      - name: iperf3\n        image: quay.io/networkstatic/iperf3\n        args: [\"-s\"]\n        ports:\n        - containerPort: 5201\n      terminationGracePeriodSeconds: 60\n```\n\n- Cloud Bandwidth container poller deployment: - Example deployment yaml file listed below [cloud-bandwidth-poller-deployment.yaml](cloud-bandwidth-poller-deployment.yaml)\n\n```yaml\napiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: cloud-bandwidth-poller\n  labels:\n    app: cloud-bandwidth-poller\nspec:\n  replicas: 1\n  selector:\n    matchLabels:\n      app: cloud-bandwidth-poller\n  template:\n    metadata:\n      labels:\n        app: cloud-bandwidth-poller\n    spec:\n      containers:\n        - name: cloud-bandwidth\n          image: quay.io/networkstatic/cloud-bandwidth\n          env:\n            - name: CBANDWIDTH_PERF_SERVERS  # Fill in the servers you want to poll in CBANDWIDTH_PERF_SERVERS \"IP:name\" or \"DNS:name\"\n              value: 
\"10.42.0.12:cluster-a,10.42.1.19:cluster-b,10.42.2.3:cluster-c,10.42.3.3:cluster-d\"\n            - name: CBANDWIDTH_PERF_SERVER_PORT\n              value: \"5201\"\n            - name: CBANDWIDTH_POLL_INTERVAL\n              value: \"300\"\n            - name: CBANDWIDTH_POLL_LENGTH\n              value: \"5\"\n            - name: CBANDWIDTH_DOWNLOAD_PREFIX\n              value: \"bandwidth.download\"\n            - name: CBANDWIDTH_UPLOAD_PREFIX\n              value: \"bandwidth.upload\"\n            - name: CBANDWIDTH_GRAFANA_ADDRESS\n              value: \"192.168.122.1\"\n            - name: CBANDWIDTH_GRAFANA_PORT\n              value: \"2003\"\n          securityContext:\n            allowPrivilegeEscalation: true\n            privileged: true\n          command: [\"/bin/sh\",\"-c\"]\n          args:\n            - |\n              while true; do\n                ./cloud-bandwidth -perf-servers $CBANDWIDTH_PERF_SERVERS \\\n                  -perf-server-port $CBANDWIDTH_PERF_SERVER_PORT \\\n                  -test-interval $CBANDWIDTH_POLL_INTERVAL \\\n                  -test-length $CBANDWIDTH_POLL_LENGTH \\\n                  -tsdb-download-prefix $CBANDWIDTH_DOWNLOAD_PREFIX \\\n                  -tsdb-upload-prefix $CBANDWIDTH_UPLOAD_PREFIX \\\n                  -grafana-address $CBANDWIDTH_GRAFANA_ADDRESS \\\n                  -grafana-port $CBANDWIDTH_GRAFANA_PORT \\\n                  -nocontainer \\\n                  -debug;\n              done\n      terminationGracePeriodSeconds: 60\n```\n\nNow view the logs from the poller deployment; you should see something like the following:\n\n```shell\n$ k logs cloud-bandwidth-poller-5857b44f5c-59dqn\ntime=\"2022-03-15T06:34:06Z\" level=info msg=\"no configuration file found, defaulting to command line arguments\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"Configuration as follows:\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"[Config] Grafana Server = 
192.168.122.1:2003\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"[Config] Test Interval = 60sec\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"[Config] Test Length = 4sec\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"[Config] TSDB download prefix = bandwidth.download\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"[Config] TSDB upload prefix = bandwidth.upload\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"[Config] Perf Server = 10.42.0.12:cluster-a\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"[Config] Perf Server = 10.42.1.19:cluster-b\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"[Config] Perf Server = 10.42.2.3:cluster-c\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"[Config] Perf Server = 10.42.3.3:cluster-d\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"[Config] Perf Binary = 5201\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"[Config] Perf Server Port = 5201\"\ntime=\"2022-03-15T06:34:06Z\" level=debug msg=\"[CMD] Running Command -\u003e [-c iperf3 -P 1 -t 4 -f k -p 5201 -c 10.42.0.12 | tail -n 3 | head -n1 | awk '{print $7}']\"\ntime=\"2022-03-15T06:34:10Z\" level=info msg=\"Download results for endpoint 10.42.0.12 [cluster-a] -\u003e 6069361000 bps\"\ntime=\"2022-03-15T06:34:10Z\" level=info msg=\"Sending the following msg to the tsdb: bandwidth.download.cluster-a 6069361000 1647326050\\n\"\ntime=\"2022-03-15T06:34:10Z\" level=debug msg=\"[CMD] Running Command -\u003e [-c iperf3 -P 1 -R -t 4 -f k -p 5201 -c 10.42.0.12 | tail -n 3 | head -n1 | awk '{print $7}']\"\ntime=\"2022-03-15T06:34:14Z\" level=info msg=\"Upload results for endpoint 10.42.0.12 [cluster-a] -\u003e 4474268000 bps\"\ntime=\"2022-03-15T06:34:14Z\" level=info msg=\"Sending the following msg to the tsdb: bandwidth.upload.cluster-a 4474268000 1647326054\\n\"\ntime=\"2022-03-15T06:34:14Z\" level=debug msg=\"[CMD] Running Command -\u003e [-c iperf3 -P 1 -t 4 -f k -p 5201 -c 10.42.1.19 | tail -n 3 | head -n1 | awk '{print 
$7}']\"\ntime=\"2022-03-15T06:34:18Z\" level=info msg=\"Download results for endpoint 10.42.1.19 [cluster-b] -\u003e 61998884000 bps\"\ntime=\"2022-03-15T06:34:18Z\" level=info msg=\"Sending the following msg to the tsdb: bandwidth.download.cluster-b 61998884000 1647326058\\n\"\ntime=\"2022-03-15T06:34:18Z\" level=debug msg=\"[CMD] Running Command -\u003e [-c iperf3 -P 1 -R -t 4 -f k -p 5201 -c 10.42.1.19 | tail -n 3 | head -n1 | awk '{print $7}']\"\ntime=\"2022-03-15T06:34:22Z\" level=info msg=\"Upload results for endpoint 10.42.1.19 [cluster-b] -\u003e 53034188000 bps\"\ntime=\"2022-03-15T06:34:22Z\" level=info msg=\"Sending the following msg to the tsdb: bandwidth.upload.cluster-b 53034188000 1647326062\\n\"\ntime=\"2022-03-15T06:34:22Z\" level=debug msg=\"[CMD] Running Command -\u003e [-c iperf3 -P 1 -t 4 -f k -p 5201 -c 10.42.2.3 | tail -n 3 | head -n1 | awk '{print $7}']\"\ntime=\"2022-03-15T06:34:26Z\" level=info msg=\"Download results for endpoint 10.42.2.3 [cluster-c] -\u003e 7426153000 bps\"\ntime=\"2022-03-15T06:34:26Z\" level=info msg=\"Sending the following msg to the tsdb: bandwidth.download.cluster-c 7426153000 1647326066\\n\"\ntime=\"2022-03-15T06:34:26Z\" level=debug msg=\"[CMD] Running Command -\u003e [-c iperf3 -P 1 -R -t 4 -f k -p 5201 -c 10.42.2.3 | tail -n 3 | head -n1 | awk '{print $7}']\"\ntime=\"2022-03-15T06:34:30Z\" level=info msg=\"Upload results for endpoint 10.42.2.3 [cluster-c] -\u003e 7214681000 bps\"\ntime=\"2022-03-15T06:34:30Z\" level=info msg=\"Sending the following msg to the tsdb: bandwidth.upload.cluster-c 7214681000 1647326070\\n\"\ntime=\"2022-03-15T06:34:30Z\" level=debug msg=\"[CMD] Running Command -\u003e [-c iperf3 -P 1 -t 4 -f k -p 5201 -c 10.42.3.3 | tail -n 3 | head -n1 | awk '{print $7}']\"\ntime=\"2022-03-15T06:34:34Z\" level=info msg=\"Download results for endpoint 10.42.3.3 [cluster-d] -\u003e 3853351000 bps\"\n```\n\nWIP - to be 
continued.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnerdalert%2Fkube-bandwidth","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnerdalert%2Fkube-bandwidth","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnerdalert%2Fkube-bandwidth/lists"}