# kubenetmon
[![Go Report Card](https://goreportcard.com/badge/github.com/ClickHouse/kubenetmon)](https://goreportcard.com/report/github.com/ClickHouse/kubenetmon)
![Lint and test charts and code](https://github.com/ClickHouse/kubenetmon/actions/workflows/kubenetmon.yaml/badge.svg)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
![GitHub Release](https://img.shields.io/github/v/release/ClickHouse/kubenetmon?display_name=release)

### 📢 💛 News
> **Blog:** Read the kubenetmon announcement blog post and learn how it was built:
[https://clickhouse.com/blog/kubenetmon-open-sourced](https://clickhouse.com/blog/kubenetmon-open-sourced)!

## What is kubenetmon?
`kubenetmon` is a service built and used at [ClickHouse](https://clickhouse.com) for Kubernetes data transfer metering in all three major cloud providers: AWS, GCP, and Azure.

`kubenetmon` is packaged as a Helm chart with a Docker image. The chart is available at [https://kubenetmon.clickhouse.tech/index.yaml](https://kubenetmon.clickhouse.tech/index.yaml). See below for detailed usage instructions.

## What can kubenetmon be used for?
At ClickHouse Cloud, we use `kubenetmon` to meter data transfer of all of our workloads running in Kubernetes. With the data `kubenetmon` collects and stores in ClickHouse, we are able to answer questions such as:
1. How much cross-Availability Zone traffic are our workloads sending, and which workloads are the largest talkers?
2. How much traffic are we sending to S3?
3. Which workloads open outbound connections, and which workloads only receive inbound connections?
4. Are gRPC connections of our internal clients balanced across internal server replicas?
5. What are our throughput needs, and are we at risk of exhausting the instance bandwidth limits imposed on us by CSPs?

## How does kubenetmon work?
### Components
`kubenetmon` consists of two components:
- `kubenetmon-agent` is a DaemonSet that collects information about connections on a node and forwards connection records to `kubenetmon-server` over gRPC.
`kubenetmon-agent` gets connection information from Linux's conntrack (if you can use `iptables` with your CNI, you can use `kubenetmon`).
- `kubenetmon-server` is a ReplicaSet that watches the state of the Kubernetes cluster, attributes connection records to Kubernetes workloads, and inserts the records into ClickHouse.

The final component, ClickHouse, which we use as the destination for our data and as an analytics engine, can be self-hosted or run in [ClickHouse Cloud](https://clickhouse.cloud).

## Using kubenetmon
`kubenetmon` comes in two Helm charts, `kubenetmon-server` and `kubenetmon-agent`. Both use the same Docker image. Getting started with `kubenetmon` is easy.

First, create a ClickHouse service in [ClickHouse Cloud](https://clickhouse.cloud). You can try it out for free with a $300 credit! In the new service (or an existing one, if you are already running ClickHouse), create the `default.network_flows_0` table with this query (you can also find it in `test/network_flows_0.sql`):
```
CREATE TABLE default.network_flows_0
(
    `date` Date CODEC(Delta(2), ZSTD(1)),
    `intervalStartTime` DateTime CODEC(Delta(4), ZSTD(1)),
    `intervalSeconds` UInt16 CODEC(Delta(2), ZSTD(1)),
    `environment` LowCardinality(String) CODEC(ZSTD(1)),
    `proto` LowCardinality(String) CODEC(ZSTD(1)),
    `connectionClass` LowCardinality(String) CODEC(ZSTD(1)),
    `connectionFlags` Map(LowCardinality(String), Bool) CODEC(ZSTD(1)),
    `direction` Enum('out' = 1, 'in' = 2) CODEC(ZSTD(1)),
    `localCloud` LowCardinality(String) CODEC(ZSTD(1)),
    `localRegion` LowCardinality(String) CODEC(ZSTD(1)),
    `localCluster` LowCardinality(String) CODEC(ZSTD(1)),
    `localCell` LowCardinality(String) CODEC(ZSTD(1)),
    `localAvailabilityZone` LowCardinality(String) CODEC(ZSTD(1)),
    `localNode` String CODEC(ZSTD(1)),
    `localInstanceID` String CODEC(ZSTD(1)),
    `localNamespace` LowCardinality(String) CODEC(ZSTD(1)),
    `localPod` String CODEC(ZSTD(1)),
    `localIPv4` IPv4 CODEC(Delta(4), ZSTD(1)),
    `localPort` UInt16 CODEC(Delta(2), ZSTD(1)),
    `localApp` String CODEC(ZSTD(1)),
    `remoteCloud` LowCardinality(String) CODEC(ZSTD(1)),
    `remoteRegion` LowCardinality(String) CODEC(ZSTD(1)),
    `remoteCluster` LowCardinality(String) CODEC(ZSTD(1)),
    `remoteCell` LowCardinality(String) CODEC(ZSTD(1)),
    `remoteAvailabilityZone` LowCardinality(String) CODEC(ZSTD(1)),
    `remoteNode` String CODEC(ZSTD(1)),
    `remoteInstanceID` String CODEC(ZSTD(1)),
    `remoteNamespace` LowCardinality(String) CODEC(ZSTD(1)),
    `remotePod` String CODEC(ZSTD(1)),
    `remoteIPv4` IPv4 CODEC(Delta(4), ZSTD(1)),
    `remotePort` UInt16 CODEC(Delta(2), ZSTD(1)),
    `remoteApp` String CODEC(ZSTD(1)),
    `remoteCloudService` LowCardinality(String) CODEC(ZSTD(1)),
    `bytes` UInt64 CODEC(Delta(8), ZSTD(1)),
    `packets` UInt64 CODEC(Delta(8), ZSTD(1))
)
ENGINE = SummingMergeTree((bytes, packets))
PARTITION BY date
PRIMARY KEY (date, intervalStartTime, direction, proto, localApp, remoteApp, localPod, remotePod)
ORDER BY (date, intervalStartTime, direction, proto, localApp, remoteApp, localPod, remotePod, intervalSeconds, environment, connectionClass, connectionFlags, localCloud, localRegion, localCluster, localCell, localAvailabilityZone, localNode, localInstanceID, localNamespace, localIPv4, localPort, remoteCloud, remoteRegion, remoteCluster, remoteCell, remoteAvailabilityZone, remoteNode, remoteInstanceID, remoteNamespace, remoteIPv4, remotePort, remoteCloudService)
TTL intervalStartTime + toIntervalDay(90)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1;
```

All you need now is a Kubernetes cluster where you want to meter data transfer.

(**Optional**) If you don't have a test k8s cluster, you can spin up a `kind` cluster using the config in this repository like so:
```
kind create cluster --config=test/kind-config.yaml
```

(**Optional**) And if you don't have many workloads
running in the cluster, you can install some mock services:
```
helm repo add podinfo https://stefanprodan.github.io/podinfo
helm upgrade --install --wait backend --namespace default --set redis.enabled=true podinfo/podinfo
helm upgrade --install --wait frontend --namespace default --set redis.enabled=true podinfo/podinfo
```

Next, we create two namespaces:
```
kubectl create namespace kubenetmon-server
kubectl create namespace kubenetmon-agent
```

Let's add this Helm repository:
```
helm repo add kubenetmon https://kubenetmon.clickhouse.tech
```

We now install `kubenetmon-server`. `kubenetmon-server` expects an environment, cluster name, cloud provider name (`aws`, `gcp`, or `azure`), and region, so we provide these. We also supply connection credentials for our ClickHouse instance:
```
helm install kubenetmon-server kubenetmon/kubenetmon-server \
--namespace kubenetmon-server \
--set region=us-west-2 \
--set cluster=cluster \
--set environment=development \
--set cloud=aws \
--set inserter.endpoint=pwlo4fffj6.eu-west-2.aws.clickhouse.cloud:9440 \
--set inserter.username=default \
--set inserter.password=etK0~PWR7DgRA \
--set deployment.replicaCount=1
```

We can see from the logs that the replica started and connected to our ClickHouse instance:
```
➜  ~ kubectl logs -f kubenetmon-server-6d5ff494fb-wvs8d -n kubenetmon-server
{"level":"info","time":"2025-01-23T20:55:10Z","message":"GOMAXPROCS: 2\n"}
{"level":"info","time":"2025-01-23T20:55:10Z","message":"GOMEMLIMIT: 2634022912\n"}
{"level":"info","time":"2025-01-23T20:55:10Z","message":"There are currently 18 pods, 5 nodes, and 3 services in the cluster cluster!"}
{"level":"info","time":"2025-01-23T20:55:12Z","message":"RemoteLabeler initialized with 43806 prefixes"}
{"level":"info","time":"2025-01-23T20:55:12Z","message":"Beginning to serve metrics on port :8883/metrics\n"}
{"level":"info","time":"2025-01-23T20:55:12Z","message":"Beginning to serve flowHandlerServer on port :8884\n"}
```

All that's left is to deploy `kubenetmon-agent`, a DaemonSet that will track connections on nodes. `kubenetmon-agent` relies on Linux's `conntrack` reporting byte and packet counters; by default, this feature is most likely disabled, so you need to enable it on the nodes with:
```
/bin/echo "1" > /proc/sys/net/netfilter/nf_conntrack_acct
```
**This is an important step, don't skip it!**

For example, to enable conntrack accounting on all nodes in the kind cluster, you can run:
```
for node in $(kubectl get nodes -o name); do
  kubectl node-shell ${node##node/} -- /bin/sh -c '/bin/echo "1" > /proc/sys/net/netfilter/nf_conntrack_acct'
done
```

The nodes are now ready to host `kubenetmon-agent`, so let's install it:
```
helm install kubenetmon-agent kubenetmon/kubenetmon-agent --namespace kubenetmon-agent
```

Let's check the logs:
```
➜  ~ kubectl logs -f kubenetmon-agent-6dsnr -n kubenetmon-agent
{"level":"info","time":"2025-01-23T21:00:14Z","message":"GOMAXPROCS: 1\n"}
{"level":"info","time":"2025-01-23T21:00:14Z","message":"GOMEMLIMIT: 268435456\n"}
{"level":"info","time":"2025-01-23T21:00:14Z","message":"Creating kubenetmon-server (kubenetmon-server.kubenetmon-server.svc.cluster.local:8884) gRPC client"}
{"level":"info","time":"2025-01-23T21:00:14Z","message":"Connected to kubenetmon-server"}
{"level":"info","time":"2025-01-23T21:00:14Z","message":"Confirmed that kubenetmon can retrieve conntrack packet and byte counters"}
{"level":"info","time":"2025-01-23T21:00:14Z","message":"Beginning to serve metrics on port :8883/metrics\n"}
{"level":"info","time":"2025-01-23T21:00:14Z","message":"Starting flow collector"}
{"level":"info","time":"2025-01-23T21:00:14Z","message":"Starting collection loop with 5s interval"}
{"level":"info","time":"2025-01-23T21:00:14Z","message":"24 datapoints were accepted through the last stream"}
{"level":"info","time":"2025-01-23T21:00:19Z","message":"1 datapoints were accepted through the last stream"}
{"level":"info","time":"2025-01-23T21:00:24Z","message":"2 datapoints were accepted through the last stream"}
{"level":"info","time":"2025-01-23T21:00:29Z","message":"3 datapoints were accepted through the last stream"}
{"level":"info","time":"2025-01-23T21:00:34Z","message":"3 datapoints were accepted through the last stream"}
```

If you see log lines such as `conntrack is reporting empty flow counters`, it means you didn't enable conntrack counters with `sysctl` as described above. `kubenetmon-agent` performs a sanity check on startup to confirm that the counters are enabled; you can disable the check with `--set configuration.skipConntrackSanityCheck=true`, but in that case you won't get any data.

If we check the `kubenetmon-server` logs again, we'll see it is sending data to ClickHouse:
```
Inserted batch due to reaching max batch age
```

Let's now run a query in ClickHouse Cloud against our `network_flows_0` table:
```
SELECT localPod, remotePod, connectionClass, formatReadableSize(sum(bytes))
FROM default.network_flows_0
WHERE date = today() AND intervalStartTime > NOW() - INTERVAL 10 MINUTES AND direction = 'out'
GROUP BY localPod, remotePod, connectionClass
ORDER BY sum(bytes) DESC;
```

```
+-----------------------------------------+-----------------------------------------+-----------------+--------------------------------+
|                localPod                 |                remotePod                | connectionClass | formatReadableSize(sum(bytes)) |
+-----------------------------------------+-----------------------------------------+-----------------+--------------------------------+
| kubenetmon-server-6d5ff494fb-wvs8d      | ''                                      | INTER_REGION    | 166.71 KiB                     |
| kubenetmon-server-6d5ff494fb-wvs8d      | ''                                      | INTRA_VPC       | 19.24 KiB                      |
| backend-podinfo-redis-5d6c77b77c-t5vfh  | ''                                      | INTRA_VPC       | 2.46 KiB                       |
| frontend-podinfo-redis-546897f5bc-hqsml | ''                                      | INTRA_VPC       | 2.46 KiB                       |
| frontend-podinfo-5b58f98bbf-bfsw6       | frontend-podinfo-redis-546897f5bc-hqsml | INTRA_VPC       | 2.06 KiB                       |
| backend-podinfo-7fc7494945-pcj8h        | backend-podinfo-redis-5d6c77b77c-t5vfh  | INTRA_VPC       | 2.05 KiB                       |
| frontend-podinfo-redis-546897f5bc-hqsml | frontend-podinfo-5b58f98bbf-bfsw6       | INTRA_VPC       | 865.00 B                       |
+-----------------------------------------+-----------------------------------------+-----------------+--------------------------------+
```

It looks like `kubenetmon-server` is sending some data to a different AWS region. This is accurate, because for this experiment we configured a ClickHouse instance in AWS and configured `kubenetmon-server` to think it's running in AWS us-west-2.

## Testing
To run integration tests, run `make integration-test`. For unit tests, run `make test`. Note that unit tests for `kubenetmon-agent` can only be run on Linux (they rely on netlink).

## Contributing
To contribute, simply open a pull request against `main` in the repository. Changes are very welcome.

## Notes
1. `kubenetmon` does not meter traffic from pods using the host network.
2. For a connection between two pods in the VPC, `kubenetmon-agent` will see each packet twice: once at the sender and once at the receiver. Use the `direction` filter if you want to restrict results to a single observation point.
3. `kubenetmon` can't automatically detect NLBs, NAT Gateways, etc. (you are welcome to contribute these changes!), but if these are easily identifiable in your infrastructure (for example, by a dedicated IP range), you can modify the code to populate the `connectionFlags` field as you see fit to record arbitrary data about your connections.