{"id":18770865,"url":"https://github.com/victoriametrics/prometheus-benchmark","last_synced_at":"2025-04-04T14:02:52.836Z","repository":{"id":37896170,"uuid":"436599976","full_name":"VictoriaMetrics/prometheus-benchmark","owner":"VictoriaMetrics","description":"Benchmark for Prometheus-compatible systems","archived":false,"fork":false,"pushed_at":"2025-03-31T07:13:39.000Z","size":1391,"stargazers_count":171,"open_issues_count":2,"forks_count":28,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-04-04T14:01:29.874Z","etag":null,"topics":["benchmark","monitoring","prometheus","victoriametrics","vmagent","vmalert"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/VictoriaMetrics.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-12-09T11:58:40.000Z","updated_at":"2025-04-02T14:33:12.000Z","dependencies_parsed_at":"2023-12-13T14:28:46.633Z","dependency_job_id":"6bbfe6d3-d7f6-4650-8ec6-2003b2d0feb3","html_url":"https://github.com/VictoriaMetrics/prometheus-benchmark","commit_stats":{"total_commits":46,"total_committers":9,"mean_commits":5.111111111111111,"dds":0.5,"last_synced_commit":"4d42f5dba3e473368c96b554ef22eacc4c04b992"},"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VictoriaMetrics%2Fprometheus-benchmark","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VictoriaMetrics%2Fprometheus-benchmark/tags","releases_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/repositories/VictoriaMetrics%2Fprometheus-benchmark/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VictoriaMetrics%2Fprometheus-benchmark/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/VictoriaMetrics","download_url":"https://codeload.github.com/VictoriaMetrics/prometheus-benchmark/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247190233,"owners_count":20898700,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","monitoring","prometheus","victoriametrics","vmagent","vmalert"],"created_at":"2024-11-07T19:21:58.049Z","updated_at":"2025-04-04T14:02:52.819Z","avatar_url":"https://github.com/VictoriaMetrics.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Prometheus benchmark\n\nPrometheus-benchmark allows testing data ingestion and querying performance\nfor Prometheus-compatible systems on production-like workload.\n\nPrometheus-benchmark provides the following features:\n\n- It generates production-like workload for both data ingestion and querying paths:\n  - It generates write workload from real [node_exporter](https://github.com/prometheus/node_exporter) metrics.\n    This is the most frequently used exporter for Prometheus metrics.\n  - It generates read workload from typical alerting rules for `node_exporter` metrics - see [chart/files/alerts.yaml](chart/files/alerts.yaml).\n- It allows generating [time series churn 
rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)\n  via [scrapeConfigUpdatePercent](https://github.com/VictoriaMetrics/prometheus-benchmark/blob/f6a69052413618c607758d5469e43e508792aff7/chart/values.yaml#L30)\n  and [scrapeConfigUpdateInterval](https://github.com/VictoriaMetrics/prometheus-benchmark/blob/f6a69052413618c607758d5469e43e508792aff7/chart/values.yaml#L38)\n  options. The churn rate is typical for Kubernetes monitoring.\n- Multiple systems can be tested simultaneously - just add multiple named entries\n  under `remoteStorages` section at [chart/values.yaml](chart/values.yaml).\n\nThe following systems can be tested with prometheus-benchmark:\n\n- [Single-node version of VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html):\n  - [How to push data to VictoriaMetrics](https://docs.victoriametrics.com/#prometheus-setup)\n  - [How to query data from VictoriaMetrics](https://docs.victoriametrics.com/url-examples.html#apiv1query)\n- [Cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html):\n  - [How to push and query data in cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format)\n- [Grafana Mimir](https://grafana.com/oss/mimir/):\n  - [How to push data to Mimir](https://grafana.com/docs/mimir/latest/operators-guide/reference-http-api/#remote-write)\n  - [How to query data from Mimir](https://grafana.com/docs/mimir/latest/operators-guide/reference-http-api/#instant-query)\n- [Cortex](https://github.com/cortexproject/cortex):\n  - [How to push data to Cortex](https://cortexmetrics.io/docs/api/#remote-write)\n  - [How to query data from Cortex](https://cortexmetrics.io/docs/api/#instant-query)\n- [Thanos](https://github.com/thanos-io/thanos/):\n  - [How to push data to Thanos](https://thanos.io/tip/components/receive.md/)\n  - [How to query data from Thanos](https://thanos.io/tip/components/query.md/)\n\n## How 
does it work?\n\nThe prometheus-benchmark scrapes metrics from [node_exporter](https://github.com/prometheus/node_exporter)\nand pushes the scraped metrics to the configured Prometheus-compatible remote storage systems.\nThese systems must support [Prometheus remote_write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)\nfor measuring data ingestion performance. Optionally these systems may support\n[Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/#instant-queries) for measuring query performance.\n\n\u003cimg src=\"prometheus-benchmark-architecture.excalidraw.png\" width=\"600\" alt=\"Benchmark architecture\"\u003e\n\nThe helm chart deploys the following pods:\n\n- `vmagent` with the following containers:\n  - [nodeexporter](https://github.com/prometheus/node_exporter) - collects real metrics from the Kubernetes node where it runs.\n  - [nginx](https://nginx.org/) - caches responses from `nodeexporter` for 1 second in order to reduce load on it\n    when scraping a large number of targets.\n  - [vmagent-config-updater](services/vmagent-config-updater/README.md) - generates config for target scraping.\n    It is also responsible for generating time series churn rate via periodic updating of the generated targets.\n  - [vmagent](https://docs.victoriametrics.com/vmagent.html) - scrapes `nodeexporter` metrics via `nginx`\n    for targets generated by `vmagent-config-updater`.\n- `vmalert` with the following containers:\n  - [vmalert](https://docs.victoriametrics.com/vmalert.html) - periodically executes [these alerting rules](chart/files/alerts.yaml)\n    (aka read queries) against the tested remote storage.\n  - [alertmanager](https://github.com/prometheus/alertmanager) - receives notifications from `vmalert`.\n    It is configured as a blackhole for the received notifications.\n  `vmalert` pod is optional - it is used for generating read query load.\n- `vmsingle` - this pod runs a [single-node 
VictoriaMetrics](https://docs.victoriametrics.com/), which collects metrics from `vmagent` and `vmalert` pods,\n  so they could be analyzed during benchmark execution.\n\n## Articles\n\n- [Benchmarking Prometheus-compatible time series databases](https://victoriametrics.com/blog/remote-write-benchmark/)\n- [Monitoring benchmark: how to generate 100 million samples/s of production-like data](https://victoriametrics.com/blog/benchmark-100m/)\n- [Grafana Mimir and VictoriaMetrics: performance tests](https://victoriametrics.com/blog/mimir-benchmark/)\n\n## How to run\n\nIt is expected that [Helm3](https://helm.sh/docs/intro/install/) is already installed\nand configured to communicate with Kubernetes cluster where the prometheus-benchmark should run.\n\nCheck out the prometheus-benchmark sources:\n\n```bash\ngit clone https://github.com/VictoriaMetrics/prometheus-benchmark\ncd prometheus-benchmark\n```\n\nThen edit the [chart/values.yaml](chart/values.yaml) with the desired config params.\nThen optionally edit the [chart/files/alerts.yaml](chart/files/alerts.yaml)\nwith the desired queries to execute at remote storage systems.\nThen run the following command in order to install the prometheus-benchmark\ncomponents in Kubernetes and start the benchmark:\n\n```bash\nmake install\n```\n\nRun the following command in order to inspect the metrics collected by the benchmark:\n\n```bash\nmake monitor\n```\n\nAfter that go to `http://localhost:8428/targets` in order to see which metrics are collected by the benchmark.\nSee [monitoring docs](#monitoring) for details.\n\nAfter the benchmark is complete, run the following command for removing prometheus-benchmark components from Kubernetes:\n\n```bash\nmake delete\n```\n\nBy default the `prometheus-benchmark` is deployed in `vm-benchmark` Kubernetes namespace.\nThe namespace can be overridden via `NAMESPACE` environment variable.\nFor example, the following command starts the `prometheus-benchmark` chart in `foobar` k8s 
namespace:\n\n```bash\nNAMESPACE=foobar make install\n```\n\nSee the [Makefile](Makefile) for more details on available `make` commands.\n\n## Monitoring\n\nThe benchmark collects various metrics from its components. These metrics\nare available for querying at `http://localhost:8428/vmui` after running the `make monitor` command.\nThe following metrics might be interesting to look at during the benchmark:\n\n- Data ingestion rate:\n\n```metricsql\nsum(rate(vm_promscrape_scraped_samples_sum{job=\"vmagent\"})) by (remote_storage_name)\n```\n\n- 99th percentile for the duration to execute queries at [chart/files/alerts.yaml](chart/files/alerts.yaml):\n\n```metricsql\nmax(vmalert_iteration_duration_seconds{quantile=\"0.99\",job=\"vmalert\"}) by (remote_storage_name)\n```\n\n- 99th percentile for the duration to push the collected data to the configured\n  remote storage systems at [chart/values.yaml](chart/values.yaml):\n\n```metricsql\nhistogram_quantile(0.99,\n  sum(increase(vmagent_remotewrite_duration_seconds_bucket{job=\"vmagent\"}[5m])) by (vmrange,remote_storage_name)\n)\n```\n\nIt is also recommended to check the following metrics in order to verify whether the configured remote storage is capable of handling the configured workload:\n\n- The number of dropped data packets when sending them to the configured remote storage.\n  If the value is bigger than zero, then the remote storage refuses to accept incoming data.\n  It is recommended to inspect remote storage logs and vmagent logs in this case.\n\n```metricsql\nsum(rate(vmagent_remotewrite_packets_dropped_total{job=\"vmagent\"})) by (remote_storage_name)\n```\n\n- The number of retries when sending data to remote storage. 
If the value is bigger than zero,\n  then this is a sign that the remote storage cannot handle the workload.\n  It is recommended to inspect remote storage logs and vmagent logs in this case.\n\n```metricsql\nsum(rate(vmagent_remotewrite_retries_count_total{job=\"vmagent\"})) by (remote_storage_name)\n```\n\n- The amount of pending data at vmagent side, which hasn't been sent to remote storage yet.\n  If the graph grows, then the remote storage cannot keep up with the given data ingestion rate.\n  Sometimes increasing the `writeConcurrency` at [chart/values.yaml](chart/values.yaml)\n  may help if there is a high network latency between vmagent at prometheus-benchmark\n  and the remote storage.\n\n```metricsql\nsum(vm_persistentqueue_bytes_pending{job=\"vmagent\"}) by (remote_storage_name)\n```\n\n- The number of errors when executing queries from [chart/files/alerts.yaml](chart/files/alerts.yaml).\n  If the value is bigger than zero, then the remote storage cannot handle the query workload.\n  It is recommended to inspect remote storage logs and vmalert logs in this case.\n\n```metricsql\nsum(rate(vmalert_execution_errors_total{job=\"vmalert\"})) by (remote_storage_name)\n```\n\nThe `prometheus-benchmark` doesn't collect metrics from the tested remote storage systems.\nIt is expected that separate monitoring is set up for whitebox monitoring\nof the tested remote storage systems.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvictoriametrics%2Fprometheus-benchmark","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvictoriametrics%2Fprometheus-benchmark","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvictoriametrics%2Fprometheus-benchmark/lists"}