{"id":26179069,"url":"https://github.com/plumber-cd/argocd-autoscaler","last_synced_at":"2025-08-21T05:35:17.216Z","repository":{"id":273341876,"uuid":"919335425","full_name":"plumber-cd/argocd-autoscaler","owner":"plumber-cd","description":"Autoscaling ArgoCD","archived":false,"fork":false,"pushed_at":"2025-06-11T23:53:10.000Z","size":583,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-12T00:33:33.421Z","etag":null,"topics":["argo","argocd","autoscaling","hpa"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/plumber-cd.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":["dee-kryvenko","plumber-cd"]}},"created_at":"2025-01-20T07:52:17.000Z","updated_at":"2025-06-11T23:36:50.000Z","dependencies_parsed_at":null,"dependency_job_id":"03b8eb7d-989c-4739-a979-0242e2d8044d","html_url":"https://github.com/plumber-cd/argocd-autoscaler","commit_stats":null,"previous_names":["plumber-cd/argocd-autoscaler"],"tags_count":13,"template":false,"template_full_name":null,"purl":"pkg:github/plumber-cd/argocd-autoscaler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plumber-cd%2Fargocd-autoscaler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plumber-cd%2Fargocd-autoscaler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plumber-cd%2Fargocd-autoscaler/releases","manifests_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/plumber-cd%2Fargocd-autoscaler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/plumber-cd","download_url":"https://codeload.github.com/plumber-cd/argocd-autoscaler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plumber-cd%2Fargocd-autoscaler/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271430919,"owners_count":24758423,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-21T02:00:08.990Z","response_time":74,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["argo","argocd","autoscaling","hpa"],"created_at":"2025-03-11T21:48:49.631Z","updated_at":"2025-08-21T05:35:17.177Z","avatar_url":"https://github.com/plumber-cd.png","language":"Go","funding_links":["https://github.com/sponsors/dee-kryvenko","https://github.com/sponsors/plumber-cd"],"categories":[],"sub_categories":[],"readme":"# argocd-autoscaler\n\nThis controller can automatically partition shards (destination kubernetes clusters) to ArgoCD Application Controllers,\ndetermine how many App Controller replicas are needed for that partitioning, and scale App Controllers accordingly.\n\nThere are three levels of resolution that I can explain how this works and how to use it.\n\n## TL;DR\n\nLevel one aka TL;DR version: it will look at prometheus metrics from App 
Controllers,\ndetermine the load index of each destination cluster,\nand partition clusters across replicas in the most efficient way.\nIt also scales the App Controllers accordingly. You can install it using kustomize in three simple steps:\n\n1. Grab CRDs from [./config/crd](./config/crd)\n1. Grab autoscaler from [./config/default](./config/default)\n1. Grab our default opinionated scaling strategy from [./config/default-scaling-strategy](./config/default-scaling-strategy)\n\nSample `kustomization.yaml` that you may want to use:\n\n```yaml\nnamespace: argocd\nresources:\n- github.com/plumber-cd/argocd-autoscaler/config/crd?ref=vX.X.X\n- github.com/plumber-cd/argocd-autoscaler/config/default?ref=vX.X.X\n- github.com/plumber-cd/argocd-autoscaler/config/default-scaling-strategy?ref=vX.X.X\npatches:\n- target:\n    kind: Deployment\n    name: argocd-autoscaler\n  patch: |-\n    - op: add\n      path: /spec/template/spec/containers/0/resources\n      value:\n        requests:\n          cpu: 10m\n          memory: 64Mi\n        limits:\n          memory: 128Mi\n```\n\nNote that it only does partitioning and horizontal scaling.\nIt is still up to you to scale App Controller CPU/memory appropriately based on how the load is distributed.\nThis autoscaler aims to keep all replicas utilized equally, so the usual scaling tools should work just fine.\n\n## Advanced configuration\n\n### Customize scaling strategy\n\nYou can use [./config/default-scaling-strategy](./config/default-scaling-strategy) as a starting point.\nThe things you will likely want to customize are `poll.yaml`, `load-index.yaml`, and `evaluation.yaml`.\n\nThe `poll.yaml` uses `PrometheusPoll` and controls what queries are made to Prometheus.\nYou may adjust the queries to suit your preference.\nNote that each query must return a single value.\n\nThe `load-index.yaml` uses `WeightedPNormLoadIndex` to calculate the load index using normalized polling results with weights.\nHere, you 
may want to adjust how much each individual metric contributes to the result.\n\nThe `evaluation.yaml` uses `MostWantedTwoPhaseHysteresisEvaluation` to observe and promote partitioning.\nHere, you may want to customize how long it should observe before electing and applying a re-shuffle.\n\nCustomization recommendations based on the defaults are as follows.\n\nGenerally speaking, you may want to follow the default opinionated `quantile_over_time` sampling and maybe only modify\nwhat the quantiles are sampled on and their weights.\nOtherwise, remember to keep sampling at a high enough quantile and over longer ranges.\nThis helps to \"remember\" spikes for longer,\nand have them reflected in the load index later.\nYou will get a mostly flat load index, and that's what you are aiming for.\nWithout it, spikes will not inform overall partitioning decisions in the evaluation phase,\nand (provided that at idle all shards are at about the same rate of utilization overall) you will most likely\nend up with a partitioning of one shard per replica.\nThat is a perfectly valid option if all you care about is automatically scheduling new replicas\nfor new clusters.\n\nIt may be tempting to remove the evaluation piece and feed the load index directly into scaling\nif you want to make scaling more reactive and instantaneous.\nThat is fine if you are aiming for one shard = one replica,\nbut otherwise it may result in App Controller restarts on every poll cycle and instability.\n\nThere is a middle ground: still use evaluation,\nuse `max` over the shortest possible polling period to get the latest values in queries,\nand dial the evaluation stabilization period down to a much shorter value, based on how often you are comfortable with it restarting.\n\nYou can use the [./grafana/argocd-autoscaler.json](./grafana/argocd-autoscaler.json) dashboard to help you figure out the optimal values.\nYou can deploy everything but `scaler.yaml`, which will 
essentially make it a dry-run.\nYou'll see in Grafana what it would want to do under the current scaling strategy without it actually doing anything.\nFor that, of course, you'd need to deploy a `ServiceMonitor` (see below).\n\nLastly, you may want to customize `scaler.yaml` to adjust how it applies changes to the StatefulSet (STS).\nFor more information, refer to [this section of the design document](./DESIGN.md#replica-set-default).\n\n### Monitor Autoscaler with Prometheus\n\nInstead of `github.com/plumber-cd/argocd-autoscaler/config/default`\nyou can use `github.com/plumber-cd/argocd-autoscaler/config/default-with-monitoring`.\nThat deploys an additional metrics service and a service monitor.\nHere's a sample `kustomization.yaml` that you may want to use:\n\n```yaml\nnamespace: argocd\nresources:\n- github.com/plumber-cd/argocd-autoscaler/config/crd?ref=vX.X.X\n- github.com/plumber-cd/argocd-autoscaler/config/default-with-monitoring?ref=vX.X.X\n- github.com/plumber-cd/argocd-autoscaler/config/default-scaling-strategy?ref=vX.X.X\npatches:\n- target:\n    kind: Deployment\n    name: argocd-autoscaler\n  patch: |-\n    - op: add\n      path: /spec/template/spec/containers/0/resources\n      value:\n        requests:\n          cpu: 10m\n          memory: 64Mi\n        limits:\n          memory: 128Mi\n- target:\n    kind: NetworkPolicy\n    name: argocd-autoscaler-allow-metrics-traffic\n  patch: |-\n    - op: replace\n      path: /spec/ingress/0/from/0/namespaceSelector/matchLabels\n      value:\n        kubernetes.io/metadata.name: \u003cnamespace where your prometheus lives\u003e\n- target:\n    kind: ServiceMonitor\n    name: argocd-autoscaler\n  patch: |-\n    - op: add\n      path: /metadata/labels\n      value:\n        \u003cyour prometheus selector label\u003e: \u003cyour prometheus selector label 
value\u003e\n```\n\nAt [./grafana/argocd-autoscaler.json](./grafana/argocd-autoscaler.json) you may find a dashboard that I am using.\n\nThere are also other dashboards generated by `kubebuilder`: the runtime telemetry one is good, but the custom metrics one is useless.\n\n### Secured Autoscaler with Prometheus\n\nGenerally speaking, securing communications between Prometheus and the `/metrics` endpoint typically means:\n\n1. Apply a Network Policy that allows only Prometheus to access the `/metrics` endpoint.\n1. Put the `/metrics` endpoint behind authentication.\n1. Apply TLS encryption to the `/metrics` endpoint.\n\nThe Network Policy is already included in the previous example. For the rest, you need additional RBAC and CertManager.\n\nOnce you make sure these prerequisites are deployed, you may want to use [./config/default-secured](./config/default-secured).\nThis will apply authentication and TLS to the `/metrics` endpoint,\nand it will also modify the ServiceMonitor to inform Prometheus how to securely scrape it.\n\n## Hardcore\n\nI have a pretty detailed document at [./DESIGN.md](./DESIGN.md) that explains all the implementation details.\nIt can help you customize everything and anything, or hopefully even contribute more ideas that we can implement.\nAt the current stage of the project, we really aim to provide a framework rather than an end-result solution.\nAlthough we do include a default opinionated scaling strategy, only you can define the strategy that works best for you.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplumber-cd%2Fargocd-autoscaler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fplumber-cd%2Fargocd-autoscaler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplumber-cd%2Fargocd-autoscaler/lists"}