{"id":20155277,"url":"https://github.com/redhat-cop/proactive-node-scaling-operator","last_synced_at":"2025-04-09T22:04:09.567Z","repository":{"id":44572750,"uuid":"335658796","full_name":"redhat-cop/proactive-node-scaling-operator","owner":"redhat-cop","description":"An operator to proactively scales Kubernetes clusters","archived":false,"fork":false,"pushed_at":"2024-04-02T19:06:22.000Z","size":342,"stargazers_count":26,"open_issues_count":16,"forks_count":12,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-04-09T22:04:03.712Z","etag":null,"topics":["container-cop","k8s-operator"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/redhat-cop.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2021-02-03T14:54:19.000Z","updated_at":"2024-12-04T08:48:36.000Z","dependencies_parsed_at":"2024-01-18T01:38:57.362Z","dependency_job_id":"08ffae18-3482-4366-a98d-5b44fb9f2820","html_url":"https://github.com/redhat-cop/proactive-node-scaling-operator","commit_stats":{"total_commits":61,"total_committers":9,"mean_commits":6.777777777777778,"dds":0.4590163934426229,"last_synced_commit":"e4b4aaf03cd5f914b65bb3d640bd4107c40206f2"},"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/redhat-cop%2Fproactive-node-scaling-operator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/redhat-cop%2Fproactive-node-scaling-operator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/redhat-cop%2Fproactive-node-scaling-operator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/redhat-cop%2Fproactive-node-scaling-operator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/redhat-cop","download_url":"https://codeload.github.com/redhat-cop/proactive-node-scaling-operator/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248119296,"owners_count":21050755,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["container-cop","k8s-operator"],"created_at":"2024-11-13T23:31:15.567Z","updated_at":"2025-04-09T22:04:09.550Z","avatar_url":"https://github.com/redhat-cop.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Proactive Node Scaling Operator\n\n![build status](https://github.com/redhat-cop/proactive-node-scaling-operator/workflows/push/badge.svg)\n[![Go Report Card](https://goreportcard.com/badge/github.com/redhat-cop/proactive-node-scaling-operator)](https://goreportcard.com/report/github.com/redhat-cop/proactive-node-scaling-operator)\n![GitHub go.mod Go version](https://img.shields.io/github/go-mod/go-version/redhat-cop/proactive-node-scaling-operator)\n\nThis operator makes the [cluster autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) more proactive. As of now the cluster auto scaler will create new nodes only when a pod is pending because it cannot be allocated due to lack of capacity. 
This is not a good user experience, as the pending workload has to wait several minutes while the new node is created and joins the cluster.\n\nThe Proactive Node Scaling Operator improves the user experience by allocating low priority pods that don't do anything. When the cluster is full and a new user pod is created, the following happens:\n\n1. Some of the low priority pods are de-scheduled to make room for the user pod, which can then be scheduled. The user workload does not have to wait in this case.\n\n2. The de-scheduled low priority pods are rescheduled and, in doing so, trigger the cluster autoscaler to add new nodes.\n\nEssentially this operator allows you to trade wasted resources for faster response time.\n\nIn order for this operator to work correctly, [pod priorities](https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/) must be defined. The default name for the priority class used by this operator is \"proactive-node-autoscaling-pods\" and it should have the lowest possible priority, 0. 
To ensure your regular workloads get a normal priority, you should also define a PriorityClass for those and set globalDefault to true.\n\nFor example:\n\n```yaml\napiVersion: scheduling.k8s.io/v1\nkind: PriorityClass\nmetadata:\n  name: proactive-node-autoscaling-pods\nvalue: 0\nglobalDefault: false\ndescription: \"This priority class is the priority class for Proactive Node Scaling.\"\n---\napiVersion: scheduling.k8s.io/v1\nkind: PriorityClass\nmetadata:\n  name: normal-workload\nvalue: 1000\nglobalDefault: true\ndescription: \"This priority class is the cluster default and should be used for normal workloads.\"\n```\n\nFor this operator to work, the [cluster autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) must also be active; see the OpenShift instructions [here](https://docs.openshift.com/container-platform/4.6/machine_management/applying-autoscaling.html) on how to turn it on.\n\nTo activate proactive autoscaling, a CR must be defined. Here is an example:\n\n```yaml\napiVersion: redhatcop.redhat.io/v1alpha1\nkind: NodeScalingWatermark\nmetadata:\n  name: us-west-2a\nspec:\n  priorityClassName: proactive-node-autoscaling-pods\n  watermarkPercentage: 20\n  nodeSelector:\n    topology.kubernetes.io/zone: us-west-2a\n```\n\nThe `nodeSelector` selects the nodes observed by this operator, which are also the nodes on which the low priority pods will be scheduled. The nodes observed by the cluster autoscaler should coincide with the nodes selected by this operator CR.\n\nThe `watermarkPercentage` defines the percentage of capacity of user workload that will be allocated to low priority pods. So in this example 20% of the user allocated capacity will be allocated via low priority pods. 
This also means that when the user workload reaches 80% capacity of the nodes selected by this CR (and the autoscaler), the cluster will start to scale.\n\n## Deploying the Operator\n\nThis is a cluster-level operator that you can deploy in any namespace; `proactive-node-scaling-operator` is recommended.\n\nIt is recommended to deploy this operator via [`OperatorHub`](https://operatorhub.io/), but you can also deploy it using [`Helm`](https://helm.sh/).\n\n### Deploying from OperatorHub\n\n\u003e **Note**: This operator supports being installed in disconnected environments\n\nIf you want to utilize the Operator Lifecycle Manager (OLM) to install this operator, you can do so in two ways: from the UI or the CLI.\n\n#### Deploying from OperatorHub UI\n\n* If you would like to launch this operator from the UI, you'll need to navigate to the OperatorHub tab in the console. Before starting, make sure you've created the namespace that you want to install this operator in by running the following:\n\n```shell\noc new-project proactive-node-scaling-operator\n```\n\n* Once there, you can search for this operator by name: `proactive node scaling operator`. This will then return an item for our operator and you can select it to get started. Once you've arrived here, you'll be presented with an option to install, which will begin the process.\n* After clicking the install button, you can then select the namespace that you would like to install this to as well as the installation strategy you would like to proceed with (`Automatic` or `Manual`).\n* Once you've made your selection, you can select `Subscribe` and the installation will begin. 
After a few moments you can go ahead and check your namespace and you should see the operator running.\n\n![Proactive Node Scaling Operator](./media/proactive-node-scaling-operator.png)\n\n#### Deploying from OperatorHub using CLI\n\nIf you'd like to launch this operator from the command line, you can use the manifests contained in this repository by running the following:\n\n```shell\noc new-project proactive-node-scaling-operator\noc apply -f config/operatorhub -n proactive-node-scaling-operator\n```\n\nThis will create the appropriate OperatorGroup and Subscription and will trigger OLM to launch the operator in the specified namespace.\n\n### Deploying with Helm\n\nHere are the instructions to install the latest release with Helm.\n\n```shell\noc new-project proactive-node-scaling-operator\nhelm repo add proactive-node-scaling-operator https://redhat-cop.github.io/proactive-node-scaling-operator\nhelm repo update\nhelm install proactive-node-scaling-operator proactive-node-scaling-operator/proactive-node-scaling-operator\n```\n\nThis can later be updated with the following commands:\n\n```shell\nhelm repo update\nhelm upgrade proactive-node-scaling-operator proactive-node-scaling-operator/proactive-node-scaling-operator\n```\n\n### Disconnected deployment\n\nWhen running in a disconnected environment, use the `PausePodImage` field of the `NodeScalingWatermark` to specify an internally mirrored pause pod image.\n\n## Metrics\n\nPrometheus compatible metrics are exposed by the Operator and can be integrated into OpenShift's default cluster monitoring. 
To enable OpenShift cluster monitoring, label the namespace the operator is deployed in with the label `openshift.io/cluster-monitoring=\"true\"`.\n\n```shell\noc label namespace \u003cnamespace\u003e openshift.io/cluster-monitoring=\"true\"\n```\n\n### Testing metrics\n\n```sh\nexport operatorNamespace=proactive-node-scaling-operator-local # or proactive-node-scaling-operator\noc label namespace ${operatorNamespace} openshift.io/cluster-monitoring=\"true\"\noc rsh -n openshift-monitoring -c prometheus prometheus-k8s-0 /bin/bash\nexport operatorNamespace=proactive-node-scaling-operator-local # or proactive-node-scaling-operator\ncurl -v -s -k -H \"Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)\" https://proactive-node-scaling-operator-controller-manager-metrics.${operatorNamespace}.svc.cluster.local:8443/metrics\nexit\n```\n\n## Development\n\n### Running the operator locally\n\n```shell\nmake install\nexport TEMPLATE_FILE_NAME=./config/templates/watermarkDeploymentTemplate.yaml\noc new-project proactive-node-scaling-operator-local\nkustomize build ./config/local-development | oc apply -f - -n proactive-node-scaling-operator-local\nexport token=$(oc serviceaccounts get-token 'proactive-node-scaling-controller-manager' -n proactive-node-scaling-operator-local)\noc login --token ${token}\nmake run ENABLE_WEBHOOKS=false\n```\n\n### Test helm chart locally\n\nDefine an image and tag. 
For example...\n\n```shell\nexport imageRepository=\"quay.io/redhat-cop/proactive-node-scaling-operator\"\nexport imageTag=\"$(git describe --tags --abbrev=0)\" # grabs the most recent git tag, which should match the image tag\n```\n\nDeploy chart...\n\n```shell\nmake helmchart\nhelm upgrade -i proactive-node-scaling-operator-helmchart-test charts/proactive-node-scaling-operator -n proactive-node-scaling-operator-local --set image.repository=${imageRepository} --set image.tag=${imageTag} --create-namespace\n```\n\nDelete...\n\n```shell\nhelm delete proactive-node-scaling-operator-helmchart-test -n proactive-node-scaling-operator-local\nkubectl delete -f charts/proactive-node-scaling-operator/crds/crds.yaml\n```\n\n### Building/Pushing the operator image\n\n```shell\nexport repo=raffaelespazzoli #replace with yours\ndocker login quay.io/$repo/proactive-node-scaling-operator\nmake docker-build IMG=quay.io/$repo/proactive-node-scaling-operator:latest\nmake docker-push IMG=quay.io/$repo/proactive-node-scaling-operator:latest\n```\n\n### Deploy to OLM via bundle\n\n```shell\nmake manifests\nmake bundle IMG=quay.io/$repo/proactive-node-scaling-operator:latest\noperator-sdk bundle validate ./bundle --select-optional name=operatorhub\nmake bundle-build BUNDLE_IMG=quay.io/$repo/proactive-node-scaling-operator-bundle:latest\ndocker login quay.io/$repo/proactive-node-scaling-operator-bundle\ndocker push quay.io/$repo/proactive-node-scaling-operator-bundle:latest\noperator-sdk bundle validate quay.io/$repo/proactive-node-scaling-operator-bundle:latest --select-optional name=operatorhub\noc new-project proactive-node-scaling-operator\noc label namespace proactive-node-scaling-operator openshift.io/cluster-monitoring=\"true\"\noperator-sdk cleanup proactive-node-scaling-operator -n proactive-node-scaling-operator\noperator-sdk run bundle --install-mode AllNamespaces -n proactive-node-scaling-operator quay.io/$repo/proactive-node-scaling-operator-bundle:latest\n```\n\n### 
Testing\n\nCreate the following resources:\n\n```shell\noc new-project proactive-node-scaling-operator-test\noc apply -f ./test/ai-ml-watermark.yaml -n proactive-node-scaling-operator-test\noc apply -f ./test/zone-watermark.yaml -n proactive-node-scaling-operator-test\n```\n\n### Releasing\n\n```shell\ngit tag -a \"\u003ctagname\u003e\" -m \"\u003ccommit message\u003e\"\ngit push upstream \u003ctagname\u003e\n```\n\nIf you need to remove a release:\n\n```shell\ngit tag -d \u003ctagname\u003e\ngit push upstream --delete \u003ctagname\u003e\n```\n\nIf you need to \"move\" a release to the current main:\n\n```shell\ngit tag -f \u003ctagname\u003e\ngit push upstream -f \u003ctagname\u003e\n```\n\n### Cleaning up\n\n```shell\noperator-sdk cleanup proactive-node-scaling-operator -n proactive-node-scaling-operator\noc delete operatorgroup operator-sdk-og\noc delete catalogsource proactive-node-scaling-operator-catalog\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fredhat-cop%2Fproactive-node-scaling-operator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fredhat-cop%2Fproactive-node-scaling-operator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fredhat-cop%2Fproactive-node-scaling-operator/lists"}