https://github.com/project-codeflare/instascale
On-demand Kubernetes/OpenShift cluster scaling and aggregated resource provisioning
- Host: GitHub
- URL: https://github.com/project-codeflare/instascale
- Owner: project-codeflare
- License: apache-2.0
- Created: 2022-12-12T17:51:52.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-02-21T09:35:33.000Z (10 months ago)
- Last Synced: 2024-02-21T10:33:53.061Z (10 months ago)
- Language: Go
- Homepage:
- Size: 66.3 MB
- Stars: 7
- Watchers: 5
- Forks: 19
- Open Issues: 42
Metadata Files:
- Readme: README.md
- License: LICENSE
# InstaScale
[![Go](https://github.com/project-codeflare/instascale/actions/workflows/go.yml/badge.svg?branch=main)](https://github.com/project-codeflare/instascale/actions/workflows/go.yml)
InstaScale is a controller that works with Multi-cluster-app-dispatcher (MCAD) to make aggregated resources available in the Kubernetes cluster without creating pending pods. It uses [machinesets](https://github.com/openshift/machine-api-operator) to launch instances on the cloud provider, which are then added to the Kubernetes cluster.
Key features:
- Acquires the aggregated heterogeneous instances needed for workload execution.
- Does not clog the Kubernetes control plane.
- Works with your Kubernetes scheduling system to schedule pods on aggregated resources.
- Terminates instances on workload completion.
# InstaScale and MCAD interaction
- The user submits multi-GPU job(s).
- The job(s) land in the MCAD queue.
- When resources are not available, MCAD triggers scaling, i.e. it calls InstaScale.
- InstaScale looks at the resource requests specified by the user and matches those with the desired MachineSet(s) to get nodes.
- After InstaScal-ing, when aggregate resources are available to run the job, MCAD dispatches the job.
- When the job completes, the resources obtained for the job are released.
# Development
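For orientation when working on the controller, the InstaScale and MCAD interaction described earlier can be sketched in Go as follows. This is a simplified model and not the controller's actual code: the `MachineSet` struct, the `scaleUp` and `release` functions, and the instance type names are all hypothetical and exist only for illustration. It assumes that a request maps instance types to instance counts and that scaling means adjusting the replica count of a matching MachineSet.

```go
package main

import "fmt"

// MachineSet is a hypothetical, simplified stand-in for an OpenShift
// MachineSet. Only the fields needed for this sketch are modeled.
type MachineSet struct {
	Name         string
	InstanceType string
	Replicas     int
}

// adjust returns a copy of sets in which the first MachineSet matching each
// requested instance type has its replicas changed by sign*count. It returns
// an error if a requested type has no matching MachineSet; the input slice
// is never modified.
func adjust(sets []MachineSet, request map[string]int, sign int) ([]MachineSet, error) {
	out := make([]MachineSet, len(sets))
	copy(out, sets)
	for instanceType, count := range request {
		matched := false
		for i := range out {
			if out[i].InstanceType == instanceType {
				out[i].Replicas += sign * count
				matched = true
				break
			}
		}
		if !matched {
			return nil, fmt.Errorf("no MachineSet offers instance type %q", instanceType)
		}
	}
	return out, nil
}

// scaleUp models acquiring the instances a queued job asks for.
func scaleUp(sets []MachineSet, request map[string]int) ([]MachineSet, error) {
	return adjust(sets, request, +1)
}

// release models returning those instances once the job completes.
func release(sets []MachineSet, request map[string]int) ([]MachineSet, error) {
	return adjust(sets, request, -1)
}

func main() {
	// Example values only; names and instance types are assumptions.
	sets := []MachineSet{
		{Name: "gpu-workers", InstanceType: "g4dn.xlarge", Replicas: 0},
		{Name: "cpu-workers", InstanceType: "m5.xlarge", Replicas: 2},
	}
	request := map[string]int{"g4dn.xlarge": 2, "m5.xlarge": 1}

	scaled, err := scaleUp(sets, request)
	if err != nil {
		fmt.Println("cannot scale:", err)
		return
	}
	fmt.Println("after scale-up:", scaled)

	released, err := release(scaled, request)
	if err != nil {
		fmt.Println("cannot release:", err)
		return
	}
	fmt.Println("after release:", released)
}
```

The sketch rejects the whole request when any instance type has no matching MachineSet, which mirrors the idea that a job is dispatched only once its aggregate resources are available.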
## Pre-requisites
- Go version 1.19 installed
- A running OpenShift cluster
## Building
- To build locally: `make build`
- To run locally: `make run`
## Image creation
- To build and release a Docker image for the controller: `make IMG=quay.io/project-codeflare/instascale: image-build image-push`
- Note that the other contents of the Makefile (as well as the `config` and `bin` dirs) exist for future operator development and are not currently used.
## Deployment
- Deploy InstaScale (latest) using: `make deploy`
- Optionally, to deploy a custom image of InstaScale, you can use the `custom-deploy` make target to build, push, and deploy your image of InstaScale on your Kubernetes cluster:
```
make custom-deploy ENGINE= IMG=quay.io//instascale:
```
Note: This assumes you are logged into your quay.io account on your local machine, and your kubeconfig is pointing to the cluster you want to deploy InstaScale on.
## Running an InstaScale deployment locally with Visual Studio Code
- Deploy MCAD using the steps [here](https://github.com/project-codeflare/multi-cluster-app-dispatcher/blob/main/doc/deploy/deployment.md).
- In Visual Studio Code, update `.vscode/launch.json` so that `"KUBECONFIG"` points to your Kubernetes config file.
- If you changed the namespace in `config/default/kustomization.yaml`, update the `args[]` in `launch.json` to include `"--configs-namespace=", "--ocm-secret-namespace="`.
- You can now run the local deployment with the debugger.
## Running locally with an OSD cluster
Running InstaScale locally against an OSD cluster requires additional steps beyond those above.
- Add the `instascale-ocm-secret`:
  - Get your API token from [here](https://console.redhat.com/openshift/token)
  - Navigate to Workloads -> Secrets
  - Set your project to `instascale-system`
  - Click Create -> Key/value secret
  - Secret name: `instascale-ocm-secret`
  - Key: `token`
  - Value: ``
  - Click Create
## Scaling Machines with a Self-Managed OCP Cluster using AWS
To scale machines of a certain type, you need to create a `MachineSet` by following [this guide](https://docs.openshift.com/container-platform/4.11/machine_management/creating_machinesets/creating-machineset-aws.html).
- On your Cluster Dashboard go to `Compute` -> `Create MachineSet`.
- Paste in the `MachineSet` you created from the guide and click `Create`.
- Your `MachineSet` should now appear.
- Attempt to scale machines of the same machine type as your `MachineSet` template using `InstaScale`.
- The `MachineSet` replicas should increase by the number of replicas you have specified.
## Testing
Run tests with command:
```
go test -v ./controllers/
```
## Release process
Prerequisites:
- Build and release [MCAD](https://github.com/project-codeflare/multi-cluster-app-dispatcher).
- Make sure that the MCAD version is published on the [Go package site](https://pkg.go.dev/github.com/project-codeflare/multi-cluster-app-dispatcher?tab=versions).

1. Run the [instascale-release.yml](https://github.com/project-codeflare/instascale/actions/workflows/instascale-release.yml) action.
2. Verify that the [instascale-release.yml](https://github.com/project-codeflare/instascale/actions/workflows/instascale-release.yml) action completed successfully.