KUDO Spark Operator
- Host: GitHub
- URL: https://github.com/mesosphere/kudo-spark-operator
- Owner: mesosphere
- License: apache-2.0
- Created: 2019-09-12T15:10:54.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2021-04-29T15:33:05.000Z (over 3 years ago)
- Last Synced: 2024-05-22T14:32:04.181Z (6 months ago)
- Topics: ksphere
- Language: Shell
- Homepage: https://kudo.dev/
- Size: 624 KB
- Stars: 10
- Watchers: 66
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
# KUDO Spark Operator
# Developing
### Prerequisites
Required software:
* Docker
* GNU Make 4.2.1 or higher
* sha1sum
* [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
* [KUDO CLI Plugin](https://kudo.dev/docs/#install-kudo-cli) 0.15.0 or higher

For test cluster provisioning and Stub Universe artifact uploads, valid AWS access credentials are required:
* `AWS_PROFILE` **or** `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables should be provided

For pulling private repos, a GitHub token is required:
* generate a [GitHub token](https://help.github.com/en/articles/creating-a-personal-access-token-for-the-command-line)
  and export an environment variable with the token contents: `export GITHUB_TOKEN=`
* or save the token either to `/shared/data-services-kudo/.github_token` or to `~/.ds_kudo_github_token`
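A minimal sketch of the required environment setup, with placeholder values (the keys and token shown are not real and must be replaced with your own):

```bash
# AWS credentials for test cluster provisioning and Stub Universe uploads
export AWS_ACCESS_KEY_ID="<access-key-id>"          # placeholder
export AWS_SECRET_ACCESS_KEY="<secret-access-key>"  # placeholder
# or, alternatively, a named profile:
# export AWS_PROFILE="<profile-name>"

# GitHub token for pulling private repos
export GITHUB_TOKEN="<github-token>"                # placeholder
```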
### Build steps
GNU Make is used as the main build tool and includes the following main targets:
* `make cluster-create` creates a Konvoy or MKE cluster
* `make cluster-destroy` destroys a Konvoy or MKE cluster
* `make clean-all` removes all artifacts produced by targets from local filesystem
* `make docker-spark` builds Spark base image based on Apache Spark 3.0.0
* `make docker-operator` builds the Operator image (and the Spark base image, if it hasn't been built yet)
* `make docker-builder` builds image with required tools to run tests
* `make docker-push` publishes Spark base image and Spark Operator image to DockerHub
* `make test` runs the test suite
* `make clean-docker` removes all files created by `make` during `docker build` goal execution

A typical workflow looks like the following:
```
make clean-all
make cluster-create
make docker-push
make test
make cluster-destroy
```

To run tests on a pre-existing cluster with specified operator and spark images, set the `KUBECONFIG`, `SPARK_IMAGE_FULL_NAME` and `OPERATOR_IMAGE_FULL_NAME` variables:
```
make test KUBECONFIG=$HOME/.kube/config \
SPARK_IMAGE_FULL_NAME=mesosphere/spark:spark-3.0.0-hadoop-2.9-k8s \
OPERATOR_IMAGE_FULL_NAME=mesosphere/kudo-spark-operator:3.0.0-1.1.0
```

# Package and Release
The release process is semi-automated and based on GitHub Actions. To make a new release:
- Copy manifests and docs for KUDO Spark Operator to the [Operators repo](https://github.com/kudobuilder/operators/tree/master/repository/spark), raise a PR, and make sure the CI check is successful
- After the PR is merged, create and push a new tag, e.g:
```
git tag -a v3.0.0-1.1.0 -m "KUDO Spark Operator 3.0.0-1.1.0 release"
```
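Creating the tag locally is not enough on its own; it also has to be pushed, e.g. (assuming the remote is named `origin`):

```bash
git push origin v3.0.0-1.1.0
```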
Pushing the new tag will trigger the [release workflow](.github/workflows/release.yml), which will build the operator package with KUDO and
create a new GitHub release draft with the package attached to it.
- Verify the new release (draft) is created and the operator package is attached as a release asset
- Add the release notes and publish the release
# Installing and using Spark Operator

### Prerequisites
* Kubernetes cluster up and running
* `kubectl` configured to work with provisioned cluster
* [KUDO CLI Plugin](https://kudo.dev/docs/#install-kudo-cli) 0.15.0 or higher

### Installation
To install KUDO Spark Operator, run:
```bash
make install
```

This make target runs the [install_operator.sh](scripts/install_operator.sh) script, which will install Spark Operator and
create Spark Driver roles defined in [specs/spark-driver-rbac.yaml](specs/spark-driver-rbac.yaml). By default, Operator
and Driver roles will be created and configured to run in namespace `spark-operator`. To change the namespace,
provide the `NAMESPACE` parameter to make:
```bash
make install NAMESPACE=test-namespace
```
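Before submitting applications, it can help to verify the operator is up; a quick check (a sketch, assuming the default `spark-operator` namespace):

```bash
# the operator pod should be in Running state
kubectl get pods -n spark-operator
```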
### Submitting Spark Application
To submit a Spark Application and check its status, run:
```bash
# switch to operator namespace, e.g.
kubens spark-operator

# create Spark application
kubectl create -f specs/spark-application.yaml

# list applications
kubectl get sparkapplication

# check application status
kubectl describe sparkapplication mock-task-runner
```

To get started with application monitoring, see also the [monitoring documentation](kudo-spark-operator/docs/latest/monitoring.md)
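Beyond `kubectl describe`, the driver logs can also be tailed directly; a sketch, assuming the operator's usual `<application-name>-driver` pod naming for the `mock-task-runner` example above:

```bash
# follow the driver logs of the sample application (pod name is an assumption)
kubectl logs -f mock-task-runner-driver
```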