Short guideline on setting up the spark-operator on minikube.

Repository: https://github.com/makism/spark-on-k8s
# Preparation
## k8s
Install `minikube` and fetch the `helm` binary.
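For example, on Linux x86_64 the official install routes look roughly like this (paths and package managers vary by platform, so treat this as a sketch):

```bash
# minikube: download the release binary and place it on the PATH.
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

# helm: the project's official installer script.
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
```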
## minikube
```bash
minikube start --memory 8192 --cpus 4
minikube kubectl -- get pods -A
minikube dashboard
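
# Optional sanity check: confirm the node and core services are healthy.
minikube status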
```

## Prometheus stack

Add the `prometheus-community` Helm repo and install `kube-prometheus-stack` with the values file from this repo:
```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus-comm prometheus-community/kube-prometheus-stack -f helms/prometheus.yaml
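
# The stack takes a minute to start; watch the pods until they are Running.
minikube kubectl -- get pods -w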
```

The chart generates Grafana admin credentials and stores them in a Kubernetes secret; decode them with:

```bash
minikube kubectl -- get secret prometheus-comm-grafana -o json | jq '.data | map_values(@base64d)'
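
# Or grab just the admin password (the key name below is the Grafana chart's default).
minikube kubectl -- get secret prometheus-comm-grafana -o jsonpath='{.data.admin-password}' | base64 -d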
```

Port-forward the Grafana service to reach the dashboard:

```bash
minikube kubectl -- port-forward -n default svc/prometheus-comm-grafana 9080:80
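
# Grafana now answers on http://localhost:9080; log in with the credentials
# decoded above. Keep this command running in its own terminal.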
```

## spark-operator
Install the `spark-operator` from the helm chart repository:
```bash
helm repo add spark-operator https://kubeflow.github.io/spark-operator
helm install spark-operator spark-operator/spark-operator \
  --namespace spark-operator \
  --create-namespace
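
# Verify that the operator's controller pod comes up.
minikube kubectl -- get pods -n spark-operator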
```

Create a service account for the Spark driver and bind it to the `edit` cluster role:

```bash
minikube kubectl -- create serviceaccount spark --namespace=default
minikube kubectl -- create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
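
# Optional: confirm the binding lets the account create pods.
minikube kubectl -- auth can-i create pods --as=system:serviceaccount:default:spark --namespace=default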
```

## Prometheus ↔️ JMX

Apply the manifests under `deploy/prometheus/` to wire Prometheus up to the Spark JMX metrics:
```bash
minikube kubectl -- apply -f deploy/prometheus/
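
# For orientation only: a PodMonitor in this style is what such a directory
# typically holds (labels, names, and the port are illustrative assumptions,
# not the repo's actual manifest).
cat <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: spark-jmx
  labels:
    release: prometheus-comm   # must match the Prometheus instance's selector
spec:
  selector:
    matchLabels:
      spark-role: driver
  podMetricsEndpoints:
    - port: jmx-exporter       # named container port exposing JMX metrics
      interval: 15s
EOF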
```

## Pushgateway

Apply the Pushgateway manifests under `deploy/pushgateway/`:
```bash
minikube kubectl -- apply -f deploy/pushgateway/
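
# Hypothetical smoke test, assuming the manifests expose a Service named
# "pushgateway" on port 9091 (adjust to the actual service name):
minikube kubectl -- port-forward svc/pushgateway 9091:9091 &
echo 'spark_smoke_test 1' | curl --data-binary @- http://localhost:9091/metrics/job/smoke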
```

# Submit a job
The current example uses a bare-bones `Dockerfile` to build the container image.
Feel free to adapt it to your needs, but don't forget to update the `image` field in the deployment file `deploy/spark/basic_pyspark_job.yaml` to match.
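For orientation, a minimal `SparkApplication` manifest in the spirit of `deploy/spark/basic_pyspark_job.yaml` might look like the sketch below; the name, image, and paths are illustrative assumptions, and `--dry-run=client` only validates the object without creating it:

```bash
minikube kubectl -- apply --dry-run=client -f - <<'EOF'
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: basic-pyspark-job
  namespace: default
spec:
  type: Python
  mode: cluster
  image: my-registry/basic-pyspark:latest   # must match the image you built
  mainApplicationFile: local:///opt/spark/app/main.py
  sparkVersion: "3.5.0"
  driver:
    cores: 1
    memory: 512m
    serviceAccount: spark                   # the account created earlier
  executor:
    instances: 1
    cores: 1
    memory: 512m
EOF
```

You can then submit the Spark job: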
```bash
minikube kubectl -- apply -f deploy/spark/basic_pyspark_job.yaml
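
# Then follow the application's state and driver logs ("basic-pyspark-job" is
# an assumed name; use the one set in the manifest's metadata).
minikube kubectl -- get sparkapplications
minikube kubectl -- logs -f basic-pyspark-job-driver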
```