{"id":14987750,"url":"https://github.com/apache/spark-kubernetes-operator","last_synced_at":"2025-04-05T16:03:43.199Z","repository":{"id":230458240,"uuid":"779091052","full_name":"apache/spark-kubernetes-operator","owner":"apache","description":"Apache Spark Kubernetes Operator","archived":false,"fork":false,"pushed_at":"2025-03-26T15:10:43.000Z","size":768,"stargazers_count":106,"open_issues_count":0,"forks_count":27,"subscribers_count":28,"default_branch":"main","last_synced_at":"2025-03-29T15:02:52.540Z","etag":null,"topics":["java","kubernetes","spark"],"latest_commit_sha":null,"homepage":"https://spark.apache.org/","language":"Java","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apache.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-29T02:44:06.000Z","updated_at":"2025-03-26T18:23:40.000Z","dependencies_parsed_at":"2024-03-29T21:42:50.790Z","dependency_job_id":"1f476bbd-73c8-4801-802e-5e4a2ad4c460","html_url":"https://github.com/apache/spark-kubernetes-operator","commit_stats":{"total_commits":134,"total_committers":6,"mean_commits":"22.333333333333332","dds":"0.26865671641791045","last_synced_commit":"b18776458b8f26db8e43f6ddb9275f3cbdbafffe"},"previous_names":["apache/spark-kubernetes-operator"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fspark-kubernetes-operator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fspark-kubernetes-operator/tags","releases_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/apache%2Fspark-kubernetes-operator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fspark-kubernetes-operator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apache","download_url":"https://codeload.github.com/apache/spark-kubernetes-operator/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247361601,"owners_count":20926641,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["java","kubernetes","spark"],"created_at":"2024-09-24T14:15:20.451Z","updated_at":"2025-04-05T16:03:43.164Z","avatar_url":"https://github.com/apache.png","language":"Java","readme":"# Apache Spark K8s Operator\n\n[![GitHub Actions Build](https://github.com/apache/spark-kubernetes-operator/actions/workflows/build_and_test.yml/badge.svg)](https://github.com/apache/spark-kubernetes-operator/actions/workflows/build_and_test.yml)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Repo Size](https://img.shields.io/github/repo-size/apache/spark-kubernetes-operator)](https://img.shields.io/github/repo-size/apache/spark-kubernetes-operator)\n\nApache Spark™ K8s Operator is a subproject of [Apache Spark](https://spark.apache.org/) and\naims to extend the K8s resource manager to manage Apache Spark applications via the\n[Operator Pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/).\n\n## Building Spark K8s Operator\n\nSpark K8s Operator is built using 
Gradle.\nTo build, run:\n\n```bash\n$ ./gradlew build -x test\n```\n\n## Running Tests\n\n```bash\n$ ./gradlew build\n```\n\n## Build Docker Image\n\n```bash\n$ ./gradlew buildDockerImage\n```\n\n## Install Helm Chart\n\n```bash\n$ ./gradlew spark-operator-api:relocateGeneratedCRD\n\n$ helm install spark-kubernetes-operator --create-namespace -f build-tools/helm/spark-kubernetes-operator/values.yaml build-tools/helm/spark-kubernetes-operator/\n```\n\n## Run Spark Pi App\n\n```bash\n$ kubectl apply -f examples/pi.yaml\n\n$ kubectl get sparkapp\nNAME   CURRENT STATE      AGE\npi     ResourceReleased   4m10s\n\n$ kubectl delete sparkapp/pi\n```\n\n## Run Spark Cluster\n\n```bash\n$ kubectl apply -f examples/prod-cluster-with-three-workers.yaml\n\n$ kubectl get sparkcluster\nNAME   CURRENT STATE    AGE\nprod   RunningHealthy   10s\n\n$ kubectl port-forward prod-master-0 6066 \u0026\n\n$ ./examples/submit-pi-to-prod.sh\n{\n  \"action\" : \"CreateSubmissionResponse\",\n  \"message\" : \"Driver successfully submitted as driver-20240821181327-0000\",\n  \"serverSparkVersion\" : \"4.0.0-preview2\",\n  \"submissionId\" : \"driver-20240821181327-0000\",\n  \"success\" : true\n}\n\n$ curl http://localhost:6066/v1/submissions/status/driver-20240821181327-0000/\n{\n  \"action\" : \"SubmissionStatusResponse\",\n  \"driverState\" : \"FINISHED\",\n  \"serverSparkVersion\" : \"4.0.0-preview2\",\n  \"submissionId\" : \"driver-20240821181327-0000\",\n  \"success\" : true,\n  \"workerHostPort\" : \"10.1.5.188:42099\",\n  \"workerId\" : \"worker-20240821181236-10.1.5.188-42099\"\n}\n\n$ kubectl delete sparkcluster prod\nsparkcluster.spark.apache.org \"prod\" deleted\n```\n\n## Run Spark Pi App on Apache YuniKorn scheduler\n\nIf you have not yet done so, follow [YuniKorn docs](https://yunikorn.apache.org/docs/#install) to install the latest version: \n\n```bash\n$ helm repo add yunikorn https://apache.github.io/yunikorn-release\n\n$ helm repo update\n\n$ helm install yunikorn 
yunikorn/yunikorn --namespace yunikorn --version 1.6.0 --create-namespace --set embedAdmissionController=false\n```\n\nSubmit a Spark app to the YuniKorn-enabled cluster:\n\n```bash\n$ kubectl apply -f examples/pi-on-yunikorn.yaml\n\n$ kubectl describe pod pi-on-yunikorn-0-driver\n...\nEvents:\n  Type    Reason             Age   From      Message\n  ----    ------             ----  ----      -------\n  Normal  Scheduling         14s   yunikorn  default/pi-on-yunikorn-0-driver is queued and waiting for allocation\n  Normal  Scheduled          14s   yunikorn  Successfully assigned default/pi-on-yunikorn-0-driver to node docker-desktop\n  Normal  PodBindSuccessful  14s   yunikorn  Pod default/pi-on-yunikorn-0-driver is successfully bound to node docker-desktop\n  Normal  TaskCompleted      6s    yunikorn  Task default/pi-on-yunikorn-0-driver is completed\n  Normal  Pulled             13s   kubelet   Container image \"apache/spark:4.0.0-preview2\" already present on machine\n  Normal  Created            13s   kubelet   Created container spark-kubernetes-driver\n  Normal  Started            13s   kubelet   Started container spark-kubernetes-driver\n\n$ kubectl delete sparkapp pi-on-yunikorn\nsparkapplication.spark.apache.org \"pi-on-yunikorn\" deleted\n```\n\n## Try the nightly build for testing\n\nAs of now, you can try the nightly build of `spark-kubernetes-operator` as follows:\n\n```bash\n$ helm install spark-kubernetes-operator \\\nhttps://nightlies.apache.org/spark/charts/spark-kubernetes-operator-0.1.0-SNAPSHOT.tgz\n```\n\n## Clean Up\n\nCheck the existing Spark applications and clusters. 
If any exist, delete them.\n\n```bash\n$ kubectl get sparkapp\nNo resources found in default namespace.\n\n$ kubectl get sparkcluster\nNo resources found in default namespace.\n```\n\nRemove the Helm chart and CRDs.\n\n```bash\n$ helm uninstall spark-kubernetes-operator\n\n$ kubectl delete crd sparkapplications.spark.apache.org\n\n$ kubectl delete crd sparkclusters.spark.apache.org\n```\n\nFor nightly builds, also remove the snapshot image.\n\n```bash\n$ docker rmi apache/spark-kubernetes-operator:main-snapshot\n```\n\n## Contributing\n\nPlease review the [Contribution to Spark guide](https://spark.apache.org/contributing.html)\nfor information on how to get started contributing to the project.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fspark-kubernetes-operator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapache%2Fspark-kubernetes-operator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fspark-kubernetes-operator/lists"}