https://github.com/underpostnet/spark-template.g8
Spark template runner suite
- Host: GitHub
- URL: https://github.com/underpostnet/spark-template.g8
- Owner: underpostnet
- License: cc0-1.0
- Created: 2025-06-26T06:15:10.000Z (12 months ago)
- Default Branch: master
- Last Pushed: 2025-11-04T20:32:13.000Z (8 months ago)
- Last Synced: 2025-11-04T22:19:21.352Z (8 months ago)
- Topics: gpu-acceleration, kubeflow, ml-ops, spark
- Language: Scala
- Homepage: https://www.nexodev.org
- Size: 90.8 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
## Spark on Kubernetes Template

This project provides a template for building, testing, and deploying Scala Spark applications on Kubernetes using the Spark Operator.
### Core Features
- **Scala & sbt**: Leverages `sbt` for a standard Scala project structure, dependency management, and build automation.
- **Containerized Workflows**: A multi-stage `Dockerfile` ensures a lean and optimized Docker image for your Spark application, based on official Apache Spark images.
- **Kubernetes-Native Deployment**: Designed for seamless deployment and management via the Spark on Kubernetes Operator, utilizing `SparkApplication` custom resources.
- **Integrated Testing**:
  - Supports unit and integration testing with `scalatest` and `spark-fast-tests`.
  - Includes a `TestRunner` that enables **in-cluster testing**, allowing your test suites to execute directly on the Spark cluster in an environment identical to production.
- **GPU Acceleration Ready**: Pre-configured with RAPIDS Accelerator for Apache Spark, enabling GPU-accelerated Spark SQL and DataFrame operations.
- **Pre-configured RBAC**: Includes necessary Kubernetes Role-Based Access Control (`spark-rbac.yaml`) to grant the Spark driver the permissions required to create and manage its executor pods.
- **Ephemeral Storage Configuration**: Demonstrates how to configure ephemeral storage requests and limits for Spark driver and executor pods.
- **ConfigMap Integration**: Utilizes Kubernetes ConfigMaps (`spark-driver-pod-config`, `spark-executor-pod-config`) to inject pod spec fragments, allowing for flexible and advanced pod configurations.
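
Once the manifests are applied, the RBAC and ConfigMap pieces listed above can be sanity-checked from the command line. The following is a minimal sketch, not part of the template: the ConfigMap names come from this README, while the namespace and service account arguments are assumptions that should be replaced with whatever `spark-rbac.yaml` actually defines. It also assumes the caller is allowed to impersonate the service account via `--as`.

```bash
#!/usr/bin/env bash
# Sketch: verify the cluster-side pieces the Spark driver depends on.
# Arguments are assumptions, not values taken from the template:
#   $1 = namespace, $2 = service account used by the Spark driver.
check_spark_prereqs() {
  local ns="$1"
  local sa="$2"
  local rc=0
  local verb cm

  # The driver must be allowed to create and delete executor pods.
  for verb in create delete; do
    if ! kubectl auth can-i "$verb" pods \
        --as="system:serviceaccount:${ns}:${sa}" -n "$ns" >/dev/null; then
      echo "missing permission: ${verb} pods" >&2
      rc=1
    fi
  done

  # The pod spec fragments are read from these two ConfigMaps.
  for cm in spark-driver-pod-config spark-executor-pod-config; do
    if ! kubectl get configmap "$cm" -n "$ns" >/dev/null 2>&1; then
      echo "missing ConfigMap: ${cm}" >&2
      rc=1
    fi
  done

  return "$rc"
}
```

For example, `check_spark_prereqs default spark && echo "cluster ready"` would report readiness only if both permission checks and both ConfigMap lookups succeed (`default` and `spark` are placeholder values here).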
### How to Use This Template
This project is designed as a [Giter8 template](https://www.foundweekends.org/giter8/index.html), making it easy to generate new Spark projects with all the pre-configured settings.
#### 1. Install sbt
If you don't have sbt installed, follow the instructions on the [official sbt website](https://www.scala-sbt.org/download.html). Ensure your sbt launcher is version 0.13.13 or later, the first release that supports the `sbt new` command.
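
If you want to script the version requirement, a small comparison helper is enough. This is a sketch only: `version_ge` is a hypothetical helper name, and it relies on `sort -V` (version sort) being available, as it is in GNU coreutils.

```bash
#!/usr/bin/env bash
# Sketch: succeed when version $1 is at least version $2.
# Relies on `sort -V`, which orders dotted version strings numerically
# (so 0.13.9 sorts before 0.13.13, unlike a plain string sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

A call such as `version_ge "$(sbt --script-version)" 0.13.13` could then gate the next step, assuming your launcher script supports the `--script-version` flag; if it does not, pass the version string in by hand.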
#### 2. Create a New Project from the Template
Open a terminal and run:
```bash
sbt new underpostnet/spark-template.g8
```
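
By default the command prompts for each template parameter. Giter8 also accepts parameters as `--key=value` arguments, which helps in scripts and CI. The sketch below assumes the template defines the conventional `name` parameter; that is not confirmed by this README, so check the template's `default.properties` for the parameters it really exposes. The `new_spark_project` function and the `DRY_RUN` variable are hypothetical names used for illustration.

```bash
#!/usr/bin/env bash
# Sketch: generate a project without interactive prompts.
# Assumption: the template exposes a `name` parameter (the Giter8 convention).
new_spark_project() {
  project_name="$1"

  if [ -n "${DRY_RUN:-}" ]; then
    # Print the command instead of running it, useful for review or CI logs.
    echo "sbt new underpostnet/spark-template.g8 --name=${project_name}"
  else
    sbt new underpostnet/spark-template.g8 "--name=${project_name}"
  fi
}
```

Running `DRY_RUN=1 new_spark_project my-spark-app` prints the command that would be executed; without `DRY_RUN` set, the function invokes sbt directly.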