https://github.com/astrolabsoftware/k8s-spark-py
Build spark-py image for Kubernetes
- Host: GitHub
- URL: https://github.com/astrolabsoftware/k8s-spark-py
- Owner: astrolabsoftware
- License: gpl-3.0
- Created: 2023-01-13T16:18:51.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-12-11T14:22:31.000Z (over 2 years ago)
- Last Synced: 2025-02-28T07:49:15.247Z (over 1 year ago)
- Language: Shell
- Size: 29.7 MB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.adoc
- License: LICENSE
README
# k8s-spark-py
Build spark-py image for Kubernetes
# Pre-requisites
```shell
# Retrieve the code
git clone https://github.com/astrolabsoftware/k8s-spark-py
cd k8s-spark-py
```
Optionally, edit `conf.sh` to fine-tune the configuration.
```shell
# Download and unzip Spark binaries
./prereq-install.sh
```
# Build spark-py image for k8s
```shell
./build.sh
```
# Push image to IN2P3 registry
```shell
# Log in to the IN2P3 registry
docker login gitlab-registry.in2p3.fr
./push-image.sh
```
# Customize image
The goal is to stay as close as possible to the standard build procedure documented here:
https://spark.apache.org/docs/latest/running-on-kubernetes.html#docker-images
However, the build can be customized by adding files to the `custom/` directory. These files are copied to SPARK_HOME just before the container images are built. Currently, the `avro`, `hbase` and `kafka` Java libraries are added in order to support https://github.com/astrolabsoftware/fink-broker[fink-broker].
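The copy step described above can be pictured with a minimal sketch. The `overlay_custom` function, the `jars/` subdirectory and the `example-lib.jar` file name are assumptions made for illustration only; the repository's own scripts may perform the copy differently.

```shell
# Hypothetical sketch: copy everything under custom/ on top of SPARK_HOME,
# preserving the relative layout. Not taken from the repository's scripts.
overlay_custom() {
    # $1: directory holding the custom files, $2: target SPARK_HOME
    # "dir/." copies the directory contents rather than the directory itself
    cp -R "$1/." "$2/"
}

# Demo in a throwaway directory, so no real Spark installation is touched
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/custom/jars" "$demo_dir/spark/jars"
touch "$demo_dir/custom/jars/example-lib.jar"   # placeholder for an extra library
touch "$demo_dir/spark/jars/spark-core.jar"     # placeholder for a stock Spark jar

overlay_custom "$demo_dir/custom" "$demo_dir/spark"
ls "$demo_dir/spark/jars"
```

After the call, the placeholder library sits next to the stock jar in the target `jars/` directory, which is the effect the customization mechanism relies on.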
# Automated build
The CI automatically builds the `spark-py` container image and pushes it to the IN2P3 registry on each commit to the Git repository.
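As a rough illustration, a CI job could chain the three scripts documented in this README. This is a sketch under assumptions, not the project's actual CI configuration: the `run_pipeline` function name is hypothetical, and the registry login is assumed to have been done beforehand by the CI environment.

```shell
# Hypothetical CI job body: run the repository's scripts in order and
# stop at the first failure. The function name is illustrative only.
run_pipeline() {
    repo_dir="$1"
    for step in prereq-install.sh build.sh push-image.sh; do
        echo "==> $step"
        # Run each step in a subshell from the repository root
        ( cd "$repo_dir" && "./$step" ) || {
            echo "step failed: $step" >&2
            return 1
        }
    done
}
```

From a checkout of the repository, this would be invoked as `run_pipeline .`; a failing build then prevents the push step from running.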