https://github.com/newfront/docker-spark-base
Creates a customizable base image for working with Apache Spark
- Host: GitHub
- URL: https://github.com/newfront/docker-spark-base
- Owner: newfront
- License: apache-2.0
- Created: 2021-07-10T21:50:30.000Z
- Default Branch: main
- Last Pushed: 2021-11-09T01:56:52.000Z
- Last Synced: 2025-01-29T23:26:43.257Z
- Language: Dockerfile
- Size: 7.81 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# docker-spark-base
Creates a customizable base image for working with Apache Spark
## Build Phases
### Download Source Phase
- Alpine Linux (no-cache)
  * Step 1. Download the source tarball for the official tagged Spark release on GitHub.
  * Step 2. Untar the archive and clean up.
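The two steps above can be sketched in plain shell. This is only an illustration, not the repository's actual Dockerfile instructions: the function names, the default target directory, and the use of `wget` are assumptions, and the URL follows GitHub's standard archive path for the `apache/spark` tags.

```shell
#!/bin/sh
# Hypothetical helper (not from the repository): prints the GitHub archive URL
# for a tagged Apache Spark release, e.g. tag v3.2.0.
spark_source_url() {
  printf 'https://github.com/apache/spark/archive/refs/tags/v%s.tar.gz\n' "$1"
}

# Hypothetical helper: downloads the tarball, unpacks it into a target
# directory, and removes the archive afterwards.
fetch_spark_source() {
  version="$1"
  dest="${2:-/tmp/spark-src}"            # assumed default location
  archive="/tmp/spark-${version}.tar.gz"

  mkdir -p "$dest" &&
    wget -q -O "$archive" "$(spark_source_url "$version")" &&
    tar -xzf "$archive" -C "$dest" --strip-components=1 &&
    rm -f "$archive"
}
```

Calling `fetch_spark_source 3.2.0 /tmp/spark-src` would leave the unpacked source tree in `/tmp/spark-src` with no tarball left behind, which keeps the download layer small.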
### Maven Building and Packaging Phase
- Maven + JDK 11
  * Step 1. Run the `mvn package` phase (this takes a while, but it compiles all the Spark packages cleanly).
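As a rough sketch of what the packaging step might look like when run by hand: the function name `build_spark`, the `MVN` override variable, and the exact flags are assumptions rather than the repository's actual instructions. The `-DskipTests clean package` goals and the `MAVEN_OPTS` values are modelled on the kind of settings Spark's build documentation suggests.

```shell
#!/bin/sh
# Hypothetical helper (not from the repository): runs the Maven package phase
# inside an unpacked Spark source tree.
build_spark() {
  src_dir="$1"

  # Assumed memory settings; an existing MAVEN_OPTS value takes precedence.
  export MAVEN_OPTS="${MAVEN_OPTS:--Xss64m -Xmx2g -XX:ReservedCodeCacheSize=1g}"

  # MVN can be overridden (for example with `echo`) to preview the arguments
  # without starting a real build.
  (cd "$src_dir" && "${MVN:-mvn}" -DskipTests clean package)
}
```

Running `MVN=echo build_spark /tmp/spark-src` prints the Maven arguments instead of compiling, which is a cheap way to check the invocation before committing to the long build.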
### Final Image Phase
- `openjdk:11-jre-slim`

This is the final Spark image. It is based on the Debian buster slim Linux image. Build it by passing the Spark version and the Spark user as build arguments:
~~~
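# Spark release to build
export SPARK_VERSION=3.2.0
# Value passed to the image as the spark_user build argument
export SPARK_USER=500

# Build from the repository root and tag the image with the Spark version
docker build . \
  --build-arg spark_version="${SPARK_VERSION}" \
  --build-arg spark_user="${SPARK_USER}" \
  --tag "$(whoami)/docker-spark-base:${SPARK_VERSION}"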
~~~