Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/viaduct-ai/docker-spark-k8s-aws
Docker image for running Spark 3 on Kubernetes on AWS
- Host: GitHub
- URL: https://github.com/viaduct-ai/docker-spark-k8s-aws
- Owner: viaduct-ai
- Created: 2021-05-25T00:26:09.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2021-05-26T18:34:16.000Z (over 3 years ago)
- Last Synced: 2024-08-02T14:06:10.819Z (3 months ago)
- Size: 16.6 KB
- Stars: 26
- Watchers: 4
- Forks: 14
- Open Issues: 3
Metadata Files:
- Readme: README.md
README
# spark-k8s-aws
Build for a Kubernetes-ready Apache Spark Docker image configured with notable AWS dependencies, including:
* An up-to-date AWS SDK capable of supporting [IRSA](https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/)
* [AWS Glue Data Catalog client for Hive Metastore](https://github.com/viaduct-ai/aws-glue-data-catalog-client-for-apache-hive-metastore)

## Build the Docker Image

Builds are managed using https://earthly.dev

```
earthly --use-inline-cache +build-spark-image
```

Use in your own Earthfile build:
```
my-image:
    # Remote target reference: <repository path>+<target name>
    FROM github.com/viaduct-ai/docker-spark-k8s-aws+build-spark-image
    # ...
```

## Why?

If you've ever tried building a Spark distribution or image with the AWS Glue Data
Catalog Client for Hive, you know it's a PITA.

This project aims to open source a working Docker image, built using the
amazing [Earthly](https://earthly.dev) tool, to democratize a more integrated
Apache Spark on Kubernetes on AWS experience until someone develops a Spark
DataSourceV2 API-compliant Glue Data Catalog implementation (instead of
this absolute hack of patching Hive and building Spark from source).

Many thanks to @bbenzikry for open sourcing their solution to build Spark 3 +
Glue compatible Docker images. This project builds on their work.
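
To illustrate how the two AWS dependencies listed at the top might be wired together at submit time, here is a minimal sketch of a `spark-submit` invocation. It is not taken from this repository. The image name, Kubernetes API address, service account name, and application path are hypothetical placeholders, and the script assumes the built image has been pushed to a registry your cluster can pull from and that the service account is annotated with an IAM role for IRSA. By default it only prints the command (`DRY_RUN=1`).

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder, not a repo default.
set -eu

IMAGE="${IMAGE:-my-registry.example.com/spark-k8s-aws:latest}"
K8S_MASTER="${K8S_MASTER:-k8s://https://kubernetes.example.com:443}"
SERVICE_ACCOUNT="${SERVICE_ACCOUNT:-spark}"
APP="${APP:-local:///opt/spark/examples/src/main/python/pi.py}"

# Arguments, in order:
#   - run the driver inside the cluster, using the image built from this project
#   - run the driver under a service account (assumed to carry the IRSA role annotation)
#   - let S3A pick up the web identity token that IRSA mounts into the pod
#   - use the Hive catalog, backed by the Glue Data Catalog client factory
set -- \
  --master "$K8S_MASTER" \
  --deploy-mode cluster \
  --conf "spark.kubernetes.container.image=$IMAGE" \
  --conf "spark.kubernetes.authenticate.driver.serviceAccountName=$SERVICE_ACCOUNT" \
  --conf "spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider" \
  --conf "spark.sql.catalogImplementation=hive" \
  --conf "spark.hadoop.hive.metastore.client.factory.class=com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory" \
  "$APP"

SPARK_SUBMIT_CMD="spark-submit $*"

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Print the command instead of running it.
  echo "$SPARK_SUBMIT_CMD"
else
  spark-submit "$@"
fi
```

Set `DRY_RUN=0` to actually submit. Executors may need their own service account or role configuration depending on how your cluster is set up.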