https://github.com/simple-dev-tools/spark-scaffold
A framework to develop production grade Spark jobs
- Host: GitHub
- URL: https://github.com/simple-dev-tools/spark-scaffold
- Owner: simple-dev-tools
- Created: 2020-04-25T07:44:37.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2020-05-23T09:38:11.000Z (about 6 years ago)
- Last Synced: 2025-02-15T11:48:44.685Z (over 1 year ago)
- Language: Scala
- Homepage:
- Size: 42 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
# Spark Scaffold

A framework to develop production grade Spark jobs.
## The Idea
The key concept of this framework is the `Run Context`, which is made up of three major components:
1. The Spark session, which you use for transformations and actions.
2. The parameters, which are the arguments parsed from the `spark-submit` command line.
3. The config, which should differ from one environment to another.
This framework helps you manage those three components so that you can focus on the actual business logic: DataFrame
transformations.
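
To make the three components concrete, here is a minimal sketch of how a run context could be modelled. It is not the framework's actual API: `Params`, `EnvConfig`, `RunContext`, `RunContextSketch`, the `--env` and `--job` flags, and the example paths are all hypothetical names chosen for illustration. The session is left as a type parameter `S` so the sketch compiles without Spark on the classpath.

```scala
// Hypothetical sketch: none of these names come from the framework itself.

// Arguments parsed from the spark-submit command line.
final case class Params(env: String, jobName: String)

// Environment-specific settings.
final case class EnvConfig(inputPath: String, outputPath: String)

// S stands in for org.apache.spark.sql.SparkSession so that the sketch
// compiles without Spark on the classpath.
final case class RunContext[S](spark: S, params: Params, config: EnvConfig)

object RunContextSketch {
  // Illustrative per-environment config; a real project would more
  // likely load this from configuration files.
  private val configs: Map[String, EnvConfig] = Map(
    "dev"  -> EnvConfig("data/in", "data/out"),
    "prod" -> EnvConfig("s3://example-bucket/in", "s3://example-bucket/out")
  )

  // Parse "--key value" pairs, e.g. Array("--env", "dev", "--job", "wordcount").
  // Missing keys fall back to assumed defaults.
  def parseParams(args: Array[String]): Params = {
    val kv = args.grouped(2).collect {
      case Array(key, value) if key.startsWith("--") => key.drop(2) -> value
    }.toMap
    Params(env = kv.getOrElse("env", "dev"), jobName = kv.getOrElse("job", "example"))
  }

  // Assemble the three components into one value that a job can receive.
  def build[S](spark: S, args: Array[String]): RunContext[S] = {
    val params = parseParams(args)
    val config = configs.getOrElse(
      params.env,
      throw new IllegalArgumentException(s"Unknown environment: ${params.env}")
    )
    RunContext(spark, params, config)
  }
}
```

In a real job, `S` would be `SparkSession`, typically obtained with `SparkSession.builder().appName(...).getOrCreate()`, and each job would take the resulting context as its single input instead of reading arguments or config on its own.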
## Quick Start
To build the project and run the test cases, run:
```bash
sbt test
```
To package the JAR, run:
```bash
make build-spark-jar
```
> The `sbt-assembly` plugin is used to build a fat JAR with dependencies.

To run the example Spark jobs in local mode, run:
```bash
make submit-job
```
## Use it in your project