https://github.com/garystafford/dataproc-workflow-templates

Demonstration of Google Cloud Dataproc Workflow Templates

dataproc gcp google-cloud-platform hadoop pyspark spark


# Google Cloud Dataproc WorkflowTemplates API Demo

Code repository for the post [Using the Google Cloud Dataproc WorkflowTemplates API to Automate Spark and Hadoop Workloads on GCP](https://programmaticponderings.com/).

## Files
* `template-demo-2.yaml`: Non-parametrized version of workflow template with three jobs, using a managed 3-node Spark cluster
* `template-demo-3.yaml`: Parametrized version of workflow template with one Python-based PySpark job, using a managed 3-node Spark cluster
* `template-demo-4.yaml`: Parametrized version of workflow template with one Python-based PySpark job, using an existing 3-node Spark cluster
* `template-demo-5.yaml`: Parametrized version of workflow template with one Java-based Spark job, using an existing 3-node Spark cluster
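
The main difference between the templates above is cluster placement: a managed cluster is created for the workflow, while an existing cluster is selected by label. The following Python sketch illustrates that difference using field names from the Dataproc WorkflowTemplate resource (`placement.managedCluster`, `placement.clusterSelector`, `jobs`, `parameters`). It is not the content of the YAML files in this repository. The cluster name, label, step ID, parameter name, and bucket path are all hypothetical, and it assumes "3-node" means one master and two workers.

```python
import json


def build_template(use_managed_cluster: bool) -> dict:
    """Return a hypothetical workflow template as a plain dict."""
    if use_managed_cluster:
        # Managed cluster: created for the workflow and deleted afterwards.
        placement = {
            "managedCluster": {
                "clusterName": "demo-cluster",  # hypothetical name
                "config": {
                    # Assumption: "3-node" means 1 master + 2 workers.
                    "masterConfig": {"numInstances": 1},
                    "workerConfig": {"numInstances": 2},
                },
            }
        }
    else:
        # Existing cluster: selected by label (hypothetical label).
        placement = {"clusterSelector": {"clusterLabels": {"env": "demo"}}}

    return {
        "placement": placement,
        "jobs": [
            {
                "stepId": "pyspark-step",  # hypothetical step ID
                "pysparkJob": {
                    # Placeholder value, overridden by the parameter below.
                    "mainPythonFileUri": "gs://your-bucket/job.py",
                },
            }
        ],
        "parameters": [
            {
                "name": "MAIN_PYTHON_FILE",  # hypothetical parameter name
                # Field path that this parameter replaces at instantiation.
                "fields": ["jobs['pyspark-step'].pysparkJob.mainPythonFileUri"],
            }
        ],
    }


managed = build_template(use_managed_cluster=True)
existing = build_template(use_managed_cluster=False)
print(json.dumps(managed["placement"], indent=2))
print(json.dumps(existing["placement"], indent=2))
```

Only the `placement` block changes between the two variants. The job and its parameter stay the same.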