Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/undisputed-jay/airflow-etl-pipeline-with-pyspark-and-google-cloud-dataproc

This project automates daily vehicle data processing on Google Cloud using Apache Airflow. It uploads scripts to Google Cloud Storage, runs the PySpark job on Dataproc that corresponds to the current day, and shuts down resources once processing finishes to keep costs down.
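
The flow described above could be sketched roughly as follows. This is a minimal illustration in plain Python, not the repository's actual DAG: the bucket name, the script names, the weekday/weekend split, and the explicit cluster-creation step are all assumptions made for the example, since the description only says that the job depends on the day and that resources are shut down afterwards.

```python
from datetime import date

# Hypothetical names: the description does not list the real bucket or scripts.
BUCKET = "example-vehicle-bucket"
WEEKDAY_SCRIPT = "weekday_job.py"
WEEKEND_SCRIPT = "weekend_job.py"


def pick_script(run_date: date) -> str:
    """Choose the PySpark script for a run date (assumed weekday/weekend split)."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return WEEKEND_SCRIPT if run_date.weekday() >= 5 else WEEKDAY_SCRIPT


def plan_run(run_date: date) -> list[str]:
    """Return the ordered steps for one daily run, as plain strings."""
    script = pick_script(run_date)
    return [
        f"upload scripts to gs://{BUCKET}/scripts/",
        "create Dataproc cluster",  # assumed step; the source only mentions shutdown
        f"submit PySpark job gs://{BUCKET}/scripts/{script}",
        "delete Dataproc cluster",
    ]


if __name__ == "__main__":
    for step in plan_run(date.today()):
        print(step)
```

In an Airflow deployment each of these steps would typically map to its own task, with the day-based choice handled by a branching task, but the exact operators the project uses are not stated here.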
Topics: automated-etl-airflow-dataproc, cost-effective-data-processing, daily-data-analysis-airflow-pyspark

Awesome Lists containing this project