Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/manoharvit/ecommerce-dive-deep-sales-analysis
This project implements an ETL pipeline, orchestrated with Apache Airflow, that processes delivery data and tracks delayed shipments. The pipeline downloads data from an AWS S3 bucket, cleans it using Spark/Spark SQL to identify missing delivery deadlines, and uploads the cleaned dataset back to S3, where it can be used to track delivery performance.
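The cleaning step described above might look roughly like the following sketch. It is not taken from the repository: the table name `deliveries`, the columns `order_id`, `deadline`, and `delivered_at`, and the sample rows are all hypothetical, and the standard-library `sqlite3` module stands in for Spark SQL so that the query runs without a cluster. It also assumes one reading of the description, namely that rows without a deadline are dropped and the remaining rows are flagged as delayed when delivery happened after the deadline.

```python
import sqlite3

# Hypothetical sample rows: (order_id, deadline, delivered_at) as ISO date strings.
ROWS = [
    ("A1", "2024-07-01", "2024-06-30"),  # delivered on time
    ("A2", "2024-07-01", "2024-07-03"),  # delivered after the deadline
    ("A3", None, "2024-07-02"),          # deadline missing, skipped by the query
    ("A4", "2024-07-05", None),          # not delivered yet, skipped by the query
]


def find_delayed(rows):
    """Return the order_ids that were delivered after their deadline."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE deliveries (order_id TEXT, deadline TEXT, delivered_at TEXT)"
        )
        conn.executemany("INSERT INTO deliveries VALUES (?, ?, ?)", rows)
        # ISO-8601 dates sort correctly as plain strings, so '>' compares dates.
        cursor = conn.execute(
            """
            SELECT order_id
            FROM deliveries
            WHERE deadline IS NOT NULL
              AND delivered_at IS NOT NULL
              AND delivered_at > deadline
            ORDER BY order_id
            """
        )
        return [row[0] for row in cursor.fetchall()]
    finally:
        conn.close()


delayed = find_delayed(ROWS)
print(delayed)
```

In a Spark-based version, a query of the same shape could presumably be run against a DataFrame registered as a temporary view, with the S3 download and upload handled by separate Airflow tasks before and after it.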
airflow airflow-dags ecommerce elt pyspark s3 s3-bucket spark sql
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/manoharvit/ecommerce-dive-deep-sales-analysis
- Owner: ManoharVit
- Created: 2024-07-16T23:51:25.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-07-25T14:44:07.000Z (5 months ago)
- Last Synced: 2024-10-12T00:03:48.960Z (3 months ago)
- Topics: airflow, airflow-dags, ecommerce, elt, pyspark, s3, s3-bucket, spark, sql
- Language: Jupyter Notebook
- Homepage:
- Size: 134 MB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Security: SECURITY.md