https://github.com/vaibhavbansal26/data-pipeline-with-airflow
Data Pipeline With Airflow, AWS
- Host: GitHub
- URL: https://github.com/vaibhavbansal26/data-pipeline-with-airflow
- Owner: VaibhavBansal26
- Created: 2024-07-14T23:55:07.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-07-15T23:21:08.000Z (11 months ago)
- Last Synced: 2024-07-17T02:45:59.077Z (11 months ago)
- Language: Python
- Size: 62.5 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Data-Pipeline-With-Airflow
Data Pipeline With Airflow, AWS

A music streaming company, Sparkify, has decided to introduce more automation and monitoring to its data warehouse ETL pipelines and has concluded that the best tool for this is Apache Airflow.

This project creates custom operators that stage the data, fill the data warehouse, and run checks on the data as the final step.
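
The final data-check step could look roughly like the sketch below. It is not the repository's operator: it is plain Python, it uses an in-memory SQLite database as a stand-in for Redshift, and the `users` table, its columns, and the `run_quality_checks` helper are hypothetical names chosen for illustration. In an Airflow custom operator, logic of this kind would normally sit inside the `execute(self, context)` method of a `BaseOperator` subclass.

```python
import sqlite3


def run_quality_checks(conn, checks):
    """Run each check and raise ValueError on the first mismatch.

    Each check is a dict with a "sql" query that returns a single value
    and the "expected" value for that query.
    """
    for check in checks:
        result = conn.execute(check["sql"]).fetchone()[0]
        if result != check["expected"]:
            raise ValueError(
                f"Data quality check failed: {check['sql']!r} "
                f"returned {result}, expected {check['expected']}"
            )
    return len(checks)


# Hypothetical table and rows, used only to exercise the check.
# SQLite stands in for Redshift here (assumption for a runnable example).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userid INTEGER, level TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "free"), (2, "paid")])

checks = [
    {"sql": "SELECT COUNT(*) FROM users WHERE userid IS NULL", "expected": 0},
]
passed = run_quality_checks(conn, checks)
```

Passing the checks in as data keeps the step reusable: new tables can be covered by adding entries to the list rather than editing the check logic.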

Main file -> Airflow -> dags

Other file -> Airflow -> Plugins
1. helpers
2. operators

- Copying Data from S3
- Configuring Redshift
- Configuring Airflow
- Setting Up Airflow Connections with AWS
- Creating the DAG
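
As an illustration of the first step, copying data from S3 into Redshift usually comes down to issuing a Redshift `COPY` statement. The sketch below only builds that statement as a string. The `build_copy_sql` helper, the table name, the bucket, the key, the IAM role ARN, and the default region are all hypothetical placeholders, not values taken from this repository.

```python
def build_copy_sql(table, s3_bucket, s3_key, iam_role_arn,
                   json_option="auto", region="us-west-2"):
    """Build a Redshift COPY statement that loads JSON files from S3.

    All arguments are placeholders supplied by the caller; nothing here
    connects to AWS or executes the statement.
    """
    s3_path = f"s3://{s3_bucket}/{s3_key}"
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS JSON '{json_option}'\n"
        f"REGION '{region}';"
    )


# Hypothetical values for illustration only.
sql = build_copy_sql(
    table="staging_events",
    s3_bucket="example-bucket",
    s3_key="log_data",
    iam_role_arn="arn:aws:iam::123456789012:role/example-role",
)
print(sql)
```

In a staging operator, a string like this would be passed to whatever Redshift connection the Airflow setup provides. Authorizing with an IAM role is one option; access keys stored in an Airflow connection are another.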