Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/humairarizwan/creating-streaming-data-pipelines-using-airflow
Last synced: 13 days ago
- Host: GitHub
- URL: https://github.com/humairarizwan/creating-streaming-data-pipelines-using-airflow
- Owner: HumairaRizwan
- Created: 2024-06-01T20:19:51.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-06-02T05:59:17.000Z (7 months ago)
- Last Synced: 2024-06-02T22:31:19.079Z (7 months ago)
- Language: Python
- Size: 1.8 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Creating-Streaming-Data-Pipelines-using-Airflow
## Scenario
This project aims to decongest the national highways by analyzing road traffic data from different toll plazas. Each highway is operated by a different toll operator with a different IT setup that uses different file formats. The task is to collect the data available in these formats and consolidate it into a single file.
## Objectives
In this assignment we will author an Apache Airflow DAG that will:
- Extract data from a CSV file
- Extract data from a TSV file
- Extract data from a fixed width file
- Transform the data
- Load the transformed data into the staging area
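
The steps above can be sketched in plain Python using only the standard library. This is a minimal illustration, not the repository's actual code: the sample rows, column positions, fixed-width offsets, and function names are all hypothetical, and the transform (upper-casing one column) is only an assumed example of a transformation. In an Airflow DAG, each function could be wrapped in its own task, for instance with `PythonOperator`.

```python
import csv
import io

# Hypothetical sample inputs. The real files, their columns, and the
# fixed-width offsets are assumptions made for illustration only.
CSV_DATA = "1,Mon Aug 2 2021,101,car\n2,Mon Aug 2 2021,102,truck\n"
TSV_DATA = "1\t2\tPC7\n2\t4\tPC3\n"
FIXED_DATA = "PTE VC965\nPTP VCB43\n"


def extract_csv(text, columns=(0, 1, 2, 3)):
    """Keep the selected columns from comma-separated text."""
    return [[row[i] for i in columns] for row in csv.reader(io.StringIO(text))]


def extract_tsv(text, columns=(1, 2)):
    """Keep the selected columns from tab-separated text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [[row[i] for i in columns] for row in reader]


def extract_fixed_width(text, slices=((0, 3), (4, 9))):
    """Cut each line at fixed character offsets and strip the padding."""
    return [[line[a:b].strip() for a, b in slices]
            for line in text.splitlines() if line]


def consolidate(*sources):
    """Join row i of every source side by side into one output row."""
    return [sum(parts, []) for parts in zip(*sources)]


def transform(rows, column=3):
    """Upper-case one column (assumed here to hold the vehicle type)."""
    return [[v.upper() if i == column else v for i, v in enumerate(row)]
            for row in rows]


def load(rows):
    """Serialize rows as CSV text; writing it to a file would be the staging step."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


def run_pipeline():
    rows = consolidate(
        extract_csv(CSV_DATA),
        extract_tsv(TSV_DATA),
        extract_fixed_width(FIXED_DATA),
    )
    return load(transform(rows))


if __name__ == "__main__":
    print(run_pipeline(), end="")
```

Keeping each step as a small function with plain inputs and outputs makes the stages easy to test in isolation before wiring them into a scheduler.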
## Description
This project is described in detail in a Medium article, which provides a comprehensive explanation of the code.
Read the article on Medium: