https://github.com/humairarizwan/creating-streaming-data-pipelines-using-kafka
- Host: GitHub
- URL: https://github.com/humairarizwan/creating-streaming-data-pipelines-using-kafka
- Owner: HumairaRizwan
- Created: 2024-06-01T10:54:53.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-06-02T05:16:06.000Z (about 1 year ago)
- Last Synced: 2025-02-04T18:40:51.067Z (5 months ago)
- Language: Python
- Size: 1.32 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Creating-Streaming-Data-Pipelines-using-Kafka
## Scenario
You are a data engineer at a data analytics consulting company. You have been assigned to a project that aims to decongest the national highways by analyzing road traffic data from different toll plazas. As a vehicle passes a toll plaza, the vehicle's data, such as `vehicle_id`, `vehicle_type`, `toll_plaza_id`, and `timestamp`, is streamed to Kafka. Your job is to create a data pipeline that collects the streaming data and loads it into a database.
## Objectives
Create a streaming data pipeline by performing these steps:
- Start a MySQL Database server.
- Create a table to hold the toll data.
- Start the Kafka server.
- Install the Kafka Python driver.
- Install the MySQL Python driver.
- Create a topic named `toll` in Kafka.
- Download the streaming data generator program.
- Customize the generator program to stream data to the `toll` topic.
- Download and customize the streaming data consumer.
- Customize the consumer program to write data into the MySQL database table.
- Verify that the streamed data is being collected in the database table.
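
The generator and consumer steps above can be sketched roughly as follows. This is a minimal illustration, not the programs shipped in this repository. It assumes the `kafka-python` and `mysql-connector-python` packages, a broker at `localhost:9092`, and a comma-separated message layout. The database name `tolldata`, the table name `livetolldata`, the `MYSQL_PASSWORD` environment variable, and the vehicle type list are all hypothetical names chosen for the example.

```python
import os
import random
import time
from datetime import datetime

TOPIC = "toll"  # topic name taken from the objectives
VEHICLE_TYPES = ("car", "truck", "van")  # assumed categories, for illustration only

# Assumed table layout: one column per field named in the scenario.
INSERT_SQL = (
    "INSERT INTO livetolldata (`timestamp`, vehicle_id, vehicle_type, toll_plaza_id) "
    "VALUES (%s, %s, %s, %s)"
)


def make_message(vehicle_id, vehicle_type, toll_plaza_id, when=None):
    """Encode one toll event as a comma-separated byte string (assumed format)."""
    when = when or datetime.now()
    ts = when.strftime("%Y-%m-%d %H:%M:%S")
    return f"{ts},{vehicle_id},{vehicle_type},{toll_plaza_id}".encode("utf-8")


def parse_message(raw):
    """Decode a message into a row tuple matching the placeholders in INSERT_SQL."""
    ts, vehicle_id, vehicle_type, plaza_id = raw.decode("utf-8").split(",")
    return ts, int(vehicle_id), vehicle_type, int(plaza_id)


def produce(count=10, bootstrap="localhost:9092"):
    """Send `count` simulated toll events to the topic."""
    from kafka import KafkaProducer  # imported lazily; needs kafka-python

    producer = KafkaProducer(bootstrap_servers=bootstrap)
    for _ in range(count):
        message = make_message(
            random.randint(10000, 99999),   # made-up vehicle id
            random.choice(VEHICLE_TYPES),
            random.randint(4000, 4010),     # made-up toll plaza id
        )
        producer.send(TOPIC, message)
        time.sleep(random.random())         # simulate irregular traffic
    producer.flush()                        # make sure buffered messages are sent


def consume(bootstrap="localhost:9092"):
    """Read events from the topic and insert each one into MySQL."""
    from kafka import KafkaConsumer  # needs kafka-python
    import mysql.connector           # needs mysql-connector-python

    connection = mysql.connector.connect(
        host="localhost",
        user="root",
        password=os.environ.get("MYSQL_PASSWORD", ""),  # hypothetical variable
        database="tolldata",                            # hypothetical database
    )
    cursor = connection.cursor()
    consumer = KafkaConsumer(TOPIC, bootstrap_servers=bootstrap)
    for record in consumer:  # blocks, waiting for new messages
        cursor.execute(INSERT_SQL, parse_message(record.value))
        connection.commit()
```

With the Kafka and MySQL servers running, calling `consume()` in one terminal and `produce()` in another should make rows appear in the table, which covers the final verification step. Committing after every message keeps the sketch simple; batching the commits would reduce load on the database.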