https://github.com/humairarizwan/creating-streaming-data-pipelines-using-kafka


# Creating-Streaming-Data-Pipelines-using-Kafka
## Scenario
You are a data engineer at a data analytics consulting company. You have been assigned to a project that aims to de-congest the national highways by analyzing road traffic data from different toll plazas. As a vehicle passes a toll plaza, the vehicle’s data, such as `vehicle_id`, `vehicle_type`, `toll_plaza_id`, and `timestamp`, is streamed to Kafka. Your job is to create a data pipeline that collects the streaming data and loads it into a database.
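
To make the record concrete, here is a minimal sketch of how one toll event might be serialized for Kafka and parsed back. The scenario names only the four fields; the comma-separated layout, the field order, the timestamp format, and the names `TollEvent`, `encode`, and `decode` are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed timestamp layout; the scenario does not specify one.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


@dataclass(frozen=True)
class TollEvent:
    """One vehicle passing one toll plaza (hypothetical record type)."""
    timestamp: datetime
    vehicle_id: int
    vehicle_type: str
    toll_plaza_id: int


def encode(event: TollEvent) -> bytes:
    """Serialize an event as a UTF-8, comma-separated Kafka message value."""
    fields = [
        event.timestamp.strftime(TIMESTAMP_FORMAT),
        str(event.vehicle_id),
        event.vehicle_type,
        str(event.toll_plaza_id),
    ]
    return ",".join(fields).encode("utf-8")


def decode(payload: bytes) -> TollEvent:
    """Parse a message value produced by encode() back into a TollEvent."""
    timestamp, vehicle_id, vehicle_type, toll_plaza_id = (
        payload.decode("utf-8").split(",")
    )
    return TollEvent(
        timestamp=datetime.strptime(timestamp, TIMESTAMP_FORMAT),
        vehicle_id=int(vehicle_id),
        vehicle_type=vehicle_type,
        toll_plaza_id=int(toll_plaza_id),
    )
```

Kafka message values are raw bytes, so the producer and the consumer have to agree on one encoding like this; a JSON payload would work just as well.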

## Objectives
Create a streaming data pipeline by performing these steps:


  • Start a MySQL Database server.

  • Create a table to hold the toll data.

  • Start the Kafka server.

  • Install the Kafka Python driver.

  • Install the MySQL Python driver.

  • Create a topic named toll in Kafka.

  • Download the streaming data generator program.

  • Customize the generator program to stream data to the toll topic.

  • Download and customize the streaming data consumer.

  • Customize the consumer program to write data into the MySQL database table.

  • Verify that the streamed data is being collected in the database table.
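
The steps above can be sketched in Python roughly as follows. This is not the generator or consumer program referred to in the list. It is a minimal illustration that assumes the `kafka-python` and `mysql-connector-python` packages as the two drivers, a Kafka broker on `localhost:9092`, and a MySQL server on `localhost`. The table name `toll_data`, the database name `tolldata`, the credentials, the column types, and the message layout are all placeholders.

```python
import random
import time
from datetime import datetime

TOPIC = "toll"                          # topic name from the objectives
BOOTSTRAP_SERVERS = "localhost:9092"    # assumption: local broker, default port
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumption: not specified in the source

# Assumption: table name, column names, and column types are illustrative.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS toll_data (
    event_time DATETIME,
    vehicle_id INT,
    vehicle_type VARCHAR(15),
    toll_plaza_id INT
)
"""
INSERT_SQL = "INSERT INTO toll_data VALUES (%s, %s, %s, %s)"


def format_message(event_time, vehicle_id, vehicle_type, toll_plaza_id) -> bytes:
    """Build a comma-separated message value for one toll event."""
    stamp = event_time.strftime(TIMESTAMP_FORMAT)
    return f"{stamp},{vehicle_id},{vehicle_type},{toll_plaza_id}".encode("utf-8")


def to_row(payload: bytes) -> tuple:
    """Convert a message value into a parameter tuple for INSERT_SQL."""
    stamp, vehicle_id, vehicle_type, toll_plaza_id = payload.decode("utf-8").split(",")
    return (
        datetime.strptime(stamp, TIMESTAMP_FORMAT),
        int(vehicle_id),
        vehicle_type,
        int(toll_plaza_id),
    )


def create_topic() -> None:
    """Create the topic with one partition and one replica (assumed settings)."""
    # Drivers are imported inside functions so the helpers above work without them.
    from kafka.admin import KafkaAdminClient, NewTopic  # pip install kafka-python

    admin = KafkaAdminClient(bootstrap_servers=BOOTSTRAP_SERVERS)
    admin.create_topics([NewTopic(name=TOPIC, num_partitions=1, replication_factor=1)])
    admin.close()


def run_generator(count: int = 10) -> None:
    """Send `count` random toll events to the topic, about one per second."""
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=BOOTSTRAP_SERVERS)
    for _ in range(count):
        message = format_message(
            datetime.now(),
            random.randint(10000, 9999999),          # made-up ID range
            random.choice(["car", "truck", "van"]),  # made-up vehicle types
            random.randint(4000, 4010),              # made-up plaza IDs
        )
        producer.send(TOPIC, message)
        time.sleep(1)
    producer.flush()  # make sure buffered messages reach the broker
    producer.close()


def run_consumer() -> None:
    """Read messages from the topic and insert each one into MySQL."""
    import mysql.connector  # pip install mysql-connector-python
    from kafka import KafkaConsumer

    # Assumption: placeholder credentials and database name.
    connection = mysql.connector.connect(
        host="localhost", user="root", password="change-me", database="tolldata"
    )
    cursor = connection.cursor()
    cursor.execute(CREATE_TABLE_SQL)

    consumer = KafkaConsumer(TOPIC, bootstrap_servers=BOOTSTRAP_SERVERS)
    for message in consumer:  # blocks and loops until interrupted
        cursor.execute(INSERT_SQL, to_row(message.value))
        connection.commit()   # per-message commit: simple, not tuned for throughput
```

With both servers running, calling `create_topic()` once, then `run_consumer()` in one terminal and `run_generator()` in another, should fill the table. A query such as `SELECT COUNT(*) FROM toll_data;` in the MySQL client is one way to carry out the final verification step.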