https://github.com/yosrak5/data-streaming

This project involves the development of a robust data engineering pipeline that orchestrates the seamless ingestion, processing, and storage of data.

airflow-dags apache cassandra docker etl kafka python spark

# Data-Streaming

This project builds a data engineering pipeline that ingests, processes, and stores data. The pipeline is written in Python and uses Apache Airflow for workflow automation and Apache Kafka for real-time data streaming. It follows an ETL (Extract, Transform, Load) process and stores the results in Cassandra, a distributed database suited to scalable storage. All components are containerized with Docker, so the same stack can be deployed across environments.
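
To make the flow concrete, here is a minimal sketch of the three ETL stages as plain Python functions. It is not the project's actual code: the record schema, the function names, and the `InMemoryTopic` and `InMemoryTable` classes are all hypothetical stand-ins (for a Kafka topic and a Cassandra table respectively) so the example runs without any services. In the real pipeline, Airflow would presumably schedule the producing step and the consumer would write to Cassandra.

```python
import json
import uuid
from collections import deque
from typing import Deque, Dict, List

# Hypothetical raw record; the README does not specify the data schema.
RAW_RECORD = {
    "name": {"first": "Ada", "last": "Lovelace"},
    "email": "ADA@Example.com ",
    "location": {"city": "London", "country": "UK"},
}


def extract() -> dict:
    """Extract: stand-in for fetching one record from an upstream source."""
    return RAW_RECORD


def transform(raw: dict) -> dict:
    """Transform: flatten nested fields and normalize the email address."""
    return {
        "id": str(uuid.uuid4()),
        "first_name": raw["name"]["first"],
        "last_name": raw["name"]["last"],
        "email": raw["email"].strip().lower(),
        "city": raw["location"]["city"],
        "country": raw["location"]["country"],
    }


class InMemoryTopic:
    """Stand-in for a Kafka topic: a FIFO queue of JSON-encoded messages."""

    def __init__(self) -> None:
        self._messages: Deque[bytes] = deque()

    def send(self, value: dict) -> None:
        # Messages are serialized to bytes, as they would be on a real topic.
        self._messages.append(json.dumps(value).encode("utf-8"))

    def poll(self) -> List[dict]:
        # Drain and decode every pending message in arrival order.
        drained = []
        while self._messages:
            drained.append(json.loads(self._messages.popleft().decode("utf-8")))
        return drained


class InMemoryTable:
    """Stand-in for a Cassandra table keyed by the primary key `id`."""

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}

    def upsert(self, row: dict) -> None:
        # Writing a row with an existing key overwrites it (upsert semantics).
        self.rows[row["id"]] = row


def run_pipeline(topic: InMemoryTopic, table: InMemoryTable) -> int:
    """Run one extract -> transform -> stream -> load cycle; return rows loaded."""
    topic.send(transform(extract()))
    loaded = 0
    for row in topic.poll():
        table.upsert(row)
        loaded += 1
    return loaded
```

Calling `run_pipeline(InMemoryTopic(), InMemoryTable())` pushes one transformed record through the queue and into the table. Keeping each stage a separate function means the stand-ins could later be swapped for a real Kafka producer/consumer and a Cassandra session without touching `transform`.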