
https://github.com/mrcolorr/supreme-pancake

Big Data Management project: data collection from a network of sensors is simulated with Kafka, then processed with Spark and stored in Cassandra in a distributed and efficient way.

big-data bigdata cassandra cassandra-cluster cassandra-database cloud cloud-computing distributed-computing distributed-database distributed-storage distributed-systems hdfs kafka maven maven-pom spark zerotier zerotier-network zerotier-one




# supreme-pancake
Repository for the Big Data Management project.

This project consists of three components: a producer / data collector (Kafka), a distributed database (Cassandra) and a consumer / data processor (Spark).
Data collection from a network of sensors is simulated; the data then has to be processed and stored in a distributed and efficient way. The data collected (or generated) by Kafka is processed by Spark and saved to Cassandra for long-term archiving.
The connection between the PCs is made simple and scalable using ZeroTier.
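
To make the producer / processor split described above more concrete, here is a minimal, library-free Python sketch of the data flow. It is not the project's actual code: the function names (`simulate_readings`, `average_by_sensor`), the message fields (`sensor_id`, `step`, `temperature`) and the value range are all assumptions chosen for illustration. In the real project the messages would travel through Kafka, the aggregation would run on Spark, and the result would be written to Cassandra.

```python
import json
import random
from collections import defaultdict
from statistics import mean


def simulate_readings(sensor_ids, n_per_sensor, seed=0):
    """Yield one JSON string per simulated sensor reading (the 'producer' role).

    The message schema here is hypothetical, not taken from the project.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    for step in range(n_per_sensor):
        for sensor_id in sensor_ids:
            reading = {
                "sensor_id": sensor_id,
                "step": step,
                "temperature": round(rng.uniform(15.0, 30.0), 2),
            }
            yield json.dumps(reading)


def average_by_sensor(messages):
    """Parse messages and return the mean temperature per sensor (the 'processor' role)."""
    values = defaultdict(list)
    for message in messages:
        record = json.loads(message)
        values[record["sensor_id"]].append(record["temperature"])
    return {sensor_id: round(mean(temps), 2) for sensor_id, temps in values.items()}


if __name__ == "__main__":
    messages = list(simulate_readings(["s1", "s2", "s3"], n_per_sensor=5))
    # A real deployment would persist this result to the database;
    # this sketch only prints it.
    print(average_by_sensor(messages))
```

Keeping the generator and the aggregation as separate functions mirrors the decoupling between the collector and the processor: either side could be swapped for its distributed counterpart without changing the other.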

- [x] Leave a star ⭐ if you like this project 🙂 thank you.

## What's inside
- Kafka module
- Cassandra DB module
- Spark module
- Data cleaning scripts
- Distributed job start and stop scripts
- Project runme script
- Project document with details