https://github.com/mrcolorr/supreme-pancake
Big Data Management project: data collection from a network of sensors is simulated with Kafka, then processed with Spark and stored in Cassandra in a distributed and efficient way.
big-data bigdata cassandra cassandra-cluster cassandra-database cloud cloud-computing distributed-computing distributed-database distributed-storage distributed-systems hdfs kafka maven maven-pom spark zerotier zerotier-network zerotier-one
- Host: GitHub
- URL: https://github.com/mrcolorr/supreme-pancake
- Owner: MRColorR
- Created: 2021-09-24T13:45:10.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-05-24T16:08:56.000Z (almost 2 years ago)
- Last Synced: 2025-04-09T12:21:56.613Z (about 1 month ago)
- Topics: big-data, bigdata, cassandra, cassandra-cluster, cassandra-database, cloud, cloud-computing, distributed-computing, distributed-database, distributed-storage, distributed-systems, hdfs, kafka, maven, maven-pom, spark, zerotier, zerotier-network, zerotier-one
- Language: Java
- Homepage:
- Size: 3.59 MB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# supreme-pancake
Repo for Big Data Management project.

Three components were created in this project: a producer / data collector (Kafka), a distributed database (Cassandra), and a consumer / data processor (Spark).
The collection of data from a network of sensors was simulated; the data then had to be processed and stored in a distributed and efficient way. The data collected (or generated) by Kafka were processed by Spark and saved to Cassandra for long-term archiving.
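
The data flow described above (simulate, process, store) can be illustrated with a minimal, plain-Java sketch. It is not the project's code and it does not call any Kafka, Spark, or Cassandra API: each stage is a stand-in method, and the class name, the `Reading` fields, the generated value distribution, and the per-sensor average are all assumptions chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

public class SensorPipelineSketch {

    /** One simulated measurement; the field names are hypothetical. */
    static final class Reading {
        final String sensorId;
        final long timestampMs;
        final double value;

        Reading(String sensorId, long timestampMs, double value) {
            this.sensorId = sensorId;
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    /** Producer stage (the role Kafka plays): generate readings for each sensor. */
    static List<Reading> simulate(int sensors, int readingsPerSensor, long seed) {
        Random random = new Random(seed); // fixed seed keeps runs repeatable
        List<Reading> readings = new ArrayList<>();
        for (int tick = 0; tick < readingsPerSensor; tick++) {
            for (int s = 0; s < sensors; s++) {
                // Assumed distribution: values scattered around 20.0
                double value = 20.0 + 5.0 * random.nextGaussian();
                readings.add(new Reading("sensor-" + s, 1000L * tick, value));
            }
        }
        return readings;
    }

    /** Processing stage (the role Spark plays): average value per sensor. */
    static Map<String, Double> averageBySensor(List<Reading> readings) {
        // Each array holds {running sum, count} for one sensor
        Map<String, double[]> sumAndCount = new TreeMap<>();
        for (Reading r : readings) {
            double[] acc = sumAndCount.computeIfAbsent(r.sensorId, k -> new double[2]);
            acc[0] += r.value;
            acc[1] += 1;
        }
        Map<String, Double> averages = new TreeMap<>();
        for (Map.Entry<String, double[]> e : sumAndCount.entrySet()) {
            averages.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    /** Storage stage (the role Cassandra plays): render results as row-like lines. */
    static List<String> toRows(Map<String, Double> averages) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, Double> e : averages.entrySet()) {
            rows.add(String.format(Locale.ROOT, "%s,%.2f", e.getKey(), e.getValue()));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Reading> readings = simulate(3, 100, 42L);
        for (String row : toRows(averageBySensor(readings))) {
            System.out.println(row);
        }
    }
}
```

In a real deployment each method would be replaced by the corresponding component: readings published to a topic, a distributed job doing the aggregation, and rows written to a table. The sketch only shows how the three responsibilities are separated.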
The connection between the PCs has been made simple and scalable using ZeroTier.

- [x] Leave a star ⭐ if you like this project 🙂 thank you.
## What's inside
- Kafka module
- Cassandra DB module
- Spark module
- Data cleaning scripts
- Distributed job start and stop scripts
- Project runme script
- Project document with details