Projects in Awesome Lists tagged with distributed-processing
A curated list of projects in awesome lists tagged with distributed-processing.
https://github.com/stefanofioravanzo/evolving-wikipedia-graph
Distributed processing of Wikipedia history files using Hadoop and Spark
distributed-processing hadoop-hdfs spark wikipedia
Last synced: 12 Mar 2025
https://github.com/lilivalgo/smartcity
Simulates a real-time Smart City data pipeline with Kafka, Apache Spark, and S3. Streams and processes vehicle, GPS, weather, traffic, and emergency data with Dockerized components and Parquet storage for efficient, scalable data engineering.
apache-spark aws-s3 data-pipeline distributed-processing docker parquet-storage real-time-streaming
Last synced: 04 Feb 2026
https://github.com/mohithavelagapudi/imdb-movie-dataset-analysis-recommendation-system
An end-to-end data engineering and analysis project to process a large-scale movie dataset, derive actionable business insights using Apache Spark, and build a content-based recommendation system.
apache-spark distributed-processing exploratory-data-analysis movie-recomendation-system scala streamlit
Last synced: 02 Nov 2025
https://github.com/leonardoguths/ippd
Repository containing code and other material from the Introduction to Parallel and Distributed Processing class.
c distributed-processing mpi openmp parallel-processing
Last synced: 14 May 2025
https://github.com/yousefmohammad/java_chatapp
Distributed and centralized Java chat app
distributed-processing distributed-systems java swing
Last synced: 30 Nov 2025