Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/piero24/big-data_hw_23-24
Exercises in Java and Spark for the Big Data Computing course at unipd
big-data clustering fft java mapreduce sampling spark streaming
- Host: GitHub
- URL: https://github.com/piero24/big-data_hw_23-24
- Owner: Piero24
- Created: 2024-03-23T20:56:02.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-06-07T20:12:44.000Z (8 months ago)
- Last Synced: 2025-01-01T11:13:28.482Z (about 2 months ago)
- Topics: big-data, clustering, fft, java, mapreduce, sampling, spark, streaming
- Language: Java
- Homepage:
- Size: 14.2 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
[![Last modified](https://img.shields.io/badge/Last%20modified-10--Aug--2021-red)](https://github.com/Piero24/Big-Data_HW_23-24)
# Big-Data_HW_23-24
> academic year 2023-2024 (unipd)
>
> University of Padua

---
## Java and Spark programming exercises for the Big Data course
Homework assigned by the instructor to build basic skills in Java and Spark and to learn the fundamentals of Big Data.
This repository collects the exercises together with their solutions.
The test consists of solving the listed exercises.
## Disclaimer
These exercises should ONLY be used for practicing.
**I AM IN NO WAY RESPONSIBLE FOR MISUSE OF THIS MATERIAL.**
**DO NOT** rely solely on the following exercises for preparation, as the course program may vary over the years.
Use this material exclusively for practice.

## Description
There are 3 different exercises.
- The first exercise covers the MapReduce programming model.
- The second exercise covers clustering with the Farthest-First Traversal algorithm and MapReduce in a sequential way.
- The third exercise covers streaming, implementing the Sticky Sampling and Reservoir Sampling algorithms to detect frequent items in a stream.

### Authors and Copyright
[Pietrobon Andrea](https://github.com/Piero24), [Friso Giovanni](https://github.com/GioFriso), [Agostini Francesco](https://github.com/FrancescoAgostiniUnipd)
### Note
This material will **NOT** be updated in the future.