https://github.com/sskender/analysis-of-massive-datasets
Analysis of Massive Datasets FER labs
- Host: GitHub
- URL: https://github.com/sskender/analysis-of-massive-datasets
- Owner: sskender
- Created: 2022-03-08T22:43:20.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2022-06-10T08:55:27.000Z (almost 4 years ago)
- Last Synced: 2025-07-06T21:43:42.037Z (11 months ago)
- Topics: big-data, data-flow, data-flows, frequency-analysis, graph-algorithms, graph-theory, map-reduce, mapreduce, minhash, node-ranking, page-rank, page-ranking, recommendation-system, recommender-system, simhash, similarity-search
- Language: Python
- Homepage:
- Size: 19 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Analysis of Massive Datasets
Lab assignments for the Analysis of Massive Datasets course at FER.
## Course Description
An introduction to the analysis of large datasets. The MapReduce programming model. Finding similar entities. Data flow analysis. Link analysis of data represented as graphs. Finding frequent itemsets. Finding groups in large datasets. Recommendation systems. Social network graph analysis. Web advertising models. Dimensionality reduction. Machine learning with proportional growth.
### Learning Outcomes:
- identify and understand why a problem belongs to the Big Data category
- apply the MapReduce programming model when encountering certain types of problems
- design and evaluate a system for finding similar entities in a large data set
- design and evaluate a system for finding frequent sets in a large data set
- design and evaluate a node ranking system for a very large data set represented by a graph
- design and evaluate a recommendation system
- apply appropriate algorithms to find groups in a large dataset
- apply appropriate algorithms to process data flows
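
As a rough illustration of the MapReduce programming model named in the outcomes above, here is a minimal single-process word count sketch. It is not taken from this repository's lab code; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real framework would distribute the map and reduce steps across machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as a framework would between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence (hypothetical local driver)."""
    mapped = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big graphs"]))
```

Keeping the map and reduce functions free of shared state is what allows a framework to run them in parallel; only the shuffle step needs to see all emitted pairs.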