https://github.com/geraked/bigdata

Implementation of Big Data Analytics Algorithms in Python

Host: GitHub
URL: https://github.com/geraked/bigdata
Owner: geraked
License: mit
Created: 2022-09-20T10:30:31.000Z (about 3 years ago)
Default Branch: master
Last Pushed: 2022-09-20T10:45:38.000Z (about 3 years ago)
Last Synced: 2025-02-22T19:13:52.468Z (8 months ago)
Topics: amirkabir-university, association-rules, big-data, big-data-analytics, bigdata, collaborative-filtering, cs246, data-mining, data-science, frequent-itemset-mining, friendship-algorithm, geraked, graph, kmeans-clustering, locality-sensitive-hashing, rabist, recommender-system, stanford-course, stream-processing, triangle-counting
Language: Jupyter Notebook
Homepage:
Size: 11 MB
Stars: 3
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Big Data Analytics

Python implementations of selected big data analytics algorithms.

| # | Title | Description |
| --- | --- | --- |
| 1 | [Friendship Recommendation](hw1/q1.ipynb) | Suggests new friends to individual users based on their mutual friends, using PySpark. |
| 2 | [Association Rules](hw1/q2.ipynb) | Implementation of the A-priori algorithm for frequent itemset mining and association rule learning. |
| 3 | [Locality-sensitive Hashing](hw1/q3.ipynb) | Implementation of LSH, a technique that hashes similar input items into the same buckets with high probability. |
| 4 | [DGIM Algorithm](hw2/q1.ipynb) | Implementation of the DGIM algorithm for estimating the number of 1s in a binary data stream. |
| 5 | [Recommender System](hw2/q2.ipynb) | Item-based and user-based collaborative filtering using PySpark. |
| 6 | [k-means](hw2/q3.ipynb) | Implementation of the k-means clustering algorithm. |
| 7 | [Triangle Counting](final-project/code.ipynb) | Implementations of the adjacency list model algorithms used in the experiments of the paper "Triangle and Four Cycle Counting with Predictions in Graph Streams". |
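
To illustrate the idea behind item 1 (recommending friends by counting mutual friends), here is a minimal sketch in plain Python. It is not the notebook's PySpark code: the function name `recommend_friends`, the `top_n` parameter, the tie-breaking rule, and the toy graph are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def recommend_friends(friends, top_n=10):
    """Rank the non-friends of each user by number of mutual friends.

    `friends` maps a user ID to the set of that user's friends. The graph
    is assumed to be symmetric (hypothetical input format for this sketch).
    """
    mutual = defaultdict(lambda: defaultdict(int))
    # Each pair of users sharing the friend `common` gains one mutual friend.
    for common, circle in friends.items():
        for a, b in combinations(sorted(circle), 2):
            if b not in friends.get(a, set()):  # skip pairs that are already friends
                mutual[a][b] += 1
                mutual[b][a] += 1
    # Order candidates by mutual-friend count (descending), then by user ID.
    return {
        user: [c for c, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))][:top_n]
        for user, counts in mutual.items()
    }


# Toy friendship graph (made up for illustration).
graph = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2, 4},
    4: {2, 3, 5},
    5: {4},
}
recommendations = recommend_friends(graph)
print(recommendations[1])  # candidates for user 1
```

In a PySpark version, the inner loop would typically become a `flatMap` that emits candidate pairs, followed by a per-pair count. The notebook may structure this differently.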

## Author

**Rabist** - view on [LinkedIn](https://www.linkedin.com/in/rabist)

## Details

- **Course:** Advanced Topics (Big Data Analytics) - MS

- **Teacher:** [Dr. Mostafa HaghirChehreghani](https://aut.ac.ir/cv/2350/Mostafa%20HaghirChehreghani)

- **University:** Amirkabir University of Technology

- **Semester:** Spring 2022

## License

Licensed under [MIT](LICENSE).
