https://github.com/geraked/bigdata
Implementation of Big Data Analytics Algorithms in Python
- Host: GitHub
- URL: https://github.com/geraked/bigdata
- Owner: geraked
- License: mit
- Created: 2022-09-20T10:30:31.000Z
- Default Branch: master
- Last Pushed: 2022-09-20T10:45:38.000Z
- Last Synced: 2025-01-03T14:50:31.319Z
- Topics: amirkabir-university, association-rules, big-data, big-data-analytics, bigdata, collaborative-filtering, cs246, data-mining, data-science, frequent-itemset-mining, friendship-algorithm, geraked, graph, kmeans-clustering, locality-sensitive-hashing, rabist, recommender-system, stanford-course, stream-processing, triangle-counting
- Language: Jupyter Notebook
- Homepage:
- Size: 11 MB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Big Data Analytics
Implementations of several big data analytics algorithms in Python.

| # | Title | Description |
| --- | --- | --- |
| 1 | [Friendship Recommendation](hw1/q1.ipynb) | Suggest new friends to individual users based on their mutual friends using PySpark. |
| 2 | [Association Rules](hw1/q2.ipynb) | Implementation of the A-Priori algorithm for frequent itemset mining and association rule learning. |
| 3 | [Locality-sensitive Hashing](hw1/q3.ipynb) | Implementation of LSH, a technique that hashes similar input items into the same buckets with high probability. |
| 4 | [DGIM Algorithm](hw2/q1.ipynb) | Implementation of the DGIM algorithm for estimating the number of 1s in a dataset. |
| 5 | [Recommender System](hw2/q2.ipynb) | Item-based and user-based collaborative filtering using PySpark. |
| 6 | [k-means](hw2/q3.ipynb) | k-means clustering algorithm. |
| 7 | [Triangle Counting](final-project/code.ipynb) | Implementations of the algorithms for the adjacency list model in the experiments in the paper "Triangle and Four Cycle Counting with Predictions in Graph Streams". |

## Author

**Rabist** - view on [LinkedIn](https://www.linkedin.com/in/rabist)

## Details

- **Course:** Advanced Topics (Big Data Analytics) - MS
- **Teacher:** [Dr. Mostafa HaghirChehreghani](https://aut.ac.ir/cv/2350/Mostafa%20HaghirChehreghani)
- **University:** Amirkabir University of Technology
- **Semester:** Spring 2022

## License

Licensed under [MIT](LICENSE).