https://github.com/pngo1997/hyperlink-analysis-content-based-recommendation-system
Hyperlink Analysis using PageRank and HITS algorithms and simple Content-Based Recommendation System.
https://github.com/pngo1997/hyperlink-analysis-content-based-recommendation-system
content-based-recommendation hits-algorithm knn link-analysis pagerank python
Last synced: 14 days ago
JSON representation
Hyperlink Analysis using PageRank and HITS algorithms and simple Content-Based Recommendation System.
- Host: GitHub
- URL: https://github.com/pngo1997/hyperlink-analysis-content-based-recommendation-system
- Owner: pngo1997
- Created: 2025-01-31T19:40:47.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-31T19:45:38.000Z (over 1 year ago)
- Last Synced: 2025-08-02T06:36:53.998Z (10 months ago)
- Topics: content-based-recommendation, hits-algorithm, knn, link-analysis, pagerank, python
- Language: Jupyter Notebook
- Homepage:
- Size: 230 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# 🔗 Hyperlink Analysis & Content-Based Recommendation System
## 📜 Overview
This project is divided into two main components:
1. **Hyperlink Analysis** using **PageRank and HITS algorithms**.
2. **Content-Based Recommendation System** using **K-Nearest Neighbors (KNN)** for song recommendations based on Spotify metadata.
📌 **Datasets Used**:
- **Hyperlink Analysis**: Simulated Web graph structure.
- **Content-Based Recommendation**:
- `Spotify_Metadata.csv` – Song metadata.
- `Spotify_Numeric.csv` – Song features (numeric vectors).
- `Spotify_Users.txt` – User playlists.
📌 **Programming Language**: `Python 3`
📌 **Libraries Used**: `NumPy`, `pandas`, `networkx`, `scikit-learn`, `math`, `csv`
## 🔗 1️⃣ Hyperlink Analysis
### **1.1 PageRank Algorithm**
- **Input**: Web graph structure: A → {B, C} B → {C} C → {A}
- **Computation**:
- **Damping Factor (α) = 0.15**
- **Three Iterations**
- PageRank scores computed **before & after normalization** at each step.
### **1.2 HITS Algorithm (Hubs & Authorities)**
- **Input**: Web graph structure (including Page D).
- **Computation**
- **Three Iterations**
- Hub & Authority scores computed **before & after normalization** at each step.
📌 **Output**:
- PageRank & HITS scores printed at each iteration.
## 🎵 2️⃣ Content-Based Recommendation System
### **2.1 K-Nearest Neighbors (KNN) for Song Similarity**
- Function: `knn(item, Data, K)`
- **Input**:
- `item` – Feature vector of a song.
- `Data` – Song feature matrix (`Spotify_Numeric.csv`).
- `K` – Number of nearest neighbors.
- **Computation**:
- Compute **Cosine Similarity** between `item` and all songs in `Data`.
- Return **K most similar songs** with metadata (`Spotify_Metadata.csv`).
📌 **Demonstration**:
- Run `knn()` for the given song vector: item = [0.8, 0.1, 0.7, 0.0, 0.5, 0.1, 0.5, 0.2, 0.0, 1.0, 0.1, 0.2, 0.7]
- Print **top 10 similar songs**.
### **2.2 User-Based Song Recommendations**
- Function: `recommend(userID, Data, K)`
- **Input**:
- `userID` – User index (0-based).
- `Data` – Song feature matrix.
- `K` – Number of recommendations.
- **Computation**:
- Compute **user profile** as the **centroid vector** of liked songs.
- Call `knn()` to get **top K song recommendations**.
- **Exclude previously listened songs** from final recommendations.
📌 **Demonstration**:
- Run `recommend()` for:
- **User 3** (4th row in `Spotify_Users.txt`)
- **User 19** (20th row in `Spotify_Users.txt`)
- Print **top 20 recommendations** for each user.