https://github.com/aadrianleo/book-recommendation-system
Book Recommender System using the Book-Crossing dataset. Compares content-based (TF-IDF + cosine similarity) and collaborative filtering (SVD) methods for book recommendations. Includes data cleaning, EDA, and model evaluation (Precision@5, RMSE) in Python.
- Host: GitHub
- URL: https://github.com/aadrianleo/book-recommendation-system
- Owner: AadrianLeo
- Created: 2025-05-19T13:32:40.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-06-03T12:18:07.000Z (8 months ago)
- Last Synced: 2025-07-01T23:36:26.500Z (7 months ago)
- Topics: artificial-intelligence, collaborative-filtering, content-based-filtering, cosine-similarity, data-preprocessing, jupyter, jupyter-notebook, machine-learning, matrix-factorization, numpy, pandas, predictive-modeling, python3, recommendation-system, recommender-system, scikit-learn, sckit-surprise
- Language: Jupyter Notebook
- Homepage: https://github.com/AadrianLeo/Book-Recommendation-System
- Size: 59.7 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Book Recommender System: Content-Based vs. Collaborative Filtering
This project implements and compares two book recommendation approaches, Content-Based Filtering and Collaborative Filtering (SVD), using the Book-Crossing dataset.
---
## 📚 Project Overview
- **Goal:**
Design and evaluate a recommender system pipeline using real-world book data.
- **Dataset:**
[Book-Crossing Dataset (Kaggle)](https://www.kaggle.com/datasets/saurabhbagchi/books-dataset)
- **Team Members:**
Shadi Farzankia 107209
Shruti Pashine 106369
Dharampal Singh 106316
---
## 🚀 Workflow
1. **Data Loading & Preprocessing:**
- Load books, ratings, and users data.
- Clean and merge datasets, handle missing values and outliers.
2. **Exploratory Data Analysis (EDA):**
- Visualize distributions, check for anomalies, and understand feature relationships.
3. **Recommendation Approaches:**
- **Content-Based Filtering:** Uses book metadata (title, author, publisher) with TF-IDF and cosine similarity.
- **Collaborative Filtering (SVD):** Uses user-book ratings and matrix factorization (Surprise SVD).
4. **Evaluation:**
- Precision@5 (Hit Rate) for both methods.
- RMSE for SVD.
5. **Comparison & Discussion:**
- Compare strengths, weaknesses, and visualize results.
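The content-based step above can be sketched as follows. This is a minimal illustration, not the notebook's code: the column names `title`, `author`, and `publisher`, the helper names `build_similarity` and `recommend`, and the use of English stop words are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def build_similarity(books, text_cols=("title", "author", "publisher")):
    """Return a books x books cosine-similarity matrix built from TF-IDF vectors.

    The column names are assumptions; adjust them to the actual dataset.
    """
    # Join the metadata fields into one text string per book.
    text = books[list(text_cols)].fillna("").astype(str).agg(" ".join, axis=1)
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(text)
    return cosine_similarity(tfidf)


def recommend(books, sim, title, k=5):
    """Return the k titles most similar to `title`, excluding the book itself.

    Assumes the rows of `books` are in the same order as the rows of `sim`.
    """
    matches = np.flatnonzero(books["title"].to_numpy() == title)
    if matches.size == 0:
        raise KeyError(f"unknown title: {title}")
    pos = matches[0]
    order = np.argsort(-sim[pos])        # most similar first
    order = order[order != pos][:k]      # drop the query book
    return books["title"].iloc[order].tolist()


if __name__ == "__main__":
    # Toy catalogue, invented for illustration only.
    demo = pd.DataFrame({
        "title": ["Harry Potter and the Chamber of Secrets",
                  "Harry Potter and the Goblet of Fire",
                  "The Hobbit"],
        "author": ["J. K. Rowling", "J. K. Rowling", "J. R. R. Tolkien"],
        "publisher": ["Scholastic", "Scholastic", "Houghton Mifflin"],
    })
    sim = build_similarity(demo)
    print(recommend(demo, sim, "Harry Potter and the Chamber of Secrets", k=1))
```

The sketch builds the full dense similarity matrix, which is only practical for a small catalogue. For a large one, computing similarities for one query row at a time is likely the safer choice.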
---
## 🗂️ Data
Place the following files in a `data/` directory:
- `books.csv`
- `ratings.csv`
- `users.csv`
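A minimal loading sketch for these three files is shown below. The separator, the encoding, and the key columns `ISBN` and `User-ID` are assumptions about the CSV layout, not something this README confirms, so check the file headers before relying on it. The function name `load_and_merge` is hypothetical.

```python
from pathlib import Path

import pandas as pd


def load_and_merge(data_dir="data", sep=";", encoding="latin-1"):
    """Load books, ratings, and users, then merge them into one table.

    Assumed (verify against the real files): the separator, the encoding,
    and the join keys "ISBN" and "User-ID".
    """
    data_dir = Path(data_dir)
    books = pd.read_csv(data_dir / "books.csv", sep=sep, encoding=encoding)
    ratings = pd.read_csv(data_dir / "ratings.csv", sep=sep, encoding=encoding)
    users = pd.read_csv(data_dir / "users.csv", sep=sep, encoding=encoding)

    # Drop rows that lack the keys needed for the joins.
    ratings = ratings.dropna(subset=["User-ID", "ISBN"])
    books = books.dropna(subset=["ISBN"]).drop_duplicates(subset="ISBN")

    # Inner joins keep only ratings whose book and user are both known.
    merged = ratings.merge(books, on="ISBN", how="inner")
    merged = merged.merge(users, on="User-ID", how="inner")
    return merged
```

Calling `load_and_merge("data")` returns one row per rating with the matching book and user columns attached.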
---
## 🛠️ How to Run
1. Clone this repository.
2. Install dependencies:
```sh
pip install -r requirements.txt
```
3. Open the notebook (`Code/RecommenderSytems.ipynb`) in Jupyter or VS Code.
4. Run all cells in order.
---
## 📊 Key Findings
- **Content-Based Filtering:**
- Interpretable, works for new or unpopular books, and achieved the higher hit rate of the two methods.
- **SVD Collaborative Filtering:**
- More accurate in rating prediction (lower RMSE), more personalized, but needs enough user-book interactions.
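To make the rating-prediction side concrete, here is a small NumPy sketch of biased matrix factorization trained with stochastic gradient descent, the family of model that Surprise's `SVD` belongs to. It is not the notebook's code and not Surprise's implementation; the function names, hyperparameters, and integer user/item indices are assumptions chosen for illustration.

```python
import numpy as np


def fit_mf(ratings, n_users, n_items, n_factors=10, n_epochs=30,
           lr=0.01, reg=0.02, seed=0):
    """Fit biased matrix factorization by SGD.

    ratings: list of (user_index, item_index, rating) triples.
    Returns (global_mean, user_bias, item_bias, user_factors, item_factors).
    """
    rng = np.random.default_rng(seed)
    mu = float(np.mean([r for _, _, r in ratings]))
    bu = np.zeros(n_users)
    bi = np.zeros(n_items)
    P = rng.normal(0.0, 0.1, (n_users, n_factors))
    Q = rng.normal(0.0, 0.1, (n_items, n_factors))
    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - (mu + bu[u] + bi[i] + P[u] @ Q[i])
            # Gradient steps with L2 regularization on biases and factors.
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            pu = P[u].copy()  # keep the old user vector for the item update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
    return mu, bu, bi, P, Q


def predict(model, u, i):
    """Predicted rating: global mean + user bias + item bias + dot product."""
    mu, bu, bi, P, Q = model
    return mu + bu[u] + bi[i] + P[u] @ Q[i]


def rmse(model, ratings):
    """Root mean squared error of the model on (user, item, rating) triples."""
    errors = [(r - predict(model, u, i)) ** 2 for u, i, r in ratings]
    return float(np.sqrt(np.mean(errors)))
```

In practice RMSE should be computed on ratings held out from training. The sketch also shows why the approach needs enough interactions: a user or item with no ratings keeps its random initial factors and a zero bias.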
---
## ⚠️ Limitations & Future Work
- Data sparsity and cold-start issues for collaborative filtering.
- Evaluation for SVD is limited to a sample of users for computational reasons.
- Future work: hybrid models, more features, deep learning approaches.
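Evaluating on a sample of users, as described above, can be sketched like this. The README labels Precision@5 as a hit rate without stating which definition the notebook uses, so the sketch computes both. `recommend_fn`, `held_out`, and the default sample size are hypothetical names and values introduced for the example.

```python
import random


def precision_at_k(recommended, relevant, k=5):
    """Fraction of the top-k recommendations that are relevant."""
    top_k = list(recommended)[:k]
    return sum(item in relevant for item in top_k) / k


def hit_at_k(recommended, relevant, k=5):
    """1.0 if at least one top-k recommendation is relevant, else 0.0."""
    return float(any(item in relevant for item in list(recommended)[:k]))


def evaluate_on_sample(recommend_fn, held_out, k=5, sample_size=100, seed=42):
    """Average precision@k and hit rate over a random sample of users.

    recommend_fn: callable mapping a user id to a ranked list of item ids.
    held_out: dict mapping each user id to the set of items held out as relevant.
    """
    users = sorted(held_out)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = rng.sample(users, min(sample_size, len(users)))
    precisions, hits = [], []
    for user in sample:
        recs = recommend_fn(user)
        precisions.append(precision_at_k(recs, held_out[user], k))
        hits.append(hit_at_k(recs, held_out[user], k))
    return sum(precisions) / len(sample), sum(hits) / len(sample)
```

Fixing the seed keeps the sampled users identical across runs, so both recommenders can be compared on the same users.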
---
## 📎 Authors
[@Shruti Pashine](https://github.com/shrutipashine), [@Shadi Farzankia](https://github.com/ShadiFarzankia), [@Dharampal Singh](https://github.com/AadrianLeo)
---