https://github.com/sadegh15khedry/movie-recommendation-system-using-collaborative-filtering
building a movie recommendation system using collaborative filtering techniques.
- Host: GitHub
- URL: https://github.com/sadegh15khedry/movie-recommendation-system-using-collaborative-filtering
- Owner: sadegh15khedry
- License: apache-2.0
- Created: 2024-06-13T11:39:05.000Z (5 months ago)
- Default Branch: master
- Last Pushed: 2024-07-12T07:38:38.000Z (4 months ago)
- Last Synced: 2024-10-15T17:29:33.176Z (about 1 month ago)
- Topics: collaborative-filtering, jupyter-notebook, matplotlib, pandas, python, recommendation-system, scipy, seaborn, sklearn
- Language: Jupyter Notebook
- Homepage:
- Size: 6.21 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Movie Recommendation System using Collaborative Filtering
This repository contains Python code for building a movie recommendation system using collaborative filtering techniques. Below is a breakdown of the files and functionalities included:
1. [Installation](#installation)
2. [Code Description](#code-description)
3. [Author](#author)
4. [License](#license)
## Installation
1. Clone this repository:
```bash
git clone https://github.com/sadegh15khedry/movie-recommendation-system-using-collaborative-filtering.git
cd movie-recommendation-system-using-collaborative-filtering
```
2. Create the conda environment, with the required libraries, from the `environment.yml` file:
```bash
conda env create -f environment.yml
```
3. Download the MovieLens datasets (`movies.csv`, `tags.csv`, `ratings.csv`) and update the paths to them in the code.
4. Run the `recommendation_system.ipynb` notebook to generate movie recommendations.
## Code Description
### 1. Data Loading and Preprocessing
- Load the datasets (`movies.csv`, `tags.csv`, `ratings.csv`) using pandas.
- Select the relevant columns from each dataset into working DataFrames (`tag_df`, `rating_df`, `movie_df`) for further analysis.
- Perform exploratory data analysis (EDA) to check data shapes, missing values, duplicates, and basic statistics.
### 2. Data Aggregation
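
The aggregation steps that follow start from the loaded DataFrames, so a minimal sketch of the loading and EDA step may help here. It uses tiny in-memory CSV text instead of the real MovieLens files, so the sample rows, the choice of kept columns, and the `eda_summary` name are assumptions for illustration. `tags.csv` would be loaded the same way.

```python
import io

import pandas as pd

# Tiny in-memory stand-ins for the MovieLens files (made-up rows).
# In the notebook these would be pd.read_csv(<path to ratings.csv>) and so on.
ratings_csv = io.StringIO(
    "userId,movieId,rating,timestamp\n"
    "1,10,4.0,964982703\n"
    "1,20,3.5,964981247\n"
    "2,10,5.0,964982224\n"
    "2,10,5.0,964982224\n"  # deliberate duplicate row
)
movies_csv = io.StringIO(
    "movieId,title,genres\n"
    "10,Toy Story (1995),Animation\n"
    "20,Heat (1995),Action\n"
)

# Keep only the columns the later steps need (an assumed selection).
rating_df = pd.read_csv(ratings_csv)[["userId", "movieId", "rating"]]
movie_df = pd.read_csv(movies_csv)[["movieId", "title"]]

# Basic EDA checks: shape, missing values, duplicate rows.
eda_summary = {
    "shape": rating_df.shape,
    "missing": int(rating_df.isna().sum().sum()),
    "duplicates": int(rating_df.duplicated().sum()),
}
print(eda_summary)

# Basic statistics of the rating column.
print(rating_df["rating"].describe())
```
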
- Merge `rating_df` and `movie_df` on `movieId` to create a combined DataFrame (`df`).
- Aggregate ratings to find movies with more than 100 ratings (`agg_df`).
- Merge `df` with `agg_df` to filter out less popular movies (`df_gt100`).
### 3. User-Movie Matrix
- Create a user-movie matrix (`user_movie_matrix`) using `pivot_table`, where rows represent users, columns represent movies, and values represent ratings.
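
The merge, aggregation, and pivot steps described above can be sketched as follows. This is an illustration on toy data rather than the notebook's exact code: the sample ratings, the `MIN_RATINGS` constant (lowered from 100 so the toy data survives the filter), and the aggregate column names are assumptions.

```python
import pandas as pd

# Toy ratings and movie titles (made-up values).
rating_df = pd.DataFrame({
    "userId":  [1, 1, 2, 2, 3, 3],
    "movieId": [10, 20, 10, 30, 10, 20],
    "rating":  [4.0, 3.0, 5.0, 2.0, 3.0, 4.0],
})
movie_df = pd.DataFrame({
    "movieId": [10, 20, 30],
    "title": ["Movie A", "Movie B", "Movie C"],
})

# The README keeps movies with more than 100 ratings; 1 is used here for toy data.
MIN_RATINGS = 1

# Combine ratings with titles.
df = pd.merge(rating_df, movie_df, on="movieId", how="inner")

# Per-movie mean rating and rating count, then keep the popular movies only.
agg_df = (
    df.groupby("title")
    .agg(mean_rating=("rating", "mean"), number_of_ratings=("rating", "count"))
    .reset_index()
)
agg_df = agg_df[agg_df["number_of_ratings"] > MIN_RATINGS]

# Inner merge drops the ratings of movies that did not pass the filter.
df_gt100 = pd.merge(df, agg_df[["title"]], on="title", how="inner")

# Rows are users, columns are movies, values are ratings (NaN where unrated).
user_movie_matrix = df_gt100.pivot_table(index="userId", columns="title", values="rating")
print(user_movie_matrix)
```

Movies a user has not rated show up as `NaN` in the matrix, which the normalization step has to account for.
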
### 4. Normalization and Similarity Calculation
- Normalize `user_movie_matrix` (`matrix_norm`) by subtracting the mean rating of each user.
- Calculate cosine similarities (`user_similarity` and `movie_similarity`) based on `matrix_norm` to find similar users and movies.
### 5. Recommendation Process
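
The recommendation steps rely on `matrix_norm` and the similarity matrices, so here is a minimal sketch of how they might be computed. The repository lists sklearn among its topics, so the notebook may well use `sklearn.metrics.pairwise.cosine_similarity`. This sketch computes the same quantity with NumPy instead. The toy matrix, the `cosine_sim` helper, and filling missing ratings with 0 after centering are assumptions.

```python
import numpy as np
import pandas as pd

# Toy user-movie matrix (made-up ratings, NaN = not rated).
user_movie_matrix = pd.DataFrame(
    {
        "Movie A": [4.0, 5.0, 3.0],
        "Movie B": [2.0, np.nan, 5.0],
        "Movie C": [np.nan, 1.0, 4.0],
    },
    index=pd.Index([1, 2, 3], name="userId"),
)

# Subtract each user's mean rating so scores are relative to that user's baseline.
matrix_norm = user_movie_matrix.sub(user_movie_matrix.mean(axis=1), axis=0)

# Cosine similarity cannot handle NaN; treating "not rated" as 0 (the user's
# average after centering) is an assumption of this sketch.
filled = matrix_norm.fillna(0)


def cosine_sim(frame: pd.DataFrame) -> pd.DataFrame:
    """Pairwise cosine similarity between the rows of `frame` (hypothetical helper)."""
    values = frame.to_numpy(dtype=float)
    norms = np.linalg.norm(values, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # all-zero rows get similarity 0 instead of NaN
    unit = values / norms
    return pd.DataFrame(unit @ unit.T, index=frame.index, columns=frame.index)


user_similarity = cosine_sim(filled)     # users x users
movie_similarity = cosine_sim(filled.T)  # movies x movies
print(user_similarity.round(2))
```
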
- Select a user (`picked_userId`) and set up variables (`number_of_simlar_users`, `user_similarity_threshold`).
- Find similar users (`similar_users`) based on a similarity threshold.
- Identify movies watched by the selected user (`picked_user_watched`) and similar users (`similar_users_movies`).
- Calculate item scores (`item_score`) based on weighted sums of ratings from similar users.
- Sort and print top recommended movies (`ranked_item_score`) based on their scores.
## Results and Outputs
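
The steps of the recommendation process can be sketched as follows. The variable names follow the README (including the spelling of `number_of_simlar_users`). The toy matrices and parameter values are made up. Dividing the weighted sum by the total weight is an assumption of this sketch, since the README only mentions weighted sums.

```python
import numpy as np
import pandas as pd

users = pd.Index([1, 2, 3, 4], name="userId")

# Toy mean-centered ratings (NaN = not rated) and user-user similarities.
matrix_norm = pd.DataFrame(
    {
        "Movie A": [1.0, 0.5, 1.0, -1.0],
        "Movie B": [-1.0, np.nan, -0.5, 1.0],
        "Movie C": [np.nan, 1.5, 0.5, np.nan],
        "Movie D": [np.nan, -2.0, 1.0, 0.0],
    },
    index=users,
)
user_similarity = pd.DataFrame(
    [
        [1.0, 0.6, 0.9, -0.8],
        [0.6, 1.0, 0.4, -0.2],
        [0.9, 0.4, 1.0, -0.5],
        [-0.8, -0.2, -0.5, 1.0],
    ],
    index=users,
    columns=users,
)

picked_userId = 1
number_of_simlar_users = 2
user_similarity_threshold = 0.3

# Most similar users above the threshold, excluding the picked user.
similar_users = user_similarity.loc[picked_userId].drop(picked_userId)
similar_users = similar_users[similar_users > user_similarity_threshold]
similar_users = similar_users.sort_values(ascending=False).head(number_of_simlar_users)

# Movies rated by the picked user and by the similar users.
picked_user_watched = matrix_norm.loc[[picked_userId]].dropna(axis=1, how="all")
similar_users_movies = matrix_norm.loc[similar_users.index].dropna(axis=1, how="all")

# Only score movies the picked user has not rated yet.
candidates = similar_users_movies.drop(columns=picked_user_watched.columns, errors="ignore")

# Similarity-weighted average of the similar users' centered ratings.
item_score = {}
for title in candidates.columns:
    ratings = candidates[title].dropna()
    weights = similar_users.loc[ratings.index]
    item_score[title] = float((ratings * weights).sum() / weights.sum())

ranked_item_score = pd.Series(item_score).sort_values(ascending=False)
print(ranked_item_score.head(10))
```

With this toy data, users 3 and 2 pass the threshold, and only Movie C and Movie D are scored because user 1 has already rated Movie A and Movie B.
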
- The notebook outputs the top recommended movies for a selected user (`picked_userId`) based on collaborative filtering.
- Evaluation metrics (e.g., precision, recall) and visualizations (e.g., heatmap of similarity matrices) can be added for performance analysis.
## Further Improvements
- Implement evaluation metrics to quantify the performance of the recommendation system.
- Optimize code efficiency for larger datasets and real-time recommendations.
- Incorporate content-based filtering or hybrid approaches for improved recommendation accuracy.
## Author
- Sadegh Khedry
## License
This project is licensed under the Apache-2.0 License. See the `LICENSE` file for details.