
https://github.com/spycsh/movierecommendersystem

A movie recommender system




Host: GitHub
URL: https://github.com/spycsh/movierecommendersystem
Owner: Spycsh
Created: 2020-12-11T14:08:40.000Z (about 4 years ago)
Default Branch: main
Last Pushed: 2021-11-10T11:16:51.000Z (over 3 years ago)
Last Synced: 2024-12-05T03:25:48.720Z (2 months ago)
Topics: collaborative-filtering, kafka-streams, mongodb, recommender-system, redis, spark, tf-idf
Language: Scala
Homepage:
Size: 2.29 MB
Stars: 1
Watchers: 1
Forks: 2
Open Issues: 1
Metadata Files:
- Readme: README.md


# RecommenderSystem

## Module overview

This section briefly introduces each module.

### DataLoader

Dataset source: [MovieLens](https://grouplens.org/datasets/movielens/)

Preprocesses `movies.csv` and `ratings.csv` and stores the results in MongoDB.
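
As an illustration of the preprocessing step, here is a minimal sketch that parses one line of `ratings.csv`. It assumes the standard MovieLens column order (`userId,movieId,rating,timestamp`); the `Rating` case class and `RatingParser` object are hypothetical names, not taken from the repository, and writing to MongoDB is left out.

```scala
import scala.util.Try

object RatingParser {
  // Hypothetical record type; the field names are assumptions.
  final case class Rating(userId: Int, movieId: Int, score: Double, timestamp: Long)

  // Parses one data line of a MovieLens-style ratings.csv.
  // Returns None for the header row or any malformed line instead of throwing.
  def parse(line: String): Option[Rating] =
    line.split(",").map(_.trim) match {
      case Array(u, m, r, t) =>
        Try(Rating(u.toInt, m.toInt, r.toDouble, t.toLong)).toOption
      case _ => None
    }
}
```

Returning `Option` lets a loader skip the header and bad rows with a simple `flatMap` over the input lines.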

### StatisticsRecommender

Recommends movies directly from rating statistics. It uses Spark Core and Spark SQL to find:

- Hottest movies (most ratings)
- Recently hottest movies (grouped by month, then ordered by ratings, descending)
- Top movies (highest average rating)
- Top movies for each genre (cross table)
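
The first three statistics above can be sketched with plain Scala collections. This is only an illustration of the aggregations, not the project's Spark SQL code; the `Rating` type, the `yyyyMM` month key, and the assumption that timestamps are Unix seconds are all hypothetical choices made for this example. "Ordered by ratings" is read here as ordering by rating count.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

object StatisticsSketch {
  // Hypothetical record type; the field names are assumptions.
  final case class Rating(userId: Int, movieId: Int, score: Double, timestamp: Long)

  private val monthFormat =
    DateTimeFormatter.ofPattern("yyyyMM").withZone(ZoneOffset.UTC)

  // Formats a Unix timestamp in seconds as a yyyyMM month key (UTC).
  def monthKey(epochSeconds: Long): String =
    monthFormat.format(Instant.ofEpochSecond(epochSeconds))

  // Hottest movies: (movieId, number of ratings), most rated first.
  def hottest(ratings: Seq[Rating]): Seq[(Int, Int)] =
    ratings.groupBy(_.movieId)
      .map { case (id, rs) => (id, rs.size) }
      .toSeq
      .sortBy { case (id, n) => (-n, id) }

  // Recently hottest: (month, movieId, count), newest month first,
  // then most rated first within each month.
  def recentlyHottest(ratings: Seq[Rating]): Seq[(String, Int, Int)] =
    ratings.groupBy(r => (monthKey(r.timestamp), r.movieId))
      .map { case ((month, id), rs) => (month, id, rs.size) }
      .toSeq
      .sortBy { case (month, id, n) => (-month.toInt, -n, id) }

  // Top movies: (movieId, average score), highest average first.
  def topByAverage(ratings: Seq[Rating]): Seq[(Int, Double)] =
    ratings.groupBy(_.movieId)
      .map { case (id, rs) => (id, rs.map(_.score).sum / rs.size) }
      .toSeq
      .sortWith((a, b) => a._2 > b._2 || (a._2 == b._2 && a._1 < b._1))
}
```

Ties are broken by movie ID so that the output order is deterministic.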

### OfflineRecommender

![offline](./images/offline_recommender_workflow.png)

Recommends movies using collaborative filtering. It uses Spark Core and the ALS algorithm from Spark MLlib to implement the offline recommender:

- From the latent features of users, recommend a list of movies for each user (ALS algorithm)
- From the similarity between movies, recommend a list of similar movies for each movie (cosine similarity)
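
The movie-to-movie step can be illustrated with a small sketch of cosine similarity over feature vectors. It assumes each movie already has a latent feature vector (for example, one produced by ALS); the `CosineSketch` object and its function names are hypothetical, and plain Scala collections stand in for Spark here.

```scala
object CosineSketch {
  // Cosine similarity of two equal-length feature vectors.
  // Returns 0.0 if either vector has zero norm.
  def cosine(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norm = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    if (norm == 0.0) 0.0 else dot / norm
  }

  // For one movie, the topN other movies ranked by similarity, most similar first.
  def mostSimilar(
      movieId: Int,
      features: Map[Int, Array[Double]],
      topN: Int
  ): Seq[(Int, Double)] = {
    val target = features(movieId)
    features.toSeq
      .collect { case (id, vec) if id != movieId => (id, cosine(target, vec)) }
      .sortWith((x, y) => x._2 > y._2)
      .take(topN)
  }
}
```

Running `mostSimilar` for every movie yields a similarity matrix of the kind the streaming module reads.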

### StreamingRecommender

![realtime](./images/realtime_recommender_workflow.png)

Recommends in real time. Each single rating event from a user is collected and sent to Kafka, then processed to compute a real-time recommendation list, which is used to update MongoDB:

* Get the user's latest K ratings from Redis
* From the similarity matrix, extract the N most similar movies as the candidate list
* For every candidate movie, calculate a score, then sort the candidates to form the current user's recommendation list
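
The three steps above can be sketched in plain Scala as follows. The README does not state the scoring formula, so this example assumes a similarity-weighted average of the user's recent ratings; that formula, the `StreamingSketch` object, and the in-memory `Map` standing in for Redis and the stored similarity matrix are all assumptions for illustration.

```scala
object StreamingSketch {
  // recentRatings: the user's latest K (movieId, score) pairs, e.g. as read from Redis.
  // simMatrix: movieId -> (otherMovieId -> similarity), e.g. as computed offline.
  def recommend(
      ratedMovieId: Int,
      recentRatings: Seq[(Int, Double)],
      simMatrix: Map[Int, Map[Int, Double]],
      n: Int
  ): Seq[(Int, Double)] = {
    val recentlyRated = recentRatings.map(_._1).toSet + ratedMovieId

    // Candidate list: the N movies most similar to the one just rated,
    // skipping movies that appear in the recent ratings.
    val candidates = simMatrix.getOrElse(ratedMovieId, Map.empty[Int, Double]).toSeq
      .filterNot { case (id, _) => recentlyRated(id) }
      .sortWith((a, b) => a._2 > b._2)
      .take(n)
      .map(_._1)

    // Score: similarity-weighted average of the recent ratings (assumed formula).
    // Candidates with no similarity to any recent rating are dropped.
    val scored = candidates.flatMap { c =>
      val pairs = recentRatings.flatMap { case (m, score) =>
        simMatrix.get(c).flatMap(_.get(m)).map(sim => (sim, score))
      }
      val weight = pairs.map(_._1).sum
      if (weight > 0.0) Some((c, pairs.map { case (sim, s) => sim * s }.sum / weight))
      else None
    }

    // Highest score first: this is the user's updated recommendation list.
    scored.sortWith((a, b) => a._2 > b._2)
  }
}
```

A weighted average keeps each score within the rating scale, which makes the result easy to sanity-check; other formulas would fit the same three-step structure.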