Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/spycsh/movierecommendersystem
A movie recommender system
collaborative-filtering kafka-streams mongodb recommender-system redis spark tf-idf
- Host: GitHub
- URL: https://github.com/spycsh/movierecommendersystem
- Owner: Spycsh
- Created: 2020-12-11T14:08:40.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2021-11-10T11:16:51.000Z (over 3 years ago)
- Last Synced: 2024-12-05T03:25:48.720Z (2 months ago)
- Topics: collaborative-filtering, kafka-streams, mongodb, recommender-system, redis, spark, tf-idf
- Language: Scala
- Homepage:
- Size: 2.29 MB
- Stars: 1
- Watchers: 1
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# RecommenderSystem
## Modules introduction
This section briefly introduces each module.
### DataLoader
Dataset source: [MovieLens](https://grouplens.org/datasets/movielens/)
Preprocesses `movies.csv` and `ratings.csv` and stores the results in MongoDB.
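As a rough illustration of the preprocessing step, the sketch below parses one line of `ratings.csv` into a typed record. It assumes the standard MovieLens column order (`userId,movieId,rating,timestamp`); the `Rating` case class and `RatingParser` object are hypothetical names, not taken from this repository, and the MongoDB write is left out.

```scala
// Hypothetical record type; field names follow the MovieLens ratings.csv header.
final case class Rating(userId: Int, movieId: Int, rating: Double, timestamp: Long)

object RatingParser {
  /** Parses one line of ratings.csv; returns None for the header row or a malformed line. */
  def parse(line: String): Option[Rating] =
    line.split(",").map(_.trim) match {
      case Array(u, m, r, t) =>
        // Try turns a NumberFormatException (e.g. on the header row) into None.
        scala.util.Try(Rating(u.toInt, m.toInt, r.toDouble, t.toLong)).toOption
      case _ => None
    }
}
```

Returning `Option` lets a caller drop the header and any bad rows with a single `flatMap` before storing the records.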
### StatisticsRecommender
Recommends movies directly from rating statistics, using Spark Core and Spark SQL to find:
- Hottest movies (those with the most ratings)
- Recently hottest movies (grouped by month, then sorted by ratings in descending order)
- Top movies (those with the highest average rating)
- Top movies for each genre (via a cross table)
### OfflineRecommender
![offline](./images/offline_recommender_workflow.png)
Recommends movies based on collaborative filtering, using Spark Core and Spark MLlib's ALS implementation to:
- recommend a list of movies for each user from the users' latent features (ALS algorithm)
- recommend a list of similar movies for each movie from the similarity between movies (cosine similarity)
### StreamingRecommender
![realtime](./images/realtime_recommender_workflow.png)
Recommends in real time: each single rating event from a user is collected and sent to Kafka, then processed to compute a real-time recommendation list that updates MongoDB:
* get the user's latest K ratings from Redis
* from the similarity matrix, extract the N most similar movies as the candidate list
* for every candidate movie, calculate a score, then sort the candidates by score to form the current user's recommendation list
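The three steps above can be sketched in plain Scala with in-memory collections standing in for Redis, Kafka, and MongoDB. Everything here is illustrative: `StreamingSketch`, `SimMatrix`, `candidates`, and `recommend` are hypothetical names, and the scoring rule (a similarity-weighted average of the user's recent ratings) is an assumption, since this README does not state the formula the project uses. The `cosine` helper shows how one entry of the similarity matrix could be derived from two movie feature vectors.

```scala
object StreamingSketch {
  // Hypothetical layout: movieId -> (otherMovieId -> similarity).
  type SimMatrix = Map[Int, Map[Int, Double]]

  /** Cosine similarity between two equal-length feature vectors. */
  def cosine(a: Array[Double], b: Array[Double]): Double = {
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norm = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    if (norm == 0.0) 0.0 else dot / norm
  }

  /** The N movies most similar to the movie that was just rated. */
  def candidates(movie: Int, n: Int, sims: SimMatrix): Seq[Int] =
    sims.getOrElse(movie, Map.empty[Int, Double])
      .toSeq
      .sortBy { case (_, s) => -s }
      .take(n)
      .map(_._1)

  /**
   * Scores each candidate against the user's recent (movieId, rating) pairs,
   * which would come from Redis in the real system, and sorts by score.
   * Assumed formula: sum(sim * rating) / sum(sim).
   */
  def recommend(cands: Seq[Int], recent: Seq[(Int, Double)], sims: SimMatrix): Seq[(Int, Double)] =
    cands.flatMap { c =>
      // Keep only the recent ratings whose movie has a known similarity to c.
      val pairs = recent.flatMap { case (m, r) =>
        sims.get(c).flatMap(_.get(m)).map(s => (s, r))
      }
      val weight = pairs.map(_._1).sum
      if (weight > 0.0) Some((c, pairs.map { case (s, r) => s * r }.sum / weight))
      else None // no overlap with recent ratings, so no score
    }.sortBy { case (_, score) => -score }
}
```

A caller would pass the movie from the incoming rating event to `candidates`, then feed the result and the user's latest K ratings to `recommend`; the sorted output is what would be written back to MongoDB.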