https://github.com/nikoshet/pyspark-movie-similarities

Using Spark In Python For Movie Similarities With Jaccard Index

jaccard-index movie-similarities pyspark spark

# Use Of PySpark For Movie Similarities With Jaccard Index

## Dataset
The dataset is the MovieLens 100K Dataset, available [here](https://grouplens.org/datasets/movielens/). It contains 100,000 ratings from roughly 1,000 users on roughly 1,700 movies and was released in April 1998. The files the app needs are uploaded to this repository under changed names.
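
For orientation, the original MovieLens 100K ratings file (`u.data`) stores one rating per line as tab-separated `user id`, `item id`, `rating`, `timestamp`. Below is a minimal parsing sketch, assuming the renamed file in this repository keeps that layout. The names `Rating` and `parse_rating` and the sample lines are illustrative and are not taken from the repository.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    user_id: int
    movie_id: int
    rating: int
    timestamp: int


def parse_rating(line: str) -> Rating:
    """Parse one tab-separated line in the original u.data layout (assumed)."""
    user_id, movie_id, rating, timestamp = line.rstrip("\n").split("\t")
    return Rating(int(user_id), int(movie_id), int(rating), int(timestamp))


# Made-up sample lines in the assumed layout, not real dataset rows.
sample = ["1\t50\t5\t881250949\n", "2\t50\t4\t888552084\n"]
ratings = [parse_rating(line) for line in sample]
```

In a Spark job, a function like this would typically be passed to a `map` over the lines of the ratings file.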

## Requirements
- PySpark

## Example Usage
To find movies similar to 'Star Wars (1977)', pass its movie ID (50) as the argument:
```
spark-submit movie-similarites.py 50
```
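
The Jaccard index of two sets A and B is |A ∩ B| / |A ∪ B|. One common way to apply it to movies is to compare the sets of users who rated each one. The sketch below does that in plain Python on made-up data. It illustrates the metric only and is not the repository's Spark implementation; the names `jaccard` and `most_similar` and the sample pairs are hypothetical.

```python
from collections import defaultdict


def jaccard(a: set, b: set) -> float:
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(pairs, target_movie, top_n=10):
    """Rank movies by Jaccard similarity of their rater sets to the target's."""
    raters = defaultdict(set)  # movie_id -> set of user_ids who rated it
    for user_id, movie_id in pairs:
        raters[movie_id].add(user_id)
    target = raters.get(target_movie, set())
    scores = [
        (movie_id, jaccard(target, users))
        for movie_id, users in raters.items()
        if movie_id != target_movie
    ]
    # Highest similarity first; ties broken by movie ID for a stable order.
    scores.sort(key=lambda s: (-s[1], s[0]))
    return scores[:top_n]


# Made-up (user_id, movie_id) pairs, not real dataset rows.
pairs = [(1, 50), (2, 50), (3, 50), (1, 172), (2, 172), (3, 181), (4, 181), (4, 1)]
result = most_similar(pairs, target_movie=50)
```

Here movie 172 shares two of the three users who rated movie 50, so it ranks first. A Spark version would build the same per-movie rater sets with distributed transformations rather than an in-memory dictionary.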