https://github.com/itsjafer/tv-show-recommendations
Machine learning pipeline trained offline that, given a TV Show, recommends 10 similar TV Shows using cosine similarities based on a variety of features
- Host: GitHub
- URL: https://github.com/itsjafer/tv-show-recommendations
- Owner: itsjafer
- Created: 2019-05-24T20:43:12.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2022-12-27T14:55:37.000Z (almost 3 years ago)
- Last Synced: 2025-04-09T08:02:47.850Z (8 months ago)
- Topics: data, engine, learning, machine, python, recommendation, science, tv-shows
- Language: Python
- Homepage: http://itsjafer.com/#/show-predictor
- Size: 227 MB
- Stars: 17
- Watchers: 1
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
README
# Recommendation Engine

Using data scraped from Metacritic and IMDB, this model takes a TV show as input and returns 10 other shows recommended on the basis of it. It is written in Python, mostly with selectolax, scikit-learn, fuzzywuzzy, nltk, and pandas.
## Model
The underlying model uses the following features (each weighted differently):
* Genres
* Plot
* Synopsis
* Cast
* Production company
* Keywords (describing the show)
* Number of seasons
* Episode runtime
* IMDB Rating
* Metacritic score
* Metacritic user score
Using these features, the model computes the cosine similarity between every pair of TV shows in the database. For a given show, the 30 most similar shows are then combined with a weighted average of their IMDB and Metacritic scores to produce an overall recommendation score.
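
The two steps described above can be illustrated with a minimal sketch. It is not the code from this repository: the function names (`cosine_similarity`, `quality_score`, `recommend`), the weights, the score scales (IMDB and user score out of 10, metascore out of 100), and the linear blend of similarity and quality are all assumptions made for illustration. It also assumes each show has already been turned into a numeric feature vector with the per-feature weights applied. In practice, scikit-learn's `sklearn.metrics.pairwise.cosine_similarity` computes the full pairwise matrix in one call.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def quality_score(imdb, metascore, user_score, weights=(0.5, 0.2, 0.3)):
    """Weighted average of the three ratings, each rescaled to [0, 1].

    The weights are hypothetical; the metascore is given the lowest weight
    only to mirror the note under Limitations.
    """
    w_imdb, w_meta, w_user = weights
    total = w_imdb + w_meta + w_user
    return (w_imdb * imdb / 10 + w_meta * metascore / 100 + w_user * user_score / 10) / total


def recommend(title, features, quality, top_k=30, n=10, sim_weight=0.7):
    """Return up to `n` titles recommended for `title`.

    features: dict mapping title -> feature vector (assumed already weighted)
    quality:  dict mapping title -> quality score in [0, 1]
    """
    query = features[title]
    # Step 1: similarity of the query show to every other show.
    sims = {
        other: cosine_similarity(query, vec)
        for other, vec in features.items()
        if other != title
    }
    # Step 2: keep only the top_k most similar shows as candidates.
    candidates = sorted(sims, key=sims.get, reverse=True)[:top_k]
    # Step 3: blend similarity with the quality score (assumed linear blend).
    scored = {
        c: sim_weight * sims[c] + (1 - sim_weight) * quality[c]
        for c in candidates
    }
    return sorted(scored, key=scored.get, reverse=True)[:n]


# Toy data with placeholder titles, not real shows or real scores.
features = {"A": [1, 0, 1], "B": [1, 0, 1], "C": [0, 1, 0], "D": [1, 1, 1]}
quality = {"A": 0.9, "B": 0.5, "C": 0.99, "D": 0.8}
print(recommend("A", features, quality, top_k=2, n=2))
```

Restricting the candidates to the most similar shows before blending in the scores keeps a highly rated but unrelated show (such as `"C"` in the toy data) from outranking a genuinely similar one.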
## Demo
A demo of the predictor is available on my [website](http://itsjafer.com/#/show-predictor).
## Limitations
Metacritic scores are based only on a show's first season, which is why metascores carry a lower weight. A future improvement is to scrape scores for the entire show, or to average the scores of all seasons.
## Setup
`metacritic_scraper.py` must be run before `imdb_scraper`. Once both scrapers have finished, training and prediction can be run.
You can either run the main file in a terminal or serve the model from a Flask web server using `flask_api`.