https://github.com/mdtanvirhossaintusher/indian-movie-rating-prediction
The model is capable of predicting the ratings of movies
https://github.com/mdtanvirhossaintusher/indian-movie-rating-prediction
linear-regression machine-learning pandas python regression ridge-regression wrangler
Last synced: 7 months ago
JSON representation
The model is capable of predicting the ratings of movies
- Host: GitHub
- URL: https://github.com/mdtanvirhossaintusher/indian-movie-rating-prediction
- Owner: MdTanvirHossainTusher
- License: mit
- Created: 2024-03-21T21:18:05.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-03-24T14:52:26.000Z (over 1 year ago)
- Last Synced: 2024-03-24T15:49:40.415Z (over 1 year ago)
- Topics: linear-regression, machine-learning, pandas, python, regression, ridge-regression, wrangler
- Language: Jupyter Notebook
- Homepage:
- Size: 1.04 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Indian-Movie-Rating-Prediction
# Overview
A regression model that can predict Indian movie rating based on certain features.# Data Collection
Data has already available [here](https://www.kaggle.com/datasets/adrianmcmahon/imdb-india-movies). Initially dataset contains **15K+** observations and `10 columns`
# Data Preprocessing
Dataset has `NULL` values in some features. Also contains `outliers` and `categorical features` which has to be convert to numerical values.
# Model Training
Dataset is trained using `Linear Regression` and `Ridge Regression` models. As after data processing, it becomes a high dimentional, that's why to reduce the effect of the high dimensionality problem `Ridge Regression` is used.
# Result Analysis
As it is a regression problem, we have measure the `MAE - Mean Absolute Error` of the model. Our model should have improve the `Mean Absolute Error` from baseline MAE.
Model
Baseline MAE
Training MAE
Testing MAE
Logistic Regression
1.109130
0.946608
0.934801
Ridge Regression
1.109130
0.946615
0.934798
We can see, our `Baseline MAE` was `1.109130`. `Logistic Regression` reduced the `Baseline` model's MASE in the `Training MAE` which becomes `0.946608` which is further reduced to `0.934801` in `Testing MAE`. On the other hand, `Training` and `Testing` MAE becomes `0.946615` and `0.934798` in `Ridge Regression`.
From the result, we can say that, both the models performance was so close. But, models could not reduce the MAE much than baseline model.
# Feature Importance
From the image we can see that, `Year` put a crucial role to predict the rating of the Indian movie.
![]()