https://github.com/mdtanvirhossaintusher/indian-movie-rating-prediction

The model is capable of predicting the ratings of movies
linear-regression machine-learning pandas python regression ridge-regression wrangler

Host: GitHub
URL: https://github.com/mdtanvirhossaintusher/indian-movie-rating-prediction
Owner: MdTanvirHossainTusher
License: mit
Created: 2024-03-21T21:18:05.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-03-24T14:52:26.000Z (over 1 year ago)
Last Synced: 2024-03-24T15:49:40.415Z (over 1 year ago)
Topics: linear-regression, machine-learning, pandas, python, regression, ridge-regression, wrangler
Language: Jupyter Notebook
Homepage:
Size: 1.04 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Indian-Movie-Rating-Prediction

# Overview
A regression model that predicts the rating of an Indian movie from a set of features.

# Data Collection

The data is already available [here](https://www.kaggle.com/datasets/adrianmcmahon/imdb-india-movies). Initially, the dataset contains **15K+** observations and `10 columns`.

# Data Preprocessing

The dataset has `NULL` values in some features. It also contains `outliers` and `categorical features`, which have to be converted to numerical values.
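A minimal sketch of these three steps with pandas, run on a tiny made-up frame. The column names (`Year`, `Duration`, `Genre`, `Rating`), the median fill, the 1.5 × IQR clipping rule, and one-hot encoding are all assumptions for illustration; the notebook may handle each step differently.

```python
import pandas as pd

# Toy data standing in for the real dataset; column names are hypothetical.
df = pd.DataFrame({
    "Year": [2001, 2005, 2010, 2015, 2019, 2020],
    "Duration": [120.0, None, 135.0, 150.0, 900.0, 110.0],  # 900 is an outlier
    "Genre": ["Drama", "Action", None, "Drama", "Comedy", "Action"],
    "Rating": [6.1, 5.4, 7.0, None, 6.5, 4.9],
})

# 1. NULL handling: drop rows without a target, fill the remaining gaps.
df = df.dropna(subset=["Rating"]).copy()
df["Duration"] = df["Duration"].fillna(df["Duration"].median())
df["Genre"] = df["Genre"].fillna("Unknown")

# 2. Outliers: clip Duration to the 1.5 * IQR fences.
q1, q3 = df["Duration"].quantile([0.25, 0.75])
iqr = q3 - q1
df["Duration"] = df["Duration"].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)

# 3. Categorical features: one-hot encode into numeric 0/1 columns.
df = pd.get_dummies(df, columns=["Genre"], dtype=int)

print(df.head())
```

One-hot encoding adds one column per category, which is one way a dataset with only a few original columns can become high dimensional after preprocessing.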

# Model Training

The models are trained on the dataset using `Linear Regression` and `Ridge Regression`. After preprocessing, the data becomes high dimensional, so `Ridge Regression` is used to reduce the effect of the high dimensionality.
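The comparison can be sketched as follows, assuming scikit-learn is the library in use (the README does not name it). The data here is synthetic, and the 80/20 split, the `random_state`, and `alpha=1.0` are assumed values rather than the notebook's settings.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded feature matrix (many 0/1 columns).
rng = np.random.default_rng(42)
X = rng.integers(0, 2, size=(500, 50)).astype(float)
true_coef = rng.normal(0.0, 0.3, size=50)
y = 6.0 + X @ true_coef + rng.normal(0.0, 1.0, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Ridge Regression": Ridge(alpha=1.0),  # alpha is an assumed value
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = {
        "train_mae": mean_absolute_error(y_train, model.predict(X_train)),
        "test_mae": mean_absolute_error(y_test, model.predict(X_test)),
    }
    print(
        f"{name}: train MAE={results[name]['train_mae']:.4f}, "
        f"test MAE={results[name]['test_mae']:.4f}"
    )
```

Ridge regression adds an L2 penalty on the coefficients to the least-squares loss, which shrinks them toward zero. With a small penalty and many more rows than columns, its predictions tend to stay close to those of plain linear regression.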

# Result Analysis
As it is a regression problem, we measure the `MAE - Mean Absolute Error` of the model. Our model should improve on the baseline MAE.
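MAE is the average of the absolute differences between actual and predicted values. The sketch below computes a baseline MAE under the assumption that the baseline always predicts the mean training rating; the README does not say how its baseline was built, and the ratings here are made up.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all observations."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Made-up ratings, for illustration only.
y_train = [5.0, 6.0, 7.0, 8.0, 4.0]

# Assumed baseline: ignore every feature and always predict the mean rating.
baseline_prediction = mean(y_train)
baseline_mae = mean_absolute_error(y_train, [baseline_prediction] * len(y_train))

print(baseline_prediction)  # 6.0
print(baseline_mae)         # 1.2
```

A trained model is only useful if its MAE is lower than this number, since the baseline uses no information about the movie at all.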

| Model | Baseline MAE | Training MAE | Testing MAE |
|---|---|---|---|
| Linear Regression | 1.109130 | 0.946608 | 0.934801 |
| Ridge Regression | 1.109130 | 0.946615 | 0.934798 |

The `Baseline MAE` was `1.109130`. `Linear Regression` reduced this to a `Training MAE` of `0.946608` and a `Testing MAE` of `0.934801`. `Ridge Regression` gave a `Training MAE` of `0.946615` and a `Testing MAE` of `0.934798`.

The results show that the two models performed almost identically. However, neither reduced the MAE much below the baseline: the testing MAE is only about 0.17 (roughly 16%) lower than the baseline MAE of `1.109130`.

# Feature Importance

From the image, we can see that `Year` plays a crucial role in predicting the rating of an Indian movie.

[Figure: Important Feature]
