An open API service indexing awesome lists of open source software.

https://github.com/desininja/vehicle-insurance

Data science project
https://github.com/desininja/vehicle-insurance

Last synced: 9 months ago
JSON representation

Data science project

Awesome Lists containing this project

README

          

# Vehicle-insurance

Vehicle Insurance data:
This dataset contains multiple features according to the customer’s vehicle and insurance type.

OBJECTIVE: Business requirement is to increase the clv (customer lifetime value) that means clv is the target variable.

Data Cleansing:

This dataset is pretty clean already, a few outliers are there. Remove the outliers.

Why remove Outliers?
Outliers are unusual values in dataset, and they can distort statistical analyses and violate their assumptions.

Feature selection:

This step is required to remove unwanted features.

VIF and Correlation Coefficient can be used to find important features.

VIF: Variance Inflation Factor
It is a measure of collinearity among predictor variables within a multiple regression. It is calculated by taking the the ratio of the variance of all a given model's betas divide by the variance of a single beta if it were fit alone.

Correlation Coefficient:
A positive Pearson coefficient mean that one variable's value increases with the others. And a negative Pearson coefficient means one variable decreases as other variable decreases. Correlations coefficients of -1 or +1 mean the relationship is exactly linear.

Log transformation and Normalisation:
Many ML algorithms perform better or converge faster when features are on a relatively similar scale and/or close to normally distributed.

Applying different ML Algorithms to the dataset for predictions. Their accuracies are in notebook.

Please see my work. And I am open to suggestion.