https://github.com/desininja/vehicle-insurance

Data science project

Last synced: 9 months ago

Host: GitHub
URL: https://github.com/desininja/vehicle-insurance
Owner: desininja
Created: 2020-06-11T07:17:57.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2021-06-30T21:00:24.000Z (over 4 years ago)
Last Synced: 2025-01-02T21:17:03.365Z (11 months ago)
Language: Jupyter Notebook
Homepage:
Size: 670 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Vehicle-insurance

Vehicle Insurance data:
This dataset contains multiple features describing each customer's vehicle and insurance policy.

OBJECTIVE: The business requirement is to increase CLV (customer lifetime value), so CLV is the target variable.

Data Cleansing:

The dataset is already fairly clean, but it contains a few outliers, which are removed.

Why remove outliers?
Outliers are unusual values in a dataset; they can distort statistical analyses and violate their assumptions.
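
The README does not say which rule is used to detect outliers. The sketch below assumes the common 1.5 × IQR rule (Tukey's fences); the function name and the sample values are hypothetical and are not taken from the notebook.

```python
import statistics

def remove_outliers_iqr(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]

# Hypothetical CLV-like values; 50000 is far from the rest.
clv = [2200, 2500, 2700, 3100, 3300, 3600, 4000, 50000]
cleaned = remove_outliers_iqr(clv)
print(cleaned)
```

Here `k` controls how aggressive the filter is; a larger `k` keeps more extreme values.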

Feature selection:

This step is required to remove unwanted features.

VIF and Correlation Coefficient can be used to find important features.

VIF: Variance Inflation Factor
It is a measure of collinearity among the predictor variables in a multiple regression. For a given predictor, it is the ratio of the variance of its coefficient (beta) in the full model to the variance that coefficient would have if the predictor were fit alone.
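
In the usual formulation, the VIF of the j-th predictor is obtained by regressing that predictor on all the other predictors. The symbol $R_j^2$ below is the coefficient of determination of that auxiliary regression; this notation is introduced here and does not appear in the README.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
```

For example, $R_j^2 = 0.8$ gives $\mathrm{VIF}_j = 1 / (1 - 0.8) = 5$, while a predictor that is uncorrelated with the others ($R_j^2 = 0$) has a VIF of 1. A common rule of thumb, not stated in this README, treats values above roughly 5 to 10 as a sign of problematic collinearity.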

Correlation Coefficient:
A positive Pearson coefficient means that one variable increases as the other increases. A negative Pearson coefficient means that one variable decreases as the other increases. A correlation coefficient of -1 or +1 means the relationship is exactly linear.
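
As an illustration, the Pearson coefficient can be computed directly from its definition. This is a minimal sketch; the function name and the toy data are hypothetical and are not taken from the notebook.

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # grows exactly linearly with x
falling = [10, 8, 6, 4, 2]   # shrinks exactly linearly as x grows

r_up = pearson(x, rising)
r_down = pearson(x, falling)
print(r_up, r_down)
```

The two exactly linear series give coefficients at the extremes of the range, one positive and one negative.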

Log transformation and Normalisation:
Many ML algorithms perform better or converge faster when features are on a relatively similar scale and/or close to normally distributed.
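
The README does not specify which transformation or scaler is used. The sketch below assumes `log1p` to reduce right skew, followed by z-score standardisation; the function names and the income values are hypothetical.

```python
import math
import statistics

def log_transform(values):
    """Compress a right-skewed, non-negative feature with log(1 + x)."""
    return [math.log1p(v) for v in values]

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Hypothetical skewed feature: a few very large incomes.
income = [20000, 35000, 50000, 120000, 400000]
scaled = standardize(log_transform(income))
print(scaled)
```

Applying the log first pulls in the long right tail, so the standardised values are less dominated by the largest observations.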

Different ML algorithms are applied to the dataset for prediction; their accuracies are reported in the notebook.

Please take a look at my work. I am open to suggestions.
