https://github.com/rakshit-vasava/predictive-analytics-for-insurance-purchase

Predicting customer insurance purchases using stacking models and SMOTE for the Homesite Quote Conversion Problem on Kaggle.

Host: GitHub
URL: https://github.com/rakshit-vasava/predictive-analytics-for-insurance-purchase
Owner: rakshit-vasava
Created: 2024-10-25T00:31:49.000Z
Default Branch: main
Last Pushed: 2024-10-25T00:35:19.000Z
Last Synced: 2025-02-11T17:57:19.838Z
Topics: k-nearest-neighbours, kaggle-competition, multilayer-perceptron, python, random-forest, scikit-learn, smote, support-vector-machines
Language: Jupyter Notebook
Homepage:
Size: 2.74 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# 🌐 Homesite Quote Conversion Problem

This project addresses the **Homesite Quote Conversion Problem** on Kaggle, where the objective is to predict whether a customer will buy an insurance policy from the data provided. The solution combines **SMOTE**, **stacking models**, and **hyperparameter tuning** to improve predictions.

## 🚀 Project Overview

The project uses machine learning to predict whether a customer will purchase insurance. A **stacking ensemble** combines the strengths of several algorithms, and **SMOTE** addresses the class imbalance in the data.
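
To make the SMOTE idea concrete, here is a minimal from-scratch sketch, not the project's actual code. It creates synthetic minority rows by interpolating between a minority point and one of its nearest minority neighbours until the minority class reaches a target ratio of the majority class (0.5 is the ratio this project reports using). The function name `smote_oversample`, the parameter `k`, and the toy data are all hypothetical.

```python
import math
import random


def smote_oversample(minority, n_majority, ratio=0.5, k=3, seed=0):
    """Return synthetic minority points so that the minority class
    (original + synthetic) has about ratio * n_majority rows."""
    rng = random.Random(seed)
    target = int(ratio * n_majority)
    n_new = max(0, target - len(minority))
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (hypothetical): 10 minority rows in 2-D, 100 majority rows.
data_rng = random.Random(42)
minority = [(data_rng.random(), data_rng.random()) for _ in range(10)]
n_majority = 100

synthetic = smote_oversample(minority, n_majority, ratio=0.5)
print(len(synthetic))  # 40 new rows, giving 50 minority rows in total
```

In practice, a library implementation such as `SMOTE(sampling_strategy=0.5)` from imbalanced-learn is the usual choice.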

## 🔨 Key Steps

1. **Experimentation with SMOTE**:  

   Synthetic Minority Over-sampling Technique (SMOTE) was used to handle the class imbalance in the dataset. The sampling ratio was set to 0.5, so that after resampling the minority class has half as many samples as the majority class.

2. **Stacking Model Implementation**:  

   A one-layer stacking model was created using five base models:

   - **Decision Tree**

   - **Random Forest**

   - **Support Vector Machines (SVM)**

   - **Multilayer Perceptron (MLP)**

   - **K-Nearest Neighbors (KNN)**

   Stacking creates two datasets:

   - `S_Train`: Stacked predictions from base models for training data.

   - `S_Test`: Stacked predictions for test data.

   A **Random Forest classifier** was used as the final meta-model to make predictions.

3. **Hyperparameter Tuning**:  

   Tuning was performed on the stacked model, particularly the Random Forest meta-classifier, which raised the private Kaggle score from 0.80029 to 0.8241.

4. **Kaggle Submission**:  

   Two submissions were made with the following results:

   - **Without tuning**: Private score of 0.80029, Public score of 0.80351.

   - **With tuning**: Private score of 0.8241, Public score of 0.82155.
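
The stacking step above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the notebook's code: the data is synthetic (`make_classification`), the base models use mostly default settings, and `S_train` and `S_test` are built from out-of-fold predictions via `cross_val_predict`, which is one common way to construct them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed Homesite data (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The five base model families named in the README.
base_models = [
    DecisionTreeClassifier(random_state=0),
    RandomForestClassifier(n_estimators=50, random_state=0),
    SVC(),
    MLPClassifier(max_iter=500, random_state=0),
    KNeighborsClassifier(),
]

train_cols, test_cols = [], []
for model in base_models:
    # Out-of-fold predictions: each training row is predicted by a
    # clone of the model that was not fitted on that row.
    train_cols.append(cross_val_predict(model, X_train, y_train, cv=4))
    # Refit on the full training set to predict the test rows.
    test_cols.append(model.fit(X_train, y_train).predict(X_test))

# One column per base model.
S_train = np.column_stack(train_cols)
S_test = np.column_stack(test_cols)

# Random Forest as the final meta-model, trained on stacked predictions.
meta_model = RandomForestClassifier(n_estimators=100, random_state=0)
meta_model.fit(S_train, y_train)
final_pred = meta_model.predict(S_test)
print(S_train.shape, S_test.shape)
```

Using out-of-fold predictions for `S_train` keeps the meta-model from learning on predictions that the base models made for rows they were trained on. scikit-learn's `StackingClassifier` wraps a similar procedure in a single estimator.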

## 🛠️ Technologies Used

- **Python**

- **Scikit-learn**

- **SMOTE (Synthetic Minority Over-sampling Technique)**

- **Random Forest**

- **K-Nearest Neighbors**

- **Support Vector Machines (SVM)**

- **Multilayer Perceptron**

- **Gradient Boosting**

- **Kaggle Submission Tools**

## 📝 To-Do

- [x] Experiment with SMOTE and tune the minority class ratio.

- [x] Implement ensemble models and perform stacking.

- [x] Hyperparameter tuning for stacked models.

- [x] Submit to Kaggle and report results.

## 📊 Kaggle Results

| Submission Type           | Private Score | Public Score |
|---------------------------|---------------|--------------|
| Stacked Model (no tuning) | 0.80029       | 0.80351      |
| Stacked Model (tuned)     | 0.8241        | 0.82155      |
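
The tuned row reflects hyperparameter tuning of the Random Forest meta-classifier. A minimal sketch of what such tuning could look like is shown below. The parameter grid, the `roc_auc` scoring choice, the fold count, and the synthetic stand-in for `S_train` are all assumptions for illustration, not the values used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Stand-in for S_train (one column per base model) and its labels.
S_train, y_train = make_classification(
    n_samples=200, n_features=5, n_informative=3, random_state=0
)

# Hypothetical grid; the real search space is not given in the README.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",  # assumed metric
    cv=3,
)
search.fit(S_train, y_train)

# Best configuration, refitted on all of S_train.
tuned_meta_model = search.best_estimator_
print(search.best_params_)
```

`tuned_meta_model` can then replace the untuned meta-model when predicting on `S_Test`.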
