https://github.com/pramodyasahan/binary-classifier

This repository houses the code for a machine learning model designed to predict customer churn. The model is built using Support Vector Machine (SVM) from the scikit-learn library and incorporates preprocessing, pipeline, and grid search techniques for optimal performance.

Host: GitHub
URL: https://github.com/pramodyasahan/binary-classifier
Owner: pramodyasahan
Created: 2024-01-06T16:34:18.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-01-06T16:47:25.000Z (about 2 years ago)
Last Synced: 2025-06-02T19:39:47.346Z (8 months ago)
Topics: numpy, pandas, scikit-learn
Language: Jupyter Notebook
Homepage:
Size: 6.78 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


# Customer Churn Prediction Model

## Overview
This repository contains the code for a machine learning model that predicts customer churn. The model is a Support Vector Machine (SVM) classifier from scikit-learn, combined with preprocessing steps in a pipeline and tuned with grid search.

## Data
The model utilizes two datasets for training and testing:
- `train 2.csv`: Contains training data with customer features and churn status.
- `test 2.csv`: Contains testing data with customer features.

## Features
The datasets contain both numerical and categorical customer-related features. Both kinds are preprocessed before being passed to the model.

## Preprocessing
Key preprocessing steps include:
- Scaling numerical features using StandardScaler.
- Encoding categorical features using OneHotEncoder.
- Removing non-predictive features such as 'id', 'CustomerId', and 'Surname'.
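
The steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: selecting columns by dtype is an assumption, and the toy columns `Age`, `Balance`, and `Geography` are hypothetical stand-ins for the real features. Only the three dropped column names come from this README.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Identifier columns named in this README as non-predictive.
ID_COLUMNS = ["id", "CustomerId", "Surname"]


def build_preprocessor():
    """Scale numeric columns and one-hot encode all remaining columns."""
    return ColumnTransformer(
        transformers=[
            # Assumption: numeric columns are picked by dtype, not by name.
            ("num", StandardScaler(), make_column_selector(dtype_include="number")),
            # handle_unknown="ignore" avoids errors on categories unseen during fit.
            ("cat", OneHotEncoder(handle_unknown="ignore"),
             make_column_selector(dtype_exclude="number")),
        ]
    )


# Hypothetical toy data; the real column names may differ.
toy = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "CustomerId": [101, 102, 103, 104],
    "Surname": ["A", "B", "C", "D"],
    "Age": [25, 40, 33, 51],
    "Balance": [0.0, 1200.5, 560.0, 80.25],
    "Geography": ["France", "Spain", "Germany", "France"],
})

features = toy.drop(columns=ID_COLUMNS)
transformed = build_preprocessor().fit_transform(features)
print(transformed.shape)  # (4, 5): 2 scaled numeric columns + 3 one-hot columns
```

Dropping the identifier columns before fitting keeps `Surname` from being one-hot encoded into one column per customer.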

## Model Training
The SVM model is trained on preprocessed data. Key aspects of the model training include:
- Utilizing the SVC (Support Vector Classification) algorithm.
- Employing GridSearchCV for hyperparameter tuning.

## Hyperparameters
The following hyperparameters are considered in the grid search:
- 'C': [0.1, 1] (Regularization parameter)
- 'gamma': [1, 0.1] (Kernel coefficient)
- 'kernel': ['rbf', 'poly', 'sigmoid'] (Specifies the kernel type)
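
As a rough sketch of how this grid might be passed to GridSearchCV: the grid values below are the ones listed above, while the synthetic data, the step names `scale` and `svc`, `cv=3`, and accuracy scoring are all assumptions made for illustration. The grid has 2 × 2 × 3 = 12 combinations, so with 3 folds the search runs 36 fits plus one final refit on the full training data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Values taken from the list above. The "svc__" prefix routes each
# parameter to the pipeline step named "svc" (a hypothetical step name).
param_grid = {
    "svc__C": [0.1, 1],
    "svc__gamma": [1, 0.1],
    "svc__kernel": ["rbf", "poly", "sigmoid"],
}

# Synthetic stand-in for the churn data (assumption, not the real dataset).
X, y = make_classification(n_samples=120, n_features=6, random_state=0)

# Scaling inside the pipeline means the scaler is refit on each training fold.
pipeline = Pipeline([("scale", StandardScaler()), ("svc", SVC())])

# cv=3 and accuracy scoring are illustrative choices.
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X, y)

print(search.best_params_)
print(len(search.cv_results_["params"]))  # 12 candidate combinations
```

Keeping the scaler inside the pipeline, not applying it once up front, prevents the validation folds from influencing the scaling statistics.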

## Usage
To use the model:
1. Load the training and testing datasets.
2. Preprocess the datasets by scaling numerical features and encoding categorical features.
3. Train the SVC model using the training data.
4. Predict churn status for the test data.
5. Export the predictions to 'predictions.csv'.
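
The five steps could be wired together along these lines. This is a hedged sketch, not the notebook's code: the function name `run`, the target column name `Exited`, the dtype-based column selection, and the two-column layout of the output file are assumptions. Hyperparameter tuning is left out here to keep the example short.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

ID_COLUMNS = ["id", "CustomerId", "Surname"]  # dropped, per this README
TARGET = "Exited"  # hypothetical name of the churn-status column


def run(train_path, test_path, output_path="predictions.csv"):
    """Load both CSV files, fit an SVC pipeline, and export test predictions."""
    # Step 1: load the datasets.
    train = pd.read_csv(train_path)
    test = pd.read_csv(test_path)

    X_train = train.drop(columns=ID_COLUMNS + [TARGET])
    y_train = train[TARGET]
    X_test = test.drop(columns=ID_COLUMNS)

    # Step 2: scale numeric columns, one-hot encode the rest.
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), make_column_selector(dtype_include="number")),
        ("cat", OneHotEncoder(handle_unknown="ignore"),
         make_column_selector(dtype_exclude="number")),
    ])

    # Step 3: train the classifier (default SVC settings, for brevity).
    model = Pipeline([("preprocess", preprocess), ("svc", SVC())])
    model.fit(X_train, y_train)

    # Steps 4 and 5: predict and export. The id/target layout is an assumption.
    predictions = pd.DataFrame({"id": test["id"], TARGET: model.predict(X_test)})
    predictions.to_csv(output_path, index=False)
    return predictions
```

Under these assumptions, calling `run("train 2.csv", "test 2.csv")` would write `predictions.csv` to the working directory.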

## Dependencies
- numpy
- pandas
- scikit-learn

## Note
This code provides a basic structure for customer churn prediction and can be adapted and tuned further for specific use cases and datasets.
