https://github.com/ali-jalil88/mlflow-bank-marketing
Bank-Marketing dataset using Mlflow
decisiontreeclassifier dessio gaussiannb kneighborsclassifier-model labelencoder logistic-regression mlflow pandas random-forest sklearn-model svc-model
- Host: GitHub
- URL: https://github.com/ali-jalil88/mlflow-bank-marketing
- Owner: Ali-jalil88
- Created: 2024-11-18T13:32:08.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-11-18T14:10:42.000Z (7 months ago)
- Last Synced: 2025-01-30T22:47:38.937Z (5 months ago)
- Topics: decisiontreeclassifier, dessio, gaussiannb, kneighborsclassifier-model, labelencoder, logistic-regression, mlflow, pandas, random-forest, sklearn-model, svc-model
- Language: Jupyter Notebook
- Homepage:
- Size: 542 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Bank Marketing Dataset - MLflow Project
# Table of Contents
1. Introduction
2. Dataset Description
3. Features
4. Setup and Installation
5. Project Workflow
6. MLflow Integration
7. Results and Evaluation
8. Contributing
9. License
# Introduction
The Bank Marketing Dataset MLflow Project is a machine learning project that predicts whether a client will subscribe to a term deposit (deposit is the target variable) from their demographic and interaction data. The repository uses MLflow for experiment tracking, reproducibility, and model deployment.
# Dataset Description
The dataset used in this project is the Bank Marketing Dataset on Kaggle.
- Source: UCI Machine Learning Repository
- Size: ~45,000 rows and 17 features
- Objective: Predict the outcome of the marketing campaign (deposit: yes/no).
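
As a rough illustration of the objective above, the sketch below loads a few rows and checks how the deposit target is distributed. The inline rows are made up and stand in for the real CSV, and the file name bank.csv mentioned in the comment is an assumption, not something this README specifies.

```python
import io

import pandas as pd

# Tiny inline stand-in for the real CSV; the values are made up for illustration.
SAMPLE_CSV = """age,job,balance,deposit
59,admin.,2343,yes
56,admin.,45,yes
41,technician,1270,no
55,services,2476,no
"""

# With the real data this would be something like pd.read_csv("bank.csv"),
# where "bank.csv" is an assumed local file name.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Encode the target as 1 (subscribed) / 0 (did not subscribe).
df["deposit_flag"] = df["deposit"].map({"yes": 1, "no": 0})

# Share of each class; useful for spotting class imbalance early.
class_share = df["deposit"].value_counts(normalize=True)
print(df.shape)
print(class_share)
```

Checking the class share before modeling helps decide whether accuracy alone is a meaningful metric or whether precision and recall deserve more weight.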
# Features
Key features include:
- Demographic Information: age, job, marital, education.
- Campaign Details: campaign, pdays, previous, poutcome.
- Financial Data: balance, loan, housing.
- Date Information: month, day.
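
The feature groups above mix numeric columns (such as age and balance) with categorical ones (such as job, marital, and housing). One possible way to prepare them for modeling, sketched here with scikit-learn's ColumnTransformer, is to scale the numeric columns and one-hot encode the categorical ones. The rows are made up, and the split into numeric and categorical columns is an assumption for illustration rather than the notebook's exact preprocessing.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up rows using a few of the column names listed above.
df = pd.DataFrame({
    "age": [59, 41, 33, 28],
    "balance": [2343, 1270, -50, 400],
    "job": ["admin.", "technician", "services", "admin."],
    "marital": ["married", "single", "married", "single"],
    "housing": ["yes", "no", "yes", "no"],
})

numeric_cols = ["age", "balance"]                 # assumed numeric features
categorical_cols = ["job", "marital", "housing"]  # assumed categorical features

# Scale numeric columns and one-hot encode categorical ones in a single step.
# sparse_threshold=0 forces a dense NumPy array as output.
preprocessor = ColumnTransformer(
    [
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,
)

X = preprocessor.fit_transform(df)
print(X.shape)  # 2 scaled columns + 3 job + 2 marital + 2 housing indicators
```

Setting handle_unknown="ignore" keeps the encoder from failing when a category appears at prediction time that was not seen during fitting.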
# Setup and Installation
Prerequisites
- Python 3.8 or later
- MLflow installed
- Libraries: pandas, scikit-learn, xgboost, seaborn, matplotlib
# Project Workflow
1. Exploratory Data Analysis (EDA):
   - Visualize distributions, correlations, and outliers.
   - Tools: seaborn, matplotlib.
2. Preprocessing:
   - Handle missing data, encode categorical features, and scale numerical ones.
   - Techniques: LabelEncoding, OneHotEncoding, StandardScaler.
3. Model Training:
   - Models used: Logistic Regression, Random Forest, XGBoost.
   - Feature selection and hyperparameter tuning.
4. Evaluation:
   - Metrics: Accuracy, Precision, Recall, ROC-AUC.
5. Experiment Tracking:
   - Log parameters, metrics, and artifacts in MLflow.
6. Model Deployment:
   - Save the best-performing model for deployment.
# MLflow Integration
What is MLflow?
MLflow is an open-source platform for managing the end-to-end machine learning lifecycle:
- Tracking: Log metrics, parameters, and models.
- Projects: Reproducible packaging of code.
- Models: Deployment and sharing of models.
- Registry: Centralized model store.
## Links:
- **[Project Notebook](https://www.kaggle.com/code/alialarkawazi/bn-marketing-ml)**
- **[Dataset](https://www.kaggle.com/datasets/janiobachmann/bank-marketing-dataset)**