https://github.com/arif-miad/obesity-classification-project
This project aims to predict obesity levels based on eating habits.
- Host: GitHub
- URL: https://github.com/arif-miad/obesity-classification-project
- Owner: Arif-miad
- License: mit
- Created: 2025-03-10T11:18:47.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-03-10T11:25:09.000Z (8 months ago)
- Last Synced: 2025-03-10T12:28:37.931Z (8 months ago)
- Topics: feature-engineering, keras-tensorflow, machine-learning, matplotlib-pyplot, pandas-dataframe, seaborn, sklearn
- Language: Jupyter Notebook
- Homepage:
- Size: 3.56 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Obesity Classification Project
This project predicts obesity levels from eating habits, family history, and physical condition using several machine learning classification models. The dataset contains 2,111 records from individuals in Mexico, Peru, and Colombia, each described by 16 lifestyle and health-related features. The project covers data preprocessing, visualization, model building, evaluation, and a comparison of the top classification algorithms.
## Dataset
The dataset consists of 17 columns (16 features plus the target variable):
- Gender
- Age
- Height
- Weight
- family_history_with_overweight
- FAVC
- FCVC
- NCP
- CAEC
- SMOKE
- CH2O
- SCC
- FAF
- TUE
- CALC
- MTRANS
- NObeyesdad (Target variable)
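
As a minimal sketch of how these columns might be separated into features and target, the snippet below checks a DataFrame against the list above and splits it. The helper name `split_features_target` and the two sample rows are hypothetical and made up for illustration; they are not taken from the notebook or the real data.

```python
import pandas as pd

# Column names exactly as listed in this README; the last one is the target.
COLUMNS = [
    "Gender", "Age", "Height", "Weight", "family_history_with_overweight",
    "FAVC", "FCVC", "NCP", "CAEC", "SMOKE", "CH2O", "SCC", "FAF", "TUE",
    "CALC", "MTRANS", "NObeyesdad",
]
TARGET = "NObeyesdad"


def split_features_target(df):
    """Hypothetical helper: validate the schema, then return (X, y)."""
    missing = [col for col in COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    X = df[[col for col in COLUMNS if col != TARGET]]
    y = df[TARGET]
    return X, y


# Two made-up rows, only to show the expected shape of the data.
sample = pd.DataFrame(
    [
        ["Female", 21.0, 1.62, 64.0, "yes", "no", 2.0, 3.0, "Sometimes",
         "no", 2.0, "no", 0.0, 1.0, "no", "Public_Transportation",
         "Normal_Weight"],
        ["Male", 27.0, 1.80, 87.0, "no", "no", 3.0, 3.0, "Sometimes",
         "no", 2.0, "no", 2.0, 0.0, "Frequently", "Walking",
         "Overweight_Level_I"],
    ],
    columns=COLUMNS,
)

X, y = split_features_target(sample)
print(X.shape)  # 16 feature columns remain after removing the target
```

With the real dataset, the same helper would be called on the DataFrame loaded in the notebook instead of `sample`.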
## Project Workflow
1. **Data Preprocessing:** Handling missing values, encoding categorical variables, and normalizing data.
2. **Exploratory Data Analysis (EDA):**
   - Univariate and Multivariate Analysis
   - Plots: Countplot, Histogram, Barplot, Boxplot, Violin plot, Line plot, Pie chart, KDE plot, etc.
3. **Feature Engineering:** Label encoding for categorical columns.
4. **Model Building:**
   - Top 10 classification models
   - Comparison of models with metrics like accuracy, F1-score, precision, and recall
5. **Model Evaluation:**
   - Classification Report
   - Confusion Matrix
   - ROC and AUC Curve
   - Feature Importance
6. **Model Comparison:** Visualizing and comparing the performance of different models.
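
The workflow above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's actual code: it uses a small synthetic table with a handful of the README's column names, a simplified three-class target derived from BMI, and three scikit-learn classifiers rather than the ten the project compares.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (assumption: values are random).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "Gender": rng.choice(["Female", "Male"], n),
    "Age": rng.uniform(16, 60, n),
    "Height": rng.uniform(1.5, 1.9, n),
    "Weight": rng.uniform(45, 130, n),
    "FAVC": rng.choice(["yes", "no"], n),
})
bmi = df["Weight"] / df["Height"] ** 2
# Simplified target with three classes; the real target has more levels.
df["NObeyesdad"] = pd.cut(
    bmi, bins=[0, 25, 30, np.inf],
    labels=["Normal_Weight", "Overweight", "Obesity"],
).astype(str)

# Feature engineering: label-encode each categorical column.
encoders = {}
for col in ["Gender", "FAVC", "NObeyesdad"]:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])

X = df.drop(columns="NObeyesdad")
y = df["NObeyesdad"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Normalize using statistics from the training split only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Model building and comparison on accuracy, precision, recall, F1.
models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
}
rows = []
for name, model in models.items():
    model.fit(X_train_s, y_train)
    pred = model.predict(X_test_s)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, pred, average="weighted", zero_division=0)
    rows.append({"model": name, "accuracy": accuracy_score(y_test, pred),
                 "precision": precision, "recall": recall, "f1": f1})
results = pd.DataFrame(rows).sort_values("f1", ascending=False)
results = results.reset_index(drop=True)
print(results)

# Model evaluation: confusion matrix and feature importance for one model.
forest = models["RandomForest"]
cm = confusion_matrix(y_test, forest.predict(X_test_s))
importances = pd.Series(forest.feature_importances_, index=X.columns)
print(cm)
print(importances.sort_values(ascending=False))
```

Fitting the scaler on the training split alone keeps test-set statistics out of training. Weighted averaging is one reasonable choice for multi-class precision, recall, and F1; the notebook may use a different averaging scheme.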
## Installation
Make sure you have the following libraries installed:
```
pip install pandas numpy matplotlib seaborn scikit-learn
```
## Usage
Simply run the notebook to see the complete analysis and model evaluation.
## Contact
- **Email:** arifmiahcse@gmail.com
- **Kaggle:** [Arif Mia's Kaggle Notebook](https://www.kaggle.com/code/miadul/obesity-level-prediction-exploring-lifestyle-fact)
- **LinkedIn:** [Arif Mia's LinkedIn Profile](https://www.linkedin.com/in/arif-miah-8751bb217)