Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/arif-miad/obesity-level-classification-using-machine-learning
classification exploratory-data-analysis feature-engineering machine-learning matplotlib numpy pandas python sklearn
- Host: GitHub
- URL: https://github.com/arif-miad/obesity-level-classification-using-machine-learning
- Owner: Arif-miad
- License: apache-2.0
- Created: 2025-02-14T17:23:43.000Z (7 days ago)
- Default Branch: main
- Last Pushed: 2025-02-14T17:34:29.000Z (7 days ago)
- Last Synced: 2025-02-14T18:28:56.363Z (7 days ago)
- Topics: classification, exploratory-data-analysis, feature-engineering, machine-learning, matplotlib, numpy, pandas, python, sklearn
- Homepage:
- Size: 2.18 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Obesity Level Classification using Machine Learning
## Overview
This project predicts obesity levels for individuals from **Mexico, Peru, and Colombia** based on their eating habits and physical condition. The dataset consists of **2111 records** with **17 attributes**, covering height and weight (from which BMI is derived), family history, eating habits, physical activity, and transportation mode.
## Dataset Features
- **Gender:** Male/Female
- **Age, Height, Weight:** Physical attributes
- **Eating Habits:** Frequency of high-caloric food, vegetable intake, water consumption, snacking behavior
- **Lifestyle Factors:** Smoking, alcohol consumption, exercise frequency, screen time, transportation mode
- **Target Variable:** **Obesity Level** (7 categories: Insufficient Weight, Normal Weight, Overweight I & II, Obesity I, II & III)

---
## Project Workflow
### **1️⃣ Exploratory Data Analysis (EDA)**
- **Univariate Analysis:** Distribution of numerical & categorical features
- **Bivariate Analysis:** Relationship between variables using scatterplots, box plots, and bar plots
- **Correlation Analysis:** Heatmaps to identify feature relationships
- **Outlier Detection:** Identifying extreme values in numerical features
### **2️⃣ Data Preprocessing & Feature Engineering**
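The outlier check from the EDA step is usually run just before preprocessing. The README does not say which rule was used; as one common choice, here is a minimal sketch of the 1.5 x IQR rule. The helper `iqr_outliers`, the toy rows, and the column names are hypothetical and stand in for the real dataset.

```python
import pandas as pd


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean mask for values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Hypothetical toy rows; the real dataset has 2111 records.
df = pd.DataFrame({
    "Age": [21, 23, 22, 24, 25, 61],
    "Weight": [64.0, 70.5, 58.0, 80.0, 75.0, 173.0],
})

# Count flagged values per numerical column.
outlier_counts = {col: int(iqr_outliers(df[col]).sum()) for col in df.columns}
print(outlier_counts)
```

Flagged rows are only candidates for review; whether to drop, cap, or keep them depends on the feature.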
- Encoding categorical variables using **Label Encoding**
- Feature creation: **Body Mass Index (BMI)**
- Train-test split (80-20 ratio)
- Scaling numerical features using **StandardScaler**
### **3️⃣ Machine Learning Model Training (10 Models)**
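Every model in this section consumes the output of the preprocessing steps listed above. As a minimal sketch of those steps (BMI feature, label encoding, 80-20 split, StandardScaler): the toy rows and the column names `Gender`, `Height`, `Weight`, and `ObesityLevel` are assumptions for illustration, not the dataset's actual headers.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical toy frame standing in for the real 2111-row dataset.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male", "Female",
               "Male", "Male", "Female", "Male", "Female"],
    "Height": [1.80, 1.62, 1.70, 1.75, 1.58, 1.85, 1.68, 1.60, 1.78, 1.65],  # metres
    "Weight": [95.0, 50.0, 68.0, 110.0, 45.0, 82.0, 120.0, 60.0, 77.0, 98.0],  # kilograms
    "ObesityLevel": ["Obesity", "Insufficient", "Normal", "Obesity", "Insufficient",
                     "Normal", "Obesity", "Normal", "Normal", "Obesity"],
})

# Feature creation: BMI = weight (kg) / height (m) squared.
df["BMI"] = df["Weight"] / df["Height"] ** 2

# Label-encode categorical columns, keeping each encoder to map codes back later.
encoders = {}
for col in ["Gender", "ObesityLevel"]:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])

X = df.drop(columns="ObesityLevel")
y = df["ObesityLevel"]

# 80-20 train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training rows only, then apply it to both splits.
num_cols = ["Height", "Weight", "BMI"]
scaler = StandardScaler()
X_train = X_train.copy()
X_test = X_test.copy()
X_train[num_cols] = scaler.fit_transform(X_train[num_cols])
X_test[num_cols] = scaler.transform(X_test[num_cols])
```

Fitting the scaler on the training split alone avoids leaking test-set statistics into training. On the full dataset, passing `stratify=y` to `train_test_split` would keep the seven classes balanced across splits; it is omitted here because the toy frame is too small.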
We implemented and compared **10 classification models:**
- Logistic Regression
- Random Forest
- Gradient Boosting
- Support Vector Machine (SVM)
- K-Nearest Neighbors (KNN)
- Decision Tree
- Naïve Bayes
- XGBoost
- LightGBM
- CatBoost
### **4️⃣ Model Evaluation & Performance Metrics**
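As a minimal sketch of a per-model evaluation, the block below fits one tree-based model and produces a confusion matrix, a classification report, and ranked feature importances. It uses synthetic three-class data from `make_classification` rather than the obesity dataset, and the `feature_*` names are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 3 classes instead of the real 7.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, n_classes=3, random_state=1
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

model = RandomForestClassifier(n_estimators=100, random_state=1)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, pred)
print(cm)

# Per-class precision, recall, and F1-score.
report = classification_report(y_test, pred)
print(report)

# Impurity-based importances, sorted from most to least important.
importances = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in importances:
    print(f"{name}: {score:.3f}")
```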
Each model was evaluated using:
- **Accuracy, F1 Score, ROC-AUC Score**
- **Confusion Matrix for Misclassification Analysis**
- **Classification Report for Precision, Recall, F1-Score**
- **Feature Importance Analysis** for tree-based models
### **5️⃣ Model Comparison & Optimization**
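One way to compare models is to fit each on the same split and collect the metrics into a single table. The sketch below uses synthetic data from `make_classification` in place of the obesity dataset and only four of the ten listed model families (the ones that ship with scikit-learn). The README does not state how F1 and ROC-AUC were averaged, so weighted F1 and one-vs-rest ROC-AUC are assumptions here.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 3 classes instead of the real 7.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=6, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
}

rows = []
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)  # class probabilities for ROC-AUC
    rows.append({
        "model": name,
        "accuracy": accuracy_score(y_test, pred),
        "f1_weighted": f1_score(y_test, pred, average="weighted"),
        "roc_auc_ovr": roc_auc_score(y_test, proba, multi_class="ovr"),
    })

# One row per model, best accuracy first.
results = (
    pd.DataFrame(rows)
    .sort_values("accuracy", ascending=False)
    .reset_index(drop=True)
)
print(results)
```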
- Performance comparison across models
- Hyperparameter tuning using **GridSearchCV & RandomizedSearchCV**

---
## Results & Insights
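The README names GridSearchCV and RandomizedSearchCV but not the tuned model or the search space. As a hedged sketch, assuming a random forest and a small hypothetical parameter grid on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split

# Synthetic stand-in data (assumption), not the obesity dataset.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=6, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Exhaustive search: every combination in the grid is cross-validated.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [25, 50], "max_depth": [3, None]},
    scoring="accuracy",
    cv=3,
)
grid.fit(X_train, y_train)

# Randomized search: only n_iter sampled combinations are cross-validated.
rand = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={
        "n_estimators": [25, 50, 100],
        "max_depth": [3, 5, None],
        "min_samples_leaf": [1, 2, 4],
    },
    n_iter=5,
    scoring="accuracy",
    cv=3,
    random_state=0,
)
rand.fit(X_train, y_train)

print("grid:", grid.best_params_, round(grid.best_score_, 3))
print("random:", rand.best_params_, round(rand.score(X_test, y_test), 3))
```

Grid search is practical when the grid is small; randomized search trades completeness for a fixed budget of `n_iter` fits per fold, which helps when the search space grows.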
- **Best Performing Model:** Identified based on accuracy & ROC-AUC
- **Feature Importance:** Key factors influencing obesity prediction
- **Impact of Lifestyle Factors:** Strong correlation with obesity levels

---
## Repository Structure
```
Obesity_Level_Classification
├── data                        # Dataset & Preprocessed Files
├── notebooks                   # Jupyter Notebooks for EDA & Modeling
├── models                      # Trained Machine Learning Models
├── obesity_classification.py   # Main Code Implementation
└── README.md                   # Project Documentation
```

---
## Kaggle Notebook & LinkedIn Profile
**Kaggle Notebook:** [Check it out here](https://www.kaggle.com/code/arifmia/predicting-obesity-levels-using-machine-learning)

**LinkedIn Profile:** [Connect with me](https://www.linkedin.com/in/arif-miah-8751bb217)

---
## Future Improvements
- **Deep Learning Approach:** Experimenting with Neural Networks
- **More Features:** Incorporating dietary habits, sleep patterns, and medical history
- **Deployment:** Creating a web-based prediction tool using Flask or Streamlit
### **If you found this helpful, don't forget to star the repository!**