https://github.com/yahiazakaria445/ensemble-learning-voting-classifier
Ensemble Learning Using KNN, Naive Bayes, Decision Tree on Biomechanical Data
- Host: GitHub
- URL: https://github.com/yahiazakaria445/ensemble-learning-voting-classifier
- Owner: yahiazakaria445
- Created: 2025-04-08T15:13:16.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-04-08T15:50:08.000Z (10 months ago)
- Last Synced: 2025-06-03T14:43:27.800Z (8 months ago)
- Topics: matplotlib, numpy, pandas, scikit-learn, seaborn
- Language: Jupyter Notebook
- Homepage:
- Size: 1.2 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Ensemble Learning: Voting Classifier on Biomechanical Data
This project implements an **Ensemble Learning** model that uses a **Voting Classifier** to predict the class of orthopedic patients from their biomechanical features. The ensemble combines:
- **K-Nearest Neighbors (KNN)**
- **Gaussian Naive Bayes**
- **Bernoulli Naive Bayes**
- **Decision Tree**
---
## Dataset Information
- **Source**: [Biomechanical Features of Orthopedic Patients - Kaggle](https://www.kaggle.com/datasets/uciml/biomechanical-features-of-orthopedic-patients/data)
- Each sample in the dataset represents a patient with six biomechanical attributes:
  - Pelvic incidence
  - Pelvic tilt
  - Lumbar lordosis angle
  - Sacral slope
  - Pelvic radius
  - Grade of spondylolisthesis
---
## Libraries Used
- `numpy`
- `pandas`
- `matplotlib`
- `seaborn`
- `scikit-learn`
---
## Exploratory Data Analysis (EDA)
- Checked the output classes for imbalance
- Pairplot visualizations of pairwise feature relationships
- Correlation matrix to quantify linear correlation between features
- Box plots for each feature
- Distribution plots to visualize the spread of values in each column
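
The checks above can be sketched with pandas alone. This is a minimal illustration, not the notebook's actual code: the DataFrame holds random placeholder data, and the column names (`pelvic_incidence`, `degree_spondylolisthesis`, `class`, and so on) and class labels are assumptions rather than values read from the repository. The seaborn calls are left as comments so the sketch runs without a display.

```python
import numpy as np
import pandas as pd

# Placeholder data standing in for the Kaggle CSV (assumed column names).
rng = np.random.default_rng(0)
feature_names = [
    "pelvic_incidence", "pelvic_tilt", "lumbar_lordosis_angle",
    "sacral_slope", "pelvic_radius", "degree_spondylolisthesis",
]
df = pd.DataFrame(rng.normal(size=(120, 6)), columns=feature_names)
df["class"] = rng.choice(["Normal", "Abnormal"], size=120, p=[0.3, 0.7])

# Class balance: absolute counts and relative shares per class.
class_counts = df["class"].value_counts()
class_share = df["class"].value_counts(normalize=True)

# Pearson correlation matrix between the six features.
corr = df[feature_names].corr()

# Per-feature summary statistics (the quartiles a box plot would draw).
summary = df[feature_names].describe()

# Plotting calls, commented out so the sketch needs no display:
# import seaborn as sns
# sns.pairplot(df, hue="class")
# sns.heatmap(corr, annot=True)
# sns.boxplot(data=df[feature_names])
# sns.histplot(df["pelvic_incidence"], kde=True)
```

A large gap between the entries of `class_share` is the signal that motivates rebalancing before training.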
---
## Data Preprocessing
- Checked for missing values
- Standardized features using `StandardScaler`
- Balanced the classes using `RandomOverSampler` (from `imbalanced-learn`)
- Split the dataset into training and testing sets using `train_test_split`
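
The steps listed above could look roughly like the sketch below, which follows the list order. It runs on random placeholder data, and the `test_size`, `stratify`, and `random_state` values are assumptions. To keep the example dependent on scikit-learn only, the oversampling step uses `sklearn.utils.resample` as a stand-in for `RandomOverSampler`: it keeps every original row and adds minority rows drawn with replacement until the classes are equal in size.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils import resample

# Placeholder data: 100 patients, 6 features, imbalanced 70/30 labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
y = np.array([0] * 70 + [1] * 30)

# 1. Check for missing values.
has_missing = bool(np.isnan(X).any())

# 2. Standardize each feature to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# 3. Random oversampling (stand-in for RandomOverSampler): duplicate
#    minority rows with replacement until every class matches the majority.
classes, counts = np.unique(y, return_counts=True)
n_max = counts.max()
X_parts, y_parts = [], []
for cls, count in zip(classes, counts):
    X_cls = X_scaled[y == cls]
    if count < n_max:
        extra = resample(X_cls, replace=True,
                         n_samples=n_max - count, random_state=0)
        X_cls = np.vstack([X_cls, extra])
    X_parts.append(X_cls)
    y_parts.append(np.full(n_max, cls))
X_bal = np.vstack(X_parts)
y_bal = np.concatenate(y_parts)

# 4. Split into training and testing sets (80/20 is an assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X_bal, y_bal, test_size=0.2, stratify=y_bal, random_state=0
)
```

One design point worth weighing: scaling and oversampling before the split lets duplicated rows and test-set statistics reach the training set, so a common alternative is to split first and fit the scaler and sampler on the training portion only.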
---
## Classifiers Used in Voting Ensemble
- **K-Nearest Neighbors (KNN)**
- **Gaussian Naive Bayes**
- **Bernoulli Naive Bayes**
- **Decision Tree**
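
As an illustration of how these four models can be combined, here is a minimal sketch using scikit-learn's `VotingClassifier`. The README does not state the voting mode or any hyperparameters, so `voting="hard"`, `n_neighbors=5`, and `random_state=0` are assumptions, and the data is a random placeholder rather than the biomechanical dataset.

```python
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.naive_bayes import BernoulliNB, GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Placeholder data: 200 samples, 6 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + X[:, 5] > 0).astype(int)

# Each base model casts one vote per sample; with hard voting the
# ensemble predicts the majority class label.
voting_clf = VotingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("gnb", GaussianNB()),
        ("bnb", BernoulliNB()),  # binarizes features at 0.0 by default
        ("tree", DecisionTreeClassifier(random_state=0)),
    ],
    voting="hard",  # assumed; "soft" would average predicted probabilities
)
voting_clf.fit(X, y)
predictions = voting_clf.predict(X[:5])
```

With an even number of voters, hard voting can tie; scikit-learn then picks the class that comes first in ascending sorted order, which is one reason to consider `voting="soft"` or an odd number of base models.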
---
## Testing Metrics
- **Accuracy Score**
- **Confusion Matrix**
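
Both metrics are available in `sklearn.metrics`. The small example below uses hand-written labels (not results from this project) to show how the two relate: accuracy is the sum of the confusion matrix diagonal divided by the total number of samples.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Toy labels for illustration only.
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0, 1, 0])

acc = accuracy_score(y_true, y_pred)   # 6 of 8 correct -> 0.75
cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted
# cm is [[3, 1],
#        [1, 3]]

# Accuracy recovered from the confusion matrix.
acc_from_cm = np.trace(cm) / cm.sum()
```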
---
## Results
| Model | Accuracy (Train) | Accuracy (Validation) |
|----------------|------------------|------------------------|
| Decision Tree | 0.9441 | 0.9556 |
| Naive Bayes | 0.9058 | 0.9315 |
| KNN | 0.9137 | 0.9153 |
