https://github.com/kaoutarmi/diabetes-prediction-system
A machine learning approach to detecting whether a patient has diabetes. Data cleaning, visualization, modeling, and cross-validation are applied.
Last synced: 7 months ago
- Host: GitHub
- URL: https://github.com/kaoutarmi/diabetes-prediction-system
- Owner: kaoutarmi
- Created: 2024-11-20T11:51:41.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-02-26T13:05:09.000Z (8 months ago)
- Last Synced: 2025-02-26T14:22:33.188Z (8 months ago)
- Topics: accuracy, detection, diabetes, diabetes-detection, diabetes-prediction, health-data, knn-classification, logistic-regression, machine-learning, machine-learning-algorithms, naive-bayes-classifier, random-forest-classifier, support-vector-machines
- Language: Jupyter Notebook
- Homepage:
- Size: 3.7 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Diabetes Prediction Using Machine Learning
Welcome to the **Diabetes Prediction** project! The goal of this project is to predict whether a person has diabetes using various **machine learning algorithms**. We focus on applying data cleaning, visualization, and modeling techniques to build accurate prediction models.
## Objective
- **Clean the data** to ensure high-quality inputs.
- **Visualize the data** to better understand patterns and correlations.
- **Train machine learning models** to predict diabetes.
- **Evaluate model performance** using several evaluation metrics.

## Techniques Used
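As an illustration of the data cleaning technique named in this section, here is a minimal sketch of removing missing values and handling outliers with `pandas`. The toy data, the column names (`Glucose`, `BMI`, `Outcome`), the helper `clean`, and the choice of clipping to the 1.5 × IQR fences are all assumptions made for the example; the notebook may use a different dataset and a different outlier strategy.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column names and values are illustrative only,
# not taken from the project's dataset.
df = pd.DataFrame({
    "Glucose": [148.0, 85.0, np.nan, 89.0, 137.0, 900.0],
    "BMI": [33.6, 26.6, 23.3, 28.1, np.nan, 31.0],
    "Outcome": [1, 0, 1, 0, 1, 0],
})


def clean(frame: pd.DataFrame, feature_cols: list[str]) -> pd.DataFrame:
    """Drop rows with missing values, then clip each feature to its 1.5*IQR fences."""
    out = frame.dropna().copy()
    for col in feature_cols:
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        # Values beyond the fences are pulled back to the fence, not dropped.
        out[col] = out[col].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)
    return out


cleaned = clean(df, ["Glucose", "BMI"])
print(cleaned)
```

Clipping keeps every remaining row, which can matter for small medical datasets; dropping outlier rows instead is an equally reasonable choice.
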
1. **Data Cleaning**: Removing missing values, handling outliers, and preparing the data for modeling.
2. **Data Visualization**: Analyzing and visualizing the data to understand patterns and trends.
3. **Machine Learning Modeling**: Training multiple machine learning models to predict diabetes.

## Algorithms Used
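The algorithms in this section all have counterparts in `scikit-learn`. The sketch below shows one way they could be instantiated and fitted side by side. It uses synthetic data from `make_classification` as a stand-in for the diabetes dataset, and the estimator choices (for example `GaussianNB` for Naive Bayes, default hyperparameters, feature scaling for the distance- and margin-based models) are assumptions, not the notebook's confirmed configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary-classification data standing in for the real dataset (assumption).
X, y = make_classification(n_samples=500, n_features=8, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale features for the models that are sensitive to feature magnitude.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)  # mean accuracy on held-out data
    print(f"{name}: {test_accuracy[name]:.3f}")
```

Keeping the estimators in a dictionary makes it easy to loop over them with the same training and scoring code.
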
1. **Logistic Regression**
2. **Support Vector Machine (SVM)**
3. **K-Nearest Neighbors (KNN)**
4. **Random Forest Classifier**
5. **Naive Bayes**
6. **Gradient Boosting**

## Model Evaluation Methods Used
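The four evaluation methods in this section can be computed with `scikit-learn` roughly as follows. This is a sketch under stated assumptions: the data is synthetic (`make_classification`), a single logistic regression model stands in for whichever classifier is being evaluated, and 5-fold cross-validation is an arbitrary choice.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data (assumption); replace with the cleaned diabetes features.
X, y = make_classification(n_samples=500, n_features=8, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)               # hard class labels
y_score = model.predict_proba(X_test)[:, 1]  # probability of the positive class

acc = accuracy_score(y_test, y_pred)
auc = roc_auc_score(y_test, y_score)         # needs scores, not hard labels
cm = confusion_matrix(y_test, y_pred)        # rows: true class, columns: predicted class
tn, fp, fn, tp = cm.ravel()

# 5-fold cross-validated accuracy on the full dataset.
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5, scoring="accuracy")

print(f"accuracy={acc:.3f}  roc_auc={auc:.3f}")
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"cv accuracy: mean={cv_scores.mean():.3f} std={cv_scores.std():.3f}")
```

Note that ROC AUC is computed from predicted probabilities, while accuracy and the confusion matrix use the thresholded class labels.
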
1. **Accuracy Score**: Measures the fraction of predictions the model gets right.
2. **ROC Curve and AUC**: Shows the trade-off between true positive rate and false positive rate across classification thresholds, summarized by the area under the curve.
3. **Cross-Validation**: Repeatedly splits the data into training and validation subsets to estimate how well the model performs on unseen data.
4. **Confusion Matrix**: Provides a breakdown of predictions into true positives, true negatives, false positives, and false negatives.

## Dependencies
To run this project, you will need the following libraries:
- `pandas`
- `numpy`
- `matplotlib`
- `seaborn`
- `scikit-learn`

You can install the dependencies by running:
```bash
pip install pandas numpy matplotlib seaborn scikit-learn