https://github.com/kaoutarmi/diabetes-prediction-system

A machine learning approach to detect whether a patient has diabetes. Data cleaning, visualization, modeling, and cross-validation are applied.

accuracy detection diabetes diabetes-detection diabetes-prediction health-data knn-classification logistic-regression machine-learning machine-learning-algorithms naive-bayes-classifier random-forest-classifier support-vector-machines

# 🩺 Diabetes Prediction Using Machine Learning

Welcome to the **Diabetes Prediction** project! 🎉 The goal of this project is to predict whether a person has diabetes using various **machine learning algorithms**. We focus on applying data cleaning, visualization, and modeling techniques to build accurate prediction models.

## 🎯 Objective

- 🧹 **Clean the data** to ensure high-quality inputs.
- 📊 **Visualize the data** to better understand patterns and correlations.
- 🤖 **Train machine learning models** to predict diabetes.
- 💡 **Evaluate model performance** using several evaluation metrics.

## ๐Ÿ› ๏ธ Techniques Used

1. **Data Cleaning** ๐Ÿงผ: Removing missing values, handling outliers, and preparing the data for modeling.
2. **Data Visualization** ๐Ÿ“Š: Analyzing and visualizing the data to understand patterns and trends.
3. **Machine Learning Modeling** ๐Ÿค–: Training multiple machine learning models to predict diabetes.
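
The data cleaning step could look roughly like the sketch below. It is only an illustration: the toy `DataFrame`, the column names (`glucose`, `bmi`, `outcome`), and the helper `clean` are assumptions, not the project's actual schema or code. Clipping to the 1.5 × IQR fences is one common way to handle outliers, and the project may use a different rule.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are assumptions, not the project's schema.
df = pd.DataFrame({
    "glucose": [148.0, 85.0, np.nan, 89.0, 137.0, 900.0],
    "bmi": [33.6, 26.6, 23.3, np.nan, 43.1, 25.6],
    "outcome": [1, 0, 1, 0, 1, 0],
})


def clean(frame, feature_cols):
    """Drop rows with missing features, then clip outliers to the IQR fences."""
    # Remove rows where any of the listed feature columns is missing.
    out = frame.dropna(subset=feature_cols).copy()
    for col in feature_cols:
        q1 = out[col].quantile(0.25)
        q3 = out[col].quantile(0.75)
        iqr = q3 - q1
        # Values outside [q1 - 1.5*IQR, q3 + 1.5*IQR] are pulled back to the fence.
        out[col] = out[col].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)
    return out


cleaned = clean(df, ["glucose", "bmi"])
print(cleaned)
```

Dropping rows is the simplest reading of "removing missing values"; on a small dataset, imputing the column median instead would keep more rows.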

## 🧠 Algorithms Used

1. **Logistic Regression** 🧑‍💼
2. **Support Vector Machine (SVM)** 🔲
3. **K-Nearest Neighbors (KNN)** 🔍
4. **Random Forest Classifier** 🌳
5. **Naive Bayes** 🧑‍🔬
6. **Gradient Boosting** 🔥
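
As a rough sketch, the six algorithms above map onto scikit-learn estimators as shown below. The data is a synthetic stand-in generated with `make_classification`, not the project's dataset. The hyperparameters, the use of `GaussianNB` as the Naive Bayes variant, and the scaling pipelines for the distance- and margin-based models are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the diabetes data: 8 numeric features, binary target.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale features for the models that are sensitive to feature magnitude.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),  # assumed variant; the README does not say which
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model and record its accuracy on the held-out test split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: {scores[name]:.3f}")
```

Keeping the models in a dictionary makes it easy to run the same evaluation loop over all of them and compare results side by side.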

## 📈 Model Evaluation Methods Used

1. **Accuracy Score** ✅: Measures the fraction of predictions the model gets right.
2. **ROC Curve and AUC** 📉: Evaluates the trade-off between true positive rate and false positive rate across classification thresholds, summarized by the area under the curve.
3. **Cross-Validation** 🔄: Repeatedly splitting the data into training and validation folds to estimate how well the model performs on unseen data.
4. **Confusion Matrix** 📊: Provides a breakdown of predictions into true positives, true negatives, false positives, and false negatives.
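
The four evaluation methods above can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, with logistic regression chosen arbitrarily as the example model. The 80/20 split and the 5-fold setting are assumptions, not values taken from the project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the diabetes data.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)               # hard class labels
y_prob = model.predict_proba(X_test)[:, 1]   # probability of the positive class

# 1. Accuracy: fraction of correct predictions on the test split.
accuracy = accuracy_score(y_test, y_pred)

# 2. ROC AUC: computed from predicted probabilities, not hard labels.
auc = roc_auc_score(y_test, y_prob)

# 3. Cross-validation: 5-fold accuracy of a fresh model on the full data.
cv_scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y, cv=5, scoring="accuracy"
)

# 4. Confusion matrix: for binary labels, ravel() yields tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

print(f"accuracy={accuracy:.3f} auc={auc:.3f} cv_mean={cv_scores.mean():.3f}")
print(f"tn={tn} fp={fp} fn={fn} tp={tp}")
```

For a medical screening task, the false negative count from the confusion matrix is often worth inspecting alongside accuracy, since a missed diagnosis and a false alarm rarely carry the same cost.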

## 📦 Dependencies

To run this project, you will need the following libraries:

- `pandas` 📑
- `numpy` 🔢
- `matplotlib` 📊
- `seaborn` 🎨
- `scikit-learn` 🧑‍💻

You can install the dependencies by running:

```bash
pip install pandas numpy matplotlib seaborn scikit-learn