{"id":28923941,"url":"https://github.com/krish57-bit/diabetes-prediction-","last_synced_at":"2026-05-07T17:33:56.739Z","repository":{"id":299640409,"uuid":"1003686530","full_name":"krish57-bit/Diabetes-Prediction-","owner":"krish57-bit","description":"A comprehensive machine learning pipeline to predict the onset of diabetes using the PIMA Indian Diabetes dataset. This includes data cleaning, visualization, outlier detection, standardization, SMOTE-based imbalance handling, and multiple classification algorithms (Logistic Regression, Naive Bayes, and KNN).","archived":false,"fork":false,"pushed_at":"2025-06-17T14:24:22.000Z","size":417,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-17T15:28:36.567Z","etag":null,"topics":["classification","data-science","diabetes","healthcare","jupyter-notebook","machine-learning","python","scikit-learn","smote"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/krish57-bit.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-17T14:11:42.000Z","updated_at":"2025-06-17T14:24:25.000Z","dependencies_parsed_at":"2025-06-17T15:39:35.978Z","dependency_job_id":null,"html_url":"https://github.com/krish57-bit/Diabetes-Prediction-","commit_stats":null,"previous_names":["krish57-bit/diabetes-prediction-"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/krish57-bit/Diabetes-Prediction-","repository_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/krish57-bit%2FDiabetes-Prediction-","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/krish57-bit%2FDiabetes-Prediction-/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/krish57-bit%2FDiabetes-Prediction-/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/krish57-bit%2FDiabetes-Prediction-/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/krish57-bit","download_url":"https://codeload.github.com/krish57-bit/Diabetes-Prediction-/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/krish57-bit%2FDiabetes-Prediction-/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265710779,"owners_count":23815406,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","data-science","diabetes","healthcare","jupyter-notebook","machine-learning","python","scikit-learn","smote"],"created_at":"2025-06-22T10:02:22.230Z","updated_at":"2026-05-07T17:33:56.708Z","avatar_url":"https://github.com/krish57-bit.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🧠 Diabetes Prediction using Machine Learning\n\nThis project demonstrates a full machine learning workflow to predict diabetes using the **PIMA Indian Diabetes Dataset**. 
It includes detailed data preprocessing, visualization, outlier handling, and classification using various ML models.\n\n---\n\n## 📁 Dataset\n\n- **Source**: [PIMA Indian Diabetes Dataset](https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database)\n- **Size**: 768 samples, 8 features + 1 binary target (`Outcome`)\n- **Target**: `Outcome` (1 = Diabetic, 0 = Non-Diabetic)\n\n---\n\n## ⚙️ Workflow Overview\n\n### 1. 📥 Data Loading \u0026 Exploration\n- Checked for missing/zero values\n- Descriptive statistics using `.describe()`\n- Correlation heatmap (`sns.heatmap`)\n\n### 2. 🛠️ Data Imputation\n- Replaced 0s in critical columns (Insulin, BMI, etc.) with the median or mean, chosen according to each column's distribution\n\n### 3. 📦 Outlier Detection\n- Used the IQR method\n- Boxplot visualization for identifying outliers\n\n### 4. 🧼 Feature Scaling\n- Applied `StandardScaler` to standardize all features (zero mean, unit variance)\n\n### 5. 🧪 Train-Test Split\n- Split the data into 67% training and 33% testing with `train_test_split()`\n\n### 6. ⚖️ Imbalanced Data Handling\n- Applied **SMOTE (Synthetic Minority Oversampling Technique)** to balance the target classes\n\n### 7. 🔍 Model Training\n#### ✅ Logistic Regression\n#### ✅ Gaussian Naive Bayes\n#### ✅ K-Nearest Neighbors (KNN)\n\n### 8. 
📈 Model Evaluation\n- Accuracy, Confusion Matrix, and Classification Report (Precision, Recall, F1-score)\n\n---\n\n## 📊 Visualizations\n\n- ✅ Correlation Matrix Heatmap\n- ✅ Feature Distribution using `sns.distplot`\n- ✅ Boxplots before and after standardization\n\nAll plots are saved under `images/`.\n\n---\n\n## 🧠 Models\n\n| Model                | Evaluation Metric     | Balanced with SMOTE |\n|---------------------|------------------------|----------------------|\n| Logistic Regression | Accuracy + Recall      | ✅                   |\n| Gaussian NB         | Confusion Matrix + F1  | ✅                   |\n| KNN Classifier      | Accuracy + Classification Report | ✅         |\n\n---\n\n## 🧾 Requirements\n\nInstall with:\n\n```bash\npip install -r requirements.txt\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkrish57-bit%2Fdiabetes-prediction-","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkrish57-bit%2Fdiabetes-prediction-","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkrish57-bit%2Fdiabetes-prediction-/lists"}