https://github.com/workwithchaimaa/codealpha_diseaseprediction

Complete ML pipeline for binary classification to predict heart disease. Includes data preprocessing, model comparison (Logistic Regression, Random Forest), hyperparameter tuning, and feature importance analysis.

classification heart-disease machine-learning python random-forest scikit-learn


Host: GitHub
URL: https://github.com/workwithchaimaa/codealpha_diseaseprediction
Owner: WorkWithChaimaa
Created: 2025-10-05T18:14:33.000Z (9 months ago)
Default Branch: master
Last Pushed: 2025-10-05T18:41:21.000Z (9 months ago)
Last Synced: 2025-10-05T20:39:56.822Z (9 months ago)
Topics: classification, heart-disease, machine-learning, python, random-forest, scikit-learn
Language: Python
Homepage:
Size: 70.3 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


# Heart Disease Prediction: Machine Learning Project

## Project Overview
This project applies machine learning classification algorithms to predict the presence of heart disease using the UCI Heart Disease Dataset. The pipeline covers data preprocessing, a comparison of Logistic Regression and Random Forest models, hyperparameter tuning, and performance analysis, including feature importance.
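As a rough illustration of the stages listed above, the following minimal sketch chains preprocessing, model comparison, and hyperparameter tuning with scikit-learn. It is not taken from `Disease_Prediction.py`: the data is synthetic, and the column names, target rule, cross-validation settings, and parameter grid are all assumptions chosen only to keep the example self-contained.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (NOT the project's dataset); column names are assumptions.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "oldpeak": rng.uniform(0.0, 4.0, n),
    "thalch": rng.integers(90, 200, n),
    "thal": rng.choice(["normal", "fixed defect", "reversable defect"], n),
})
# Hypothetical binary target tied to the features so the models have some signal.
y = ((df["oldpeak"] > 2.0) | (df["thal"] == "reversable defect")).astype(int)

# Preprocessing: scale numeric columns, one-hot encode the categorical one.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["oldpeak", "thalch"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["thal"]),
])

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, stratify=y, random_state=0
)

# Model comparison: mean cross-validated accuracy for each candidate.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}
cv_scores = {}
for name, model in models.items():
    pipe = Pipeline([("prep", preprocess), ("clf", model)])
    cv_scores[name] = cross_val_score(pipe, X_train, y_train, cv=3).mean()

# Hyperparameter tuning: small, illustrative grid over the Random Forest.
rf_pipe = Pipeline([
    ("prep", preprocess),
    ("clf", RandomForestClassifier(random_state=0)),
])
param_grid = {"clf__n_estimators": [25, 50], "clf__max_depth": [3, None]}
search = GridSearchCV(rf_pipe, param_grid, cv=3)
search.fit(X_train, y_train)
test_accuracy = search.score(X_test, y_test)

print(cv_scores)
print(search.best_params_, test_accuracy)
```

Wrapping the preprocessing and the classifier in a single `Pipeline` means the scaler and encoder are refit inside each cross-validation fold, which avoids leaking information from the validation split into training.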

## Key Outcomes

---

## Top Predictive Features (Feature Importance)

The feature importance analysis ranked the following as the most influential factors for heart disease prediction:
1. **Thal\_Reversable Defect:** The most important factor. A reversible defect on the thallium stress test indicates insufficient blood flow during stress.
2. **Oldpeak:** ST depression induced by exercise relative to rest.
3. **Thal\_Normal:** A normal thallium stress test result, indicating the absence of a defect.
4. **Thalch:** Maximum heart rate achieved.
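Names such as `Thal_Reversable Defect` and `Thal_Normal` suggest that the categorical `thal` column was one-hot encoded before training, so each category receives its own importance score. The sketch below shows one way such a ranking could be produced with a Random Forest's `feature_importances_` attribute. The data is synthetic and the column names and target rule are assumptions, so the printed ranking says nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (NOT the project's dataset); names are assumptions.
rng = np.random.default_rng(42)
n = 300
raw = pd.DataFrame({
    "oldpeak": rng.uniform(0.0, 4.0, n),
    "thalch": rng.integers(90, 200, n),
    "thal": rng.choice(["normal", "fixed defect", "reversable defect"], n),
})
# Hypothetical target, defined only so the forest has something to learn.
target = ((raw["thal"] == "reversable defect") | (raw["oldpeak"] > 3.0)).astype(int)

# One-hot encoding turns `thal` into thal_normal, thal_fixed defect, and so on.
X = pd.get_dummies(raw, columns=["thal"], dtype=float)

forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X, target)

# Impurity-based importances, one per encoded column, sorted high to low.
importances = pd.Series(
    forest.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)
```

Impurity-based importances are normalized to sum to 1 and can favor high-cardinality or continuous features, so `sklearn.inspection.permutation_importance` is a common cross-check.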

## Repository Contents
- **`Disease_Prediction.py`**: The complete Python script containing the ML pipeline.
- **`heart_disease_data.csv`**: The dataset used for training and testing.
- **`plots/`**: Contains the generated visualizations.
  - `feature_importance.png`
  - `tuned_random_forest_confusion_matrix.png`
