https://github.com/ornella-gigante/cardiovascular-disease-prediction-project
This ML project predicts cardiovascular disease from clinical data (blood pressure, cholesterol, heart rate). It implements Decision Tree (71.15% accuracy) and Gaussian Naive Bayes (70.49% accuracy) models on 13 medical features.
- Host: GitHub
- URL: https://github.com/ornella-gigante/cardiovascular-disease-prediction-project
- Owner: Ornella-Gigante
- License: gpl-3.0
- Created: 2024-02-28T18:08:05.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-02-07T09:34:07.000Z (10 months ago)
- Last Synced: 2025-06-28T03:33:22.178Z (5 months ago)
- Topics: binary, classification, desiciontree, gaussian-processes, machine-learning, prediction-model
- Language: Jupyter Notebook
- Homepage:
- Size: 54.7 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Here's a detailed README for this cardiovascular disease prediction project, incorporating insights from the Jupyter notebook analysis:
---
# 🩺 Cardiovascular Disease Prediction Project
*By Ornella Sofía Gigante*
## Project Overview
This project aims to predict cardiovascular diseases using patient clinical data. Two machine learning models (Decision Tree and Gaussian Naive Bayes) were implemented and compared for binary classification (disease present/absent).
---
## Dataset
**Key Features Analyzed**:
- Age
- Gender (1 = male)
- Chest pain type (0-3 scale)
- Resting blood pressure (mmHg)
- Serum cholesterol (mg/dl)
- Fasting blood sugar
- Resting electrocardiogram results
- Maximum heart rate
- Exercise-induced angina
- ST depression (oldpeak)
- Slope of peak exercise ST segment
- Number of major vessels
**Target Variable**:
- `target` (1 = disease detected, 0 = no disease)
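A minimal sketch of how the twelve listed features and the `target` column might be separated before modelling. The short column names are an assumption: they follow the naming commonly used for UCI-style heart-disease data, and only `oldpeak` and `target` are actually named in this README. The helper `split_features_target` and the two example rows are hypothetical and exist only to show the expected shape.

```python
import pandas as pd

# Assumed column names (hypothetical, UCI-style); only `oldpeak` and
# `target` are named in the README itself.
FEATURE_COLUMNS = [
    "age", "sex", "cp", "trestbps", "chol", "fbs",
    "restecg", "thalach", "exang", "oldpeak", "slope", "ca",
]


def split_features_target(df: pd.DataFrame, target_column: str = "target"):
    """Return (X, y): the listed feature columns and the binary target."""
    missing = [c for c in FEATURE_COLUMNS + [target_column] if c not in df.columns]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    X = df[FEATURE_COLUMNS]
    y = df[target_column].astype(int)  # 1 = disease detected, 0 = no disease
    return X, y


# Two made-up rows, purely illustrative (not taken from the project's data).
example = pd.DataFrame(
    [
        [63, 1, 3, 145, 233, 1, 0, 150, 0, 2.3, 0, 0, 1],
        [57, 0, 0, 120, 354, 0, 1, 163, 1, 0.6, 2, 0, 0],
    ],
    columns=FEATURE_COLUMNS + ["target"],
)
X, y = split_features_target(example)
print(X.shape, list(y))
```

Failing early on missing columns keeps a renamed or truncated CSV from silently producing a model trained on the wrong inputs.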
---
## 🤖 Models Used
### 🌳 Decision Tree Classifier
- Achieved **71.15% accuracy**
- Advantages: Simple interpretation, handles non-linear relationships
### 🧪 Gaussian Naive Bayes
- Achieved **70.49% accuracy**
- Advantages: Fast computation, works well with small datasets
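The two classifiers above could be trained and compared along these lines. This is a sketch under stated assumptions, not the notebook's actual code: it uses synthetic data from `make_classification` as a stand-in for the patient records, and the 80/20 split, the `random_state` values, and default hyperparameters are all illustrative choices. The printed accuracies will therefore not match the figures reported in this README.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the clinical data (assumption: 12 numeric features).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6, random_state=0
)

# Hold out 20% of the rows for evaluation, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Gaussian Naive Bayes": GaussianNB(),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {accuracies[name]:.2%}")
```

Evaluating both models on the same held-out split keeps the comparison like for like.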
---
## Key Results
| Metric | Decision Tree | Naive Bayes |
|-----------------|---------------|-------------|
| **Accuracy** | 71.15% | 70.49% |
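For reference, accuracy in the table is the fraction of test samples whose predicted label matches the true label. The sketch below computes it by hand; the function name `accuracy` and the ten example labels are made up for illustration and are not results from this project.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels: 7 of the 10 predictions agree with the truth.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]
print(f"{accuracy(y_true, y_pred):.2%}")
```

Accuracy alone does not distinguish missed disease cases from false alarms, so a confusion matrix or recall is often reported alongside it in diagnostic settings.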
Both models reach roughly 70-71% accuracy, less than one percentage point apart, which leaves room for improvement through:
- Feature engineering
- Hyperparameter tuning
- Larger/more balanced datasets
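As one way to approach the hyperparameter tuning item above, a grid search over a few decision-tree settings might look like the following. It is a hedged sketch: the data is synthetic, and the parameter grid (`max_depth`, `min_samples_leaf`) and the 5-fold setting are assumptions chosen for illustration, not values taken from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the clinical data (assumption).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6, random_state=0
)

# Illustrative grid: shallower trees and larger leaves reduce overfitting.
param_grid = {
    "max_depth": [2, 3, 5, None],
    "min_samples_leaf": [1, 5, 10],
}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,                # 5-fold cross-validation for each combination
    scoring="accuracy",
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print(f"best cross-validated accuracy: {search.best_score_:.2%}")
```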
---
## 🛠️ How to Run
1. **Dependencies**:
```text
pandas==1.5.3
numpy==1.24.3
scikit-learn==1.2.2
jupyter==1.0.0
```
2. **Execution Steps**:
```bash
# Clone the repository
git clone https://github.com/ornella-gigante/cardiovascular-disease-prediction-project.git
cd cardiovascular-disease-prediction-project
# Launch the Jupyter notebook
jupyter notebook Ornella_Gigante_lab4.ipynb
```
---
## Future Improvements
- Experiment with deeper tree structures and pruning
- Test ensemble methods (Random Forest, XGBoost)
- Incorporate additional medical biomarkers
- Implement cross-validation for robust evaluation
- Address potential class imbalance
*Developed with Python 3.9 in a JupyterLab environment.*
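Several of the improvements listed in this section (pruning, an ensemble, cross-validation, a class-balance check) can be tried together in a few lines. The sketch below is illustrative only: it runs on synthetic data with an assumed 60/40 class split, and the `ccp_alpha` value, the number of trees, and the 5-fold setup are arbitrary choices rather than tuned or project-derived settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with a mild class imbalance (assumption).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6,
    weights=[0.6, 0.4], random_state=0,
)

# Class-balance check: count samples per label before trusting accuracy.
class_counts = np.bincount(y)
print("samples per class:", dict(enumerate(class_counts.tolist())))

candidates = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    # Cost-complexity pruning; 0.01 is an arbitrary illustrative value.
    "Pruned Decision Tree": DecisionTreeClassifier(ccp_alpha=0.01, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Stratified folds keep the class ratio the same in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

cv_scores = {}
for name, model in candidates.items():
    cv_scores[name] = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: {cv_scores[name].mean():.2%} (std {cv_scores[name].std():.2%})")
```

Reporting the mean and spread over folds gives a steadier estimate than a single train/test split, which matters when the test set is small.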
---
This project demonstrates a foundational ML workflow for healthcare diagnostics, with potential real-world applications in early disease detection.