https://github.com/djdhairya/brain-stroke-prediction
- Host: GitHub
- URL: https://github.com/djdhairya/brain-stroke-prediction
- Owner: djdhairya
- License: mit
- Created: 2024-12-26T05:39:51.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-12-26T05:43:59.000Z (10 months ago)
- Last Synced: 2025-05-18T20:07:45.954Z (5 months ago)
- Topics: encoding, flask, hot-encoding, joblib, machine-learning, matplotlib, onehot-encoding, pandas, pipeline, ploty, roc-auc-curve, sckiit-learn, seaborn, sklearn
- Language: Jupyter Notebook
- Homepage:
- Size: 550 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
# Brain Stroke Prediction
## Introduction
The **Brain Stroke Prediction** application predicts whether a person is at risk of having a stroke based on input parameters such as age, medical history, and lifestyle factors. This project uses machine learning techniques for the prediction.

---
## Project Description
The prediction model uses features such as gender, age, hypertension, and heart disease, along with the other fields listed under Input Fields. It combines data preprocessing, feature engineering, and an ensemble classifier to make its predictions. Users enter their details through a web interface, and the system displays whether a stroke is likely.

---
## Installation
### Prerequisites
- Required Python libraries:
```bash
pip install flask pandas numpy matplotlib seaborn plotly scikit-learn joblib
```

### Steps
1. Clone or download the project repository.
2. Ensure the `model.joblib` file is available at the specified path:
3. Run the Flask application:
```bash
python app.py
```
4. Open the application in your browser at `http://127.0.0.1:5000`.

---
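
Step 2 can be checked before the server starts. The following is a minimal sketch, assuming the model lives at `model/model.joblib` as shown in the File Structure section; the helper name `load_model` is hypothetical and is not taken from `app.py`.

```python
from pathlib import Path

import joblib

# Assumed location, based on the File Structure section of this README.
MODEL_PATH = Path("model") / "model.joblib"


def load_model(path=MODEL_PATH):
    """Load the serialized model, failing early with a clear message."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Expected a trained model at {path.resolve()}")
    return joblib.load(path)
```

Calling `load_model()` once at startup would surface a missing file immediately, rather than on the first prediction request.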
## Usage
1. Open the application in a web browser.
2. Enter the required details in the input form.
3. Submit the form to receive the prediction result.

---
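
The submit-and-predict flow described above could be handled on the server roughly as follows. This is a sketch under assumptions, not the contents of `app.py`: the `/predict` route name, the `create_app` factory, and the `StubModel` class are all hypothetical, and the stub replaces the real model with a toy rule so the example runs without `model.joblib`.

```python
import pandas as pd
from flask import Flask, request


class StubModel:
    """Stand-in for the trained model; flags anyone aged 60+ (toy rule)."""

    def predict(self, X):
        return [int(float(age) >= 60) for age in X["age"]]


def create_app(model):
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])  # hypothetical route name
    def predict():
        # One submitted form becomes a one-row DataFrame for the model.
        row = pd.DataFrame([request.form.to_dict()])
        label = model.predict(row)[0]
        return "Likely" if label == 1 else "Not Likely"

    return app
```

Passing the model into a factory keeps the route testable with Flask's built-in test client, for example `create_app(StubModel()).test_client().post("/predict", data={"age": "72"})`.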
## Input Fields
| **Field** | **Description** | **Example** |
|-------------------------|------------------------------------------------------------|------------------------|
| `gender` | Gender of the individual | `male`, `female` |
| `age` | Age in years | `45` |
| `hypertension` | 1 if the individual has hypertension, 0 otherwise | `1` |
| `heart_disease` | 1 if the individual has heart disease, 0 otherwise | `0` |
| `ever_married` | `yes` if ever married, `no` otherwise | `yes` |
| `work_type` | Type of work: `Private`, `Govt_job`, `children`, etc. | `Private` |
| `Residence_type` | Type of residence: `Urban` or `Rural` | `Urban` |
| `avg_glucose_level` | Average glucose level in the blood (mg/dL) | `105.5` |
| `bmi` | Body Mass Index | `22.4` |
| `smoking_status` | Smoking habit: `smokes`, `never smoked`, or `formerly smoked` | `never smoked` |

---
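
Form values arrive as strings, so they need converting before they reach the model. The sketch below turns the fields from the table above into a typed one-row `DataFrame`. The converter choices (for example `float` for `age`) and the helper name `form_to_frame` are assumptions made for illustration; the column types the trained model actually expects may differ.

```python
import pandas as pd

# Field names follow the Input Fields table; the converters are assumptions.
FIELD_TYPES = {
    "gender": str,
    "age": float,
    "hypertension": int,
    "heart_disease": int,
    "ever_married": str,
    "work_type": str,
    "Residence_type": str,
    "avg_glucose_level": float,
    "bmi": float,
    "smoking_status": str,
}


def form_to_frame(form):
    """Validate a dict of raw form strings and return a one-row DataFrame."""
    missing = [name for name in FIELD_TYPES if name not in form]
    if missing:
        raise ValueError(f"Missing fields: {', '.join(missing)}")
    row = {name: cast(form[name]) for name, cast in FIELD_TYPES.items()}
    for flag in ("hypertension", "heart_disease"):
        if row[flag] not in (0, 1):
            raise ValueError(f"{flag} must be 0 or 1")
    return pd.DataFrame([row], columns=list(FIELD_TYPES))


# Example values taken from the table above.
example = {
    "gender": "male", "age": "45", "hypertension": "1", "heart_disease": "0",
    "ever_married": "yes", "work_type": "Private", "Residence_type": "Urban",
    "avg_glucose_level": "105.5", "bmi": "22.4", "smoking_status": "never smoked",
}
frame = form_to_frame(example)
```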
## Output
The application predicts whether the person is:
- **Likely** to have a stroke.
- **Not Likely** to have a stroke.

---
## Training Details
### Libraries Used
The following libraries were used for data processing, visualization, and model training:
- **Data Handling and Visualization**:
  - `pandas`, `numpy`: Data manipulation and analysis.
  - `matplotlib`, `seaborn`: Static data visualization.
  - `plotly`: Interactive data visualization.
- **Machine Learning**:
  - `sklearn.preprocessing`:
    - `OneHotEncoder`: For encoding categorical features.
    - `OrdinalEncoder`: For ordinal data encoding.
  - `sklearn.pipeline`:
    - `Pipeline`: For chaining preprocessing and model steps.
  - `sklearn.compose`:
    - `ColumnTransformer`: For applying different preprocessing to column subsets.
  - `sklearn.ensemble`:
    - `VotingClassifier`: For combining multiple classifiers for robust predictions.
  - `sklearn.metrics`:
    - `roc_auc_score`, `roc_curve`, `auc`: For performance evaluation.
    - `PredictionErrorDisplay`: For visualizing prediction errors.

### Model Workflow
1. **Data Preprocessing**:
   - Handled missing values.
   - Encoded categorical features using `OneHotEncoder` and `OrdinalEncoder`.
   - Normalized numeric features.
2. **Model Training**:
   - Used an ensemble `VotingClassifier` to combine multiple models.
   - Optimized hyperparameters for best performance.
   - Evaluated the model using metrics such as AUC-ROC.
3. **Final Model**:
   - Saved using `joblib` for deployment.

---
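
The workflow above can be sketched end to end as follows. Several details here are assumptions, since the README does not state them: the base estimators (`LogisticRegression` and `RandomForestClassifier`), median imputation, `StandardScaler` for the normalization step, which columns get which encoder, and the synthetic data, which stands in for the project's dataset. Hyperparameter optimization is not shown.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, StandardScaler

# Synthetic stand-in data; NOT the project's dataset.
rng = np.random.default_rng(0)
n = 400
smoking_levels = ["never smoked", "formerly smoked", "smokes"]
X = pd.DataFrame({
    "age": rng.uniform(20, 90, n),
    "avg_glucose_level": rng.uniform(60, 250, n),
    "bmi": rng.uniform(15, 45, n),
    "work_type": rng.choice(["Private", "Govt_job", "children"], n),
    "smoking_status": rng.choice(smoking_levels, n),
})
X.loc[rng.choice(n, 20, replace=False), "bmi"] = np.nan  # simulate missing values
y = (X["age"] + rng.normal(0, 15, n) > 65).astype(int)   # toy target

# 1. Preprocessing: impute and scale numerics, encode categoricals.
numeric = ["age", "avg_glucose_level", "bmi"]
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("onehot", OneHotEncoder(handle_unknown="ignore"), ["work_type"]),
    ("ordinal", OrdinalEncoder(categories=[smoking_levels]), ["smoking_status"]),
])

# 2. Training: soft-voting ensemble over two assumed base models.
ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(n_estimators=50, random_state=0))],
    voting="soft",
)
model = Pipeline([("preprocess", preprocess), ("classify", ensemble)])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
model.fit(X_train, y_train)
auc_score = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# 3. Final model: persist the whole pipeline so preprocessing ships with it.
model_path = Path(tempfile.mkdtemp()) / "model.joblib"
joblib.dump(model, model_path)
```

Saving the full `Pipeline` rather than only the classifier means the web application can pass raw field values to `predict` without repeating the encoding steps.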
## File Structure
```
Brain Stroke Prediction/
├── app.py
├── model/
│ └── model.joblib
├── templates/
│ ├── index.html
│ └── result.html
├── README.md
├── data/
│ └── datasets.csv
└── brain_stroke_prediction.ipynb
```

---