https://github.com/sahilmate/ebm-breast-cancer-classifier
This repository implements an Explainable Boosting Machine (EBM) model for breast cancer classification using scikit-learn and interpret. The project includes data preprocessing, model training, accuracy evaluation, and feature importance visualization.
https://github.com/sahilmate/ebm-breast-cancer-classifier
breast-cancer-classification data-visualization explainable-boosting-machine feature-importance interpret machine-learning scikit-learn
Last synced: 8 months ago
JSON representation
This repository implements an Explainable Boosting Machine (EBM) model for breast cancer classification using scikit-learn and interpret. The project includes data preprocessing, model training, accuracy evaluation, and feature importance visualization.
- Host: GitHub
- URL: https://github.com/sahilmate/ebm-breast-cancer-classifier
- Owner: sahilmate
- Created: 2024-07-24T19:28:40.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2024-07-24T19:51:36.000Z (about 1 year ago)
- Last Synced: 2024-07-24T22:29:18.217Z (about 1 year ago)
- Topics: breast-cancer-classification, data-visualization, explainable-boosting-machine, feature-importance, interpret, machine-learning, scikit-learn
- Language: Jupyter Notebook
- Homepage:
- Size: 315 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Explainable Boosting Machine (EBM) Breast Cancer Classifier
## Project Overview
This project implements a breast cancer classification model using the Explainable Boosting Machine (EBM) algorithm. EBM is a type of interpretable machine learning model that provides both high accuracy and insights into feature importance. This model is trained on the Breast Cancer dataset, which is used to predict the presence of breast cancer based on various features.
### Key Features:
- **Model Training**: Uses EBM for classification.
- **Model Evaluation**: Evaluates model accuracy on test data.
- **Feature Importance**: Visualizes feature importance to understand key factors influencing predictions.## Dataset
The model uses the Breast Cancer dataset, which is included with the `scikit-learn` library. This dataset consists of measurements of various features related to breast cancer tumors and is used for binary classification.
## Installation
To run this project, you'll need Python and several libraries. It is recommended to use a virtual environment to manage dependencies.
### Setup Instructions
1. **Clone the Repository**
```bash
git clone https://github.com/sahilmate/ebm-breast-cancer-classifier.git
cd ebm-breast-cancer-classifier
```2. **Create and Activate a Virtual Environment**
```bash
python -m venv env
.\env\Scripts\Activate
```3. **Install Required Libraries**
```bash
pip install -r requirements.txt
```If `requirements.txt` is not present, manually install the required libraries:
```bash
pip install pandas scikit-learn interpret matplotlib numpy
```## Usage
1. **Run the Jupyter Notebook**
Open the Jupyter Notebook file included in the repository (`ebm-breast-cancer-classifier.ipynb`) and run the cells to execute the project.
2. **Run the Python Script**
Alternatively, you can run directly from the command line:
```bash
jupyter nbconvert --to notebook --execute ebm-breast-cancer-classifier.ipynb```
This will load the dataset, train the EBM model, make predictions, and display the model's accuracy and feature importance plot.
## Output
The script will produce the following outputs:
- **Accuracy**: The accuracy score of the model on the test dataset was : 0.9736842105263158
- **Feature Importance Plot**: A horizontal bar plot showing the importance of each feature in the model.
## ContributingContributions to this project are welcome. To contribute:
1. **Fork the Repository**
Click the "Fork" button on the top right corner of this repository page.
2. **Create a New Branch**
```bash
git checkout -b your-branch-name
```3. **Make Your Changes**
Edit or add files as necessary.
4. **Commit Your Changes**
```bash
git add .
git commit -m "Your descriptive commit message"
```5. **Push to Your Fork**
```bash
git push origin your-branch-name
```6. **Create a Pull Request**
Go to the original repository and click on "New Pull Request." Select your branch and submit the pull request.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgements
- **EBM**: [InterpretML](https://interpret.ml/) for the Explainable Boosting Machine implementation.
- **Dataset**: Scikit-learn's Breast Cancer dataset.