https://github.com/mohammed-majid/logistic-binary-email-classification
Binary Classification of spam/ham emails
binary-classification logistic-regression scikit-learn
- Host: GitHub
- URL: https://github.com/mohammed-majid/logistic-binary-email-classification
- Owner: Mohammed-Majid
- Created: 2024-05-30T07:59:38.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-05-30T08:12:52.000Z (8 months ago)
- Last Synced: 2024-11-16T12:12:45.173Z (2 months ago)
- Topics: binary-classification, logistic-regression, scikit-learn
- Language: Jupyter Notebook
- Homepage:
- Size: 21.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Email Binary Classification using Logistic Regression
This repository contains code for a binary classification task: predicting whether an email is spam (1) or ham (0) using logistic regression.
## Overview
The main script email_classification.py demonstrates the process of:
- Pre-processing: Loading the dataset, checking for missing values, and splitting it into training and testing sets.
- Feature Extraction: Utilizing TF-IDF Vectorization to convert text data into numerical features.
- Modeling: Training a logistic regression model on the extracted features.
- Evaluation: Assessing the model's performance on both training and testing data, including accuracy, confusion matrix, and classification report.
- Prediction: Accepting user input (email text) and predicting whether it's spam or ham.
## Usage
- Clone the Repository
- Install Dependencies: Make sure you have the necessary dependencies installed. You can install them using pip:
```
pip install numpy pandas scikit-learn seaborn
```
- Run the Script
- Input Email: When prompted, enter an email text to classify it as spam or ham.
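The steps listed under Overview (TF-IDF feature extraction, logistic regression, evaluation, and prediction on new text) can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy `emails` and `labels` lists stand in for the real dataset, and the helper `classify`, the split ratio, and the default vectorizer and model settings are all assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Hypothetical toy corpus standing in for email_classification.csv.
emails = [
    "Win a free prize now, click here to claim your reward",
    "Limited offer: cheap loans approved instantly",
    "Congratulations, you have won a lottery, send your bank details",
    "Earn money fast working from home, no experience needed",
    "Can we move the project meeting to Thursday afternoon?",
    "Please find the quarterly report attached for your review",
    "Are you joining us for lunch tomorrow?",
    "Reminder: your dentist appointment is on Monday at 10am",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = spam, 0 = ham

# Hold out a stratified test set (assumed 75/25 split).
X_train, X_test, y_train, y_test = train_test_split(
    emails, labels, test_size=0.25, stratify=labels, random_state=42
)

# Fit TF-IDF on the training text only, then reuse it for the test text.
vectorizer = TfidfVectorizer()
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

# Train the logistic regression classifier on the TF-IDF features.
model = LogisticRegression()
model.fit(X_train_vec, y_train)

# Evaluate on both training and testing data.
train_pred = model.predict(X_train_vec)
test_pred = model.predict(X_test_vec)
print("Train accuracy:", accuracy_score(y_train, train_pred))
print("Test accuracy:", accuracy_score(y_test, test_pred))
print(confusion_matrix(y_test, test_pred, labels=[0, 1]))
print(classification_report(
    y_test, test_pred, labels=[0, 1],
    target_names=["ham", "spam"], zero_division=0,
))


def classify(text):
    """Return 'spam' or 'ham' for a single email text (hypothetical helper)."""
    features = vectorizer.transform([text])
    return "spam" if model.predict(features)[0] == 1 else "ham"


print(classify("Claim your free prize now"))
```

With only eight toy emails the reported scores carry no meaning; the point is the order of operations, in particular that the vectorizer is fitted on the training split only and then reused for test and user-supplied text.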
## Dataset
The dataset email_classification.csv contains email texts labeled as spam or ham.
## Requirements
- Python 3.x
- numpy
- pandas
- scikit-learn
- seaborn
## Project Structure
- Model.ipynb: The Jupyter notebook that contains the code.
- email_classification.csv: The dataset used to train and test the model.
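As a hedged illustration of the pre-processing step applied to email_classification.csv, the sketch below loads the file with pandas, reports missing values, and maps the labels to the 1/0 encoding described above. The column names `email` and `label`, the string label values `spam`/`ham`, and the helper name `load_emails` are assumptions; check `df.columns` against the actual file before relying on them.

```python
import pandas as pd


def load_emails(path, text_col="email", label_col="label"):
    """Load the CSV, report missing values, and encode spam/ham as 1/0.

    The column names are assumed, not taken from the repository.
    """
    df = pd.read_csv(path)

    # Count missing values per column before dropping incomplete rows.
    print(df.isnull().sum())
    df = df.dropna(subset=[text_col, label_col]).copy()

    # Encode labels as in the README: spam -> 1, ham -> 0.
    df[label_col] = df[label_col].str.lower().map({"spam": 1, "ham": 0})
    return df
```

A call such as `load_emails("email_classification.csv")` would then return a DataFrame whose text and label columns can be passed to `train_test_split`.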