https://github.com/preethi2805/spam-mail-classification
- Host: GitHub
- URL: https://github.com/preethi2805/spam-mail-classification
- Owner: Preethi2805
- Created: 2025-01-28T04:18:00.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-20T05:22:12.000Z (over 1 year ago)
- Last Synced: 2025-09-08T07:41:50.845Z (9 months ago)
- Language: Jupyter Notebook
- Size: 515 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
# 📧 Spam Mail Detection
Classify emails as spam or ham using **Logistic Regression**. The project cleans the raw text, extracts TF-IDF features, and trains a classifier that reaches about 96% accuracy on both the training and test sets.
---
## 🛠️ Features
- **Dataset**:
  - **5,572 email messages** labeled as `spam` or `ham`.
  - Preprocessing drops irrelevant columns and fills missing values with empty strings.
  - Labels are encoded as `spam = 1`, `ham = 0`.
- **Feature Extraction**: uses a **TF-IDF vectorizer** to transform text into numerical features.
  - Removes stop words.
  - Converts all text to lowercase for uniformity.
- **Model Training**: uses **Logistic Regression** to classify emails.
  - **Training Accuracy**: `96.6%`
  - **Testing Accuracy**: `96.2%`
- **Predictive System**: takes an email message as input and classifies it as `spam` or `ham`.
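
The dataset preparation and feature extraction described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code: the column names `Category` and `Message` and the four inline rows are assumptions standing in for `spam.csv`.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Tiny stand-in for spam.csv. The column names "Category" and "Message"
# are assumptions for illustration, not taken from the real dataset.
raw = pd.DataFrame({
    "Category": ["ham", "spam", "ham", "spam"],
    "Message": [
        "Are we still meeting for lunch today?",
        "WINNER! Claim your free prize now",
        None,  # a missing message, to show the fill step
        "Free entry: text WIN to claim cash",
    ],
})

# Fill missing values with empty strings.
data = raw.fillna("")

# Encode labels: spam = 1, ham = 0.
data["label"] = data["Category"].map({"spam": 1, "ham": 0})

# TF-IDF with English stop words removed and all text lowercased.
vectorizer = TfidfVectorizer(stop_words="english", lowercase=True)
features = vectorizer.fit_transform(data["Message"])

print(features.shape)          # (number of messages, vocabulary size)
print(data["label"].tolist())  # encoded labels
```

The result is a sparse matrix with one row per message; the empty message simply becomes an all-zero row.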
---
## 📊 Workflow
1. **Data Cleaning**:
   - Removed unnecessary columns.
   - Replaced missing values.
2. **Feature Engineering**:
   - Applied TF-IDF vectorization to convert text into numerical data.
3. **Model Training**:
   - Trained Logistic Regression on preprocessed data.
4. **Prediction System**:
   - Deployed a system to classify emails based on their text.
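
As a rough end-to-end sketch of the training and prediction steps, the snippet below fits a TF-IDF vectorizer and a Logistic Regression model on a small inline sample and wraps prediction in a helper. The sample messages, the `test_size` and `random_state` values, and the helper name `classify_message` are all assumptions for illustration; the accuracies it prints will not match the figures reported for the full dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical sample data standing in for the cleaned dataset (1 = spam, 0 = ham).
messages = [
    "Claim your free prize now", "Win cash today, text WIN",
    "Free entry to our weekly draw", "Urgent: your account won a reward",
    "Cheap loans, apply now", "You have been selected for a free gift",
    "Are we still meeting for lunch?", "Can you send me the report?",
    "I'll call you after the meeting", "Happy birthday, see you tonight",
    "Thanks for the notes from class", "Running late, start without me",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Hold out part of the data for testing (split parameters are assumptions).
X_train, X_test, y_train, y_test = train_test_split(
    messages, labels, test_size=0.25, random_state=3, stratify=labels
)

# Fit TF-IDF on the training text only, then reuse it for the test text.
vectorizer = TfidfVectorizer(stop_words="english", lowercase=True)
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)

model = LogisticRegression()
model.fit(X_train_tfidf, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train_tfidf))
test_acc = accuracy_score(y_test, model.predict(X_test_tfidf))
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")


def classify_message(text):
    """Return 'spam' or 'ham' for a single message (hypothetical helper)."""
    prediction = model.predict(vectorizer.transform([text]))[0]
    return "spam" if prediction == 1 else "ham"


print(classify_message("Claim your free prize now"))
```

Fitting the vectorizer on the training split only keeps test vocabulary and document frequencies from leaking into the features.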
---
## 📂 Project Structure
```
spam-mail-detection/
├── spam_detection.py # Main script
├── spam.csv # Dataset
├── Spam_mail_detection.ipynb # Jupyter Notebook
└── README.md # Project documentation
```
---
## 🤝 Contributions
Contributions are always welcome! Feel free to fork the repo, raise issues, or submit pull requests.
---