https://github.com/exido-rio/spam-classifier
https://github.com/exido-rio/spam-classifier
Last synced: 6 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/exido-rio/spam-classifier
- Owner: Exido-Rio
- Created: 2025-07-10T08:50:24.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-07-10T09:07:00.000Z (7 months ago)
- Last Synced: 2025-07-10T17:13:52.920Z (7 months ago)
- Language: Jupyter Notebook
- Size: 625 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Spam-Classifier
A concise Naive Bayes classifier to detect unsolicited or spam messages.
## Overview
- **Purpose**: Quickly classify messages as spam or ham using probabilistic methods.
- **Data**: Utilizes the SMS Spam Collection (5,574 labeled text messages).
- **Algorithm**: Naive Bayes with bag-of-words (CountVectorizer) for feature extraction.
## How It Works
1. **Data Preparation**
- Load dataset, split into training and test sets.
- Clean and tokenize text.
2. **Feature Extraction**
- Convert words and common word pairs (unigrams/bigrams) into numeric features.
3. **Model Training**
- Train Naive Bayes using training set.
- Evaluate performance on test data (accuracy, precision, recall, etc.).
## Usage
1. Clone the repo:
```bash
git clone https://github.com/Exido-Rio/Spam-Classifier.git
cd Spam-Classifier
## Files
- Tester.ipynb - Used to load and test the model
- Trainer.ipynb - Used to load dataset , clean and extract feature then train the model