https://github.com/sabin74/spam_mail_detection


gridsearchcv nltk python scikit-learn smote sms-spam-detection uci-machine-learning


# 📧 Spam Email Detection

A machine learning project that classifies SMS messages as **Spam** or **Ham** (not spam) using **Natural Language Processing (NLP)** techniques and **Scikit-learn**. The binary classifier is trained on the **UCI SMS Spam Collection Dataset**, and the project compares Naive Bayes, SVM, and Logistic Regression models tuned with GridSearchCV.

---

## 🚀 Features

- Text preprocessing and cleaning
- Feature extraction using TF-IDF with n-grams
- Handling imbalanced classes using **SMOTE**
- Hyperparameter tuning with **GridSearchCV**
- Model comparison: Naive Bayes, SVM, Logistic Regression
- Save & load trained model and vectorizer
- Predict new SMS messages

---

## 🛠️ Tools & Libraries

- Python
- Pandas, NumPy
- Scikit-learn
- NLTK (for stopword removal)
- Imbalanced-learn (for SMOTE)
- Matplotlib, Seaborn (for visualization)
- Joblib (for model persistence)
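
The sketch below shows one way Joblib could be used to save and reload a trained model together with its vectorizer, then classify a new message. The file names, the toy training messages, the `predict_message` helper, and the choice of `MultinomialNB` are assumptions for illustration; the repository may organize this differently.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy training data standing in for the SMS dataset.
texts = [
    "win a free prize now call this number",
    "urgent you have won a cash prize claim now",
    "free entry to win tickets text win now",
    "are we still meeting for lunch today",
    "i will call you when i get home",
    "can you send me the notes from class",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Fit the vectorizer and the classifier as two separate objects.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
model = MultinomialNB()
model.fit(vectorizer.fit_transform(texts), labels)

# Persist both objects, then load them back (a temp directory keeps this tidy).
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "spam_model.joblib"             # hypothetical name
    vectorizer_path = Path(tmp) / "tfidf_vectorizer.joblib"  # hypothetical name
    joblib.dump(model, model_path)
    joblib.dump(vectorizer, vectorizer_path)
    loaded_model = joblib.load(model_path)
    loaded_vectorizer = joblib.load(vectorizer_path)


def predict_message(message: str) -> str:
    """Classify one SMS using the reloaded vectorizer and model."""
    features = loaded_vectorizer.transform([message])
    return str(loaded_model.predict(features)[0])


print(predict_message("claim your free prize now"))
```

The vectorizer has to be saved alongside the model because new messages must be transformed with the same vocabulary and IDF weights the model was trained on.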