https://github.com/suvroneel/spam-email-and-sms-classifier
An end-to-end ML project that filters spam messages using a Naive Bayes classifier ✨💖
google-sheets-api machine-learning multinomial-naive-bayes naive-bayes-classifier natural-language-processing pandas python3
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/suvroneel/spam-email-and-sms-classifier
- Owner: Suvroneel
- Created: 2024-01-25T16:32:49.000Z (over 1 year ago)
- Default Branch: Version-2.1.0
- Last Pushed: 2025-03-28T06:04:42.000Z (3 months ago)
- Last Synced: 2025-04-10T00:57:48.025Z (3 months ago)
- Topics: google-sheets-api, machine-learning, multinomial-naive-bayes, naive-bayes-classifier, natural-language-processing, pandas, python3
- Language: Jupyter Notebook
- Homepage: https://spam-email-and-sms-classifier-xghzt3pj3bvd5ltzqp6rs8.streamlit.app/
- Size: 2.03 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
🌟Spam Email/SMS Classifier | End-to-End ML Pipeline:
==================================================
Tech Stack:
🐍 Python | 🤖 Scikit-learn | 📊 Pandas/NLTK | ☁️ Streamlit/Google Sheets API
Key Features:
✅ Text Preprocessing: Lowercasing, tokenization, special char removal, stemming
✅ Advanced NLP: TF-IDF vectorization + Multinomial Naïve Bayes
✅ Visual EDA: Word clouds for spam vs. ham patterns
✅ Live Deployment: Streamlit web app + Google Sheets integration for user input tracking
✅ MLOps Ready: Pickle model serialization and scikit-learn pipelines
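The features above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `preprocess` helper, the whitespace tokenization, the regular expression, and the four toy messages are all assumptions made for the example. Only the general approach (lowercasing, special-character removal, stemming, TF-IDF, Multinomial Naïve Bayes, pickle) comes from the list above.

```python
import pickle
import re

from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

_stemmer = PorterStemmer()


def preprocess(text: str) -> str:
    """Hypothetical helper: lowercase, strip special characters, tokenize, stem."""
    text = text.lower()
    # Replace anything that is not a letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Simple whitespace tokenization (an assumption; the project may tokenize differently).
    tokens = text.split()
    return " ".join(_stemmer.stem(token) for token in tokens)


# Toy, made-up messages for illustration only; not the project's dataset.
messages = [
    "WIN a FREE prize now!!! Click here to claim",
    "Free entry: win cash prize, claim now",
    "Are we still meeting for lunch tomorrow?",
    "Can you send me the notes from class?",
]
labels = ["spam", "spam", "ham", "ham"]

# TF-IDF vectorization followed by Multinomial Naive Bayes in one scikit-learn pipeline.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("nb", MultinomialNB()),
])
pipeline.fit([preprocess(m) for m in messages], labels)

# Round-trip the fitted pipeline through pickle, as one would before deployment.
restored = pickle.loads(pickle.dumps(pipeline))

prediction = restored.predict([preprocess("Claim your free prize now")])[0]
print(prediction)
```

Preprocessing is kept outside the pipeline here so that the pickled object contains only scikit-learn components. The same `preprocess` step then has to be applied to incoming text before calling `predict`.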
Business Value:
🛡️ Spam Filtering: Blocks 99% of unwanted messages
📈 Data Collection: Logs predictions for model improvement
🔮 Scalable: Pipeline adapts to new spam patterns
🚧 Future Improvements
1. Deep Learning Upgrade
🚧 CNN Integration: Implement character-level CNN models (e.g., Char-CNN) for context-aware spam detection
Compare performance against current TF-IDF + Naïve Bayes pipeline
Updates
=======================================================
📍Update 1 - User inputs and their corresponding outputs are now recorded and will be used for further training and testing ✔✔
📍Update 2 - Minor fixes. Implemented a spelling check for further checking ✔✔
📍Update 3 - The model has been trained further, and a navbar along with a breathtaking wallpaper has been added