https://github.com/04bhavyaa/sms-spam-classification-system
A Machine Learning project that identifies whether a given message is spam or not. It uses Natural Language Processing (NLP) techniques (Stemming and TF-IDF Vectorization) for text transformation and a trained Multinomial Naive Bayes Classifier for predictions.
- Host: GitHub
- URL: https://github.com/04bhavyaa/sms-spam-classification-system
- Owner: 04bhavyaa
- Created: 2024-12-26T15:30:46.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-12-26T17:20:00.000Z (10 months ago)
- Last Synced: 2025-02-28T06:34:17.030Z (7 months ago)
- Topics: bernoulli-naive-bayes, nlp-machine-learning, nltk-library, spam-classification, stemming, streamlit, tfidf-vectorizer
- Language: Jupyter Notebook
- Homepage:
- Size: 1.89 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
## SMS Spam Classification System
The Email/SMS Spam Classifier is a machine learning project that identifies whether a given message is spam or not. It uses Natural Language Processing (NLP) techniques (Stemming and TF-IDF Vectorization) for text transformation and a trained Bernoulli Naive Bayes Classifier for predictions.
## Directory Structure:
```
└── 04bhavyaa-sms-spam-classification-system/
    ├── artifacts/
    │   ├── vectorizer.pkl
    │   ├── model.pkl
    │   └── spam.csv
    ├── app.py
    ├── sms-spam-classification.ipynb
    ├── requirements.txt
    ├── nltk.txt
    └── README.md
```
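The two `.pkl` files under `artifacts/` hold the fitted vectorizer and the trained model. A minimal sketch of loading them is shown here, assuming both were saved with Python's standard `pickle` module (they may instead have been written with `joblib`). The function name `load_artifacts` is hypothetical and does not come from the repository.

```python
import pickle
from pathlib import Path


def load_artifacts(artifacts_dir: str = "artifacts"):
    """Load the pickled vectorizer and model from the artifacts directory.

    Assumption: both files were written with pickle.dump.
    """
    base = Path(artifacts_dir)
    with open(base / "vectorizer.pkl", "rb") as f:
        vectorizer = pickle.load(f)
    with open(base / "model.pkl", "rb") as f:
        model = pickle.load(f)
    return vectorizer, model
```

Only unpickle files from a source you trust, since `pickle.load` can execute arbitrary code.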
### Key Features
- Input a message through the user interface.
- Classify the message as Spam or Not Spam.
- Built with Streamlit for the web interface.
### How It Works
1. Input Transformation:
   - Converts the input message to lowercase.
   - Removes stopwords, punctuation, and non-alphanumeric characters.
   - Stems the words to their root forms using the Porter Stemmer.
2. Vectorization:
   - The transformed text is vectorized using a pre-trained TfidfVectorizer.
3. Prediction:
   - The vectorized text is fed into the pre-trained Naive Bayes model.
   - Outputs whether the message is Spam or Not Spam.
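The three steps above can be sketched end to end as follows. This is an illustrative sketch under stated assumptions, not the code in `app.py`: the names `transform_text` and `predict` are hypothetical, the small `STOPWORDS` set stands in for NLTK's English stopword list so that no corpus download is needed, the training messages are made up, and the vectorizer and model are fitted inline rather than loaded from `artifacts/`. `MultinomialNB` is used here; scikit-learn's `BernoulliNB` has the same `fit`/`predict` interface and could be swapped in.

```python
import re

from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Assumption: a tiny hand-picked stopword set so the sketch runs offline;
# the project presumably relies on NLTK's full English stopword list.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "you", "your",
             "at", "for", "we", "and", "of", "in"}

stemmer = PorterStemmer()


def transform_text(text: str) -> str:
    """Step 1: lowercase, keep alphanumeric tokens, drop stopwords, stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(stemmer.stem(t) for t in tokens if t not in STOPWORDS)


# Toy training data, made up for illustration (not taken from spam.csv).
messages = [
    "Congratulations! You have won a free prize, claim now",
    "Free entry: win cash now, text WIN to claim",
    "Are we still meeting for lunch today?",
    "Can you call me when you get home?",
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = not spam

# Step 2: fit TF-IDF on the transformed messages.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform([transform_text(m) for m in messages])

# Step 3: train the Naive Bayes classifier on the TF-IDF features.
model = MultinomialNB().fit(X_train, labels)


def predict(message: str) -> str:
    """Apply the same transformation and vectorizer, then classify."""
    features = vectorizer.transform([transform_text(message)])
    return "Spam" if model.predict(features)[0] == 1 else "Not Spam"


print(predict("Win a free prize now"))
print(predict("Are we meeting for lunch?"))
```

The same `transform_text` function must be applied at training time and at prediction time; otherwise the tokens seen by the vectorizer will not match its learned vocabulary.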