https://github.com/machinelearningprodigy/sms-spam-classiifer
This Email/SMS Spam Classifier is a machine learning-powered application built with Python and Streamlit to detect spam messages using Natural Language Processing (NLP) techniques. It preprocesses text, transforms it into numerical format, and predicts whether a message is Spam or Not Spam with a trained model. 🚀
- Host: GitHub
- URL: https://github.com/machinelearningprodigy/sms-spam-classiifer
- Owner: machinelearningprodigy
- Created: 2024-08-12T02:32:41.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-02-21T16:41:45.000Z (2 months ago)
- Last Synced: 2025-03-27T06:32:20.819Z (about 1 month ago)
- Topics: alogorithms, embeddings, machine-learning, nlp-parsing, pip, python
- Language: Python
- Homepage: https://spam-classiifer-24.streamlit.app/
- Size: 85.9 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# ✉️📱 Email/SMS Spam Classifier
A **Machine Learning-powered** Email and SMS spam classification app built using **Python** and **Streamlit**. 🚀
## 🔍 Overview
This project classifies messages as either **"Spam"** or **"Not Spam"** based on their content. It utilizes **Natural Language Processing (NLP)** techniques to preprocess text before making predictions using a pre-trained model.
## 📂 Project Structure
The project consists of the following key components:
- **🎨 Streamlit Application** – A user-friendly interface for entering and classifying messages.
- **📝 Text Preprocessing** – Cleans and processes input text using NLP techniques.
- **🤖 Machine Learning Model** – A trained model that predicts whether a message is spam or not.
- **📊 Vectorizer** – Converts text data into a numerical format (Bag of Words) for processing.
## ⚙️ How It Works
1. **📝 Input:** The user enters a message (SMS or Email) into the provided text box in the Streamlit app.
2. **🔄 Preprocessing:** The text undergoes:
   - Lowercasing
   - Tokenization
   - Removal of non-alphanumeric characters & stopwords
   - Stemming using the **Porter Stemmer**
3. **🔢 Vectorization:** The cleaned text is transformed into a numerical format using a **pre-trained CountVectorizer**.
4. **🤖 Prediction:** The vectorized text is fed into the model, which classifies it as either:
   - **📩 Spam** – The message is likely spam.
   - **✅ Not Spam** – The message is not spam.
5. **📌 Output:** The result is displayed on the Streamlit app.
## 🚀 Example Usage
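Outside the UI, the steps listed under How It Works can be approximated in a few lines of plain Python. The sketch below is not the code in `app.py`: `transform_text` and `classify` are hypothetical names, the regex tokenizer and the tiny stopword set stand in for NLTK's tokenizer and English stopword list, stemming is left as a pluggable function, and treating label `1` as spam is an assumption about how the model was trained.

```python
import re

# Stand-in stopword list for illustration only; the app is described as
# using NLTK's English stopwords instead.
TOY_STOPWORDS = frozenset(
    {"a", "an", "and", "are", "for", "is", "on", "the", "to", "we", "you"}
)


def transform_text(text, stopwords=TOY_STOPWORDS, stem=lambda word: word):
    """Lowercase, tokenize, drop stopwords, and stem each remaining token."""
    # The regex keeps only alphanumeric runs, so punctuation is dropped here.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(stem(token) for token in tokens if token not in stopwords)


def classify(message, vectorizer, model, spam_label=1, preprocess=transform_text):
    """Clean and vectorize a message, then map the model's label to a string.

    `vectorizer` and `model` are expected to behave like the unpickled
    `vectorizer.pkl` and `model.pkl`: `transform` takes a list of strings
    and `predict` returns one label per row. `spam_label=1` is an assumption.
    """
    features = vectorizer.transform([preprocess(message)])
    return "Spam" if model.predict(features)[0] == spam_label else "Not Spam"
```

With NLTK installed, passing `stem=PorterStemmer().stem` (from `nltk.stem.porter`) and `stopwords=set(stopwords.words("english"))` (from `nltk.corpus`) should bring the output closer to the app's preprocessing, although the regex tokenizer can still split some words differently. The two `.pkl` files can be loaded with `pickle.load` and handed to `classify`.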
1. **Run the Streamlit app:**
```bash
streamlit run app.py
```
2. Enter a message in the provided text area.
3. Click the **"Predict"** button to check if the message is spam or not.
### ✨ Try These Example Messages:
- ✅ `"Hey, are we still on for dinner tonight?"`
- 📩 `"Congratulations! You've won a free ticket to the Bahamas. Call now!"`
## 📦 Dependencies
To install required dependencies, run:
```bash
pip install streamlit scikit-learn nltk
```
Additionally, the NLTK data packages `punkt` and `stopwords` need to be downloaded.
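One way to fetch those packages from Python is sketched below. `ensure_nltk_data` is a hypothetical helper, not something shipped with the project; it simply calls `nltk.download` for each package name, and the downloader is injectable so the function can be exercised without network access.

```python
REQUIRED_NLTK_PACKAGES = ("punkt", "stopwords")


def ensure_nltk_data(packages=REQUIRED_NLTK_PACKAGES, download=None):
    """Download each NLTK data package and report which calls succeeded."""
    if download is None:
        # Imported lazily so the helper can be tested without NLTK installed.
        import nltk

        download = nltk.download
    return {name: bool(download(name)) for name in packages}
```

Calling `ensure_nltk_data()` once before starting the app should be enough. Newer NLTK releases may also ask for `punkt_tab`; if tokenization raises a `LookupError` that names it, adding it to the tuple is a reasonable fix.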
## 📁 Files
- **`app.py`** – The main script that runs the Streamlit app.
- **`model.pkl`** – The pre-trained machine learning model for spam classification.
- **`vectorizer.pkl`** – The pre-trained CountVectorizer for text transformation.
## 🎯 Model Training
The model was trained on a labeled dataset of SMS messages using common text classification techniques, including:
- **Text Preprocessing** – Cleaning, tokenization, and stemming.
- **Vectorization** – Converting text into a numerical format using **Bag of Words**.
- **Model Selection** – A machine learning classifier was trained and optimized for accurate predictions.
## 🎉 Conclusion
This project showcases the power of **NLP** and **machine learning** in identifying spam messages. The **Streamlit app** provides a simple interface for testing the classifier with real-world examples.
💡 **Feel free to explore, contribute, or extend this project. Happy coding!**
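For anyone extending the project, the recipe from Model Training (Bag of Words features feeding a classifier) can be illustrated without any dependencies. This is a teaching sketch under stated assumptions, not the project's training code: the README does not name the classifier, so multinomial Naive Bayes is used here only because it is a common choice for word-count features, `bag_of_words` and `TinyNaiveBayes` are hypothetical names, and the four training messages are made up for illustration rather than taken from the project's dataset. In the app itself, scikit-learn's `CountVectorizer` plays the role of `bag_of_words`.

```python
import math
from collections import Counter

# Made-up toy messages for illustration; not the project's dataset.
TOY_TEXTS = [
    "win a free ticket call now",
    "free prize call now",
    "are we still on for dinner",
    "see you at dinner tonight",
]
TOY_LABELS = ["spam", "spam", "not spam", "not spam"]


def bag_of_words(text):
    """Count whitespace-separated tokens, ignoring case and word order."""
    return Counter(text.lower().split())


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over token counts."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.token_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.token_counts[label].update(bag_of_words(text))
        self.vocabulary = set().union(*self.token_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, doc_count in self.class_counts.items():
            counts = self.token_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            for token, n in bag_of_words(text).items():
                if token in self.vocabulary:  # unseen tokens are skipped
                    score += n * math.log((counts[token] + 1) / denominator)
            scores[label] = score
        return max(scores, key=scores.get)
```

After `model = TinyNaiveBayes().fit(TOY_TEXTS, TOY_LABELS)`, calling `model.predict(...)` returns one of the two label strings. A real retraining run would swap in the cleaned SMS dataset and compare several classifiers before pickling the best one.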
## 📜 License
This project is licensed under the **MIT License** – see the `LICENSE` file for details.