Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/shanmukhsrisaivedullapalli/smsspamclassification
SMSSpamClassification is a machine learning project aimed at accurately classifying SMS messages as either spam or ham (non-spam). It employs natural language processing techniques to extract relevant features from the text data and utilizes various classification algorithms to build a robust spam detection model.
jupyter-notebook numpy pandas pickle python3 sklearn spam-classification spam-detection
- Host: GitHub
- URL: https://github.com/shanmukhsrisaivedullapalli/smsspamclassification
- Owner: shanmukhsrisaivedullapalli
- Created: 2024-07-22T05:58:37.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-07-22T06:17:26.000Z (6 months ago)
- Last Synced: 2024-11-19T12:47:01.819Z (2 months ago)
- Topics: jupyter-notebook, numpy, pandas, pickle, python3, sklearn, spam-classification, spam-detection
- Language: Jupyter Notebook
- Homepage:
- Size: 389 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
## GitHub Repository: SMSSpamClassification
**SMSSpamClassification** is a machine learning project that classifies SMS messages as spam or ham (non-spam). Using the SMSSpamCollection dataset, it applies TF-IDF vectorization, a Natural Language Processing (NLP) technique, to turn message text into numeric features, then trains a Logistic Regression model on those features.
This repository contains the code for the full workflow: data preprocessing, feature engineering with TF-IDF, model training with Logistic Regression, and model evaluation.
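To make the TF-IDF step concrete, here is a minimal standard-library sketch of the textbook weighting `tf * log(N / df)`. The helper name `tfidf` and the three sample messages are made up for illustration. scikit-learn's `TfidfVectorizer` uses a smoothed IDF and L2 normalization by default, so its numbers will differ from these.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of messages each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical messages, not taken from the dataset.
messages = [
    "win a free prize now",
    "are you free for lunch",
    "call now to win cash",
]
vectors = tfidf(messages)
```

In the first message, "prize" appears in only one of the three messages and therefore gets a higher weight than "free", which appears in two. Rare terms are the ones the classifier can use to tell messages apart.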
**Project Structure**
```
SMSSpamClassification/
├── data/
│ └── SMSSpamCollection.csv
├── models/
│ ├── feature_extraction.pkl
│ └── spam_detection_model.pkl
├── notebooks/
│ └── SMSSpamClassification.ipynb
├── requirements.txt
└── README.md
```

**Data**
* The dataset utilized for this project is the publicly accessible SMS Spam Collection dataset.
* Raw data is stored in the `data` directory.

**Notebooks**
* **SMSSpamClassification.ipynb**: Contains the entire workflow, including data exploration, preprocessing, feature extraction using TF-IDF, model training with Logistic Regression, and model evaluation.
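
The workflow described above could look roughly like the following sketch. It is not the notebook's actual code. The twelve messages are invented stand-ins for the dataset, and the 75/25 split, the fixed random seed, and the default hyperparameters are all assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Invented toy messages; the real project reads data/SMSSpamCollection.csv.
texts = [
    "win a free prize now", "free cash call now", "claim your free reward today",
    "urgent you have won a voucher", "text win to claim cash", "free entry in a prize draw",
    "see you at lunch", "are we still on for dinner", "call me when you get home",
    "thanks for the lift today", "running late be there soon", "can you send the notes",
]
labels = ["spam"] * 6 + ["ham"] * 6

# Hold out a stratified test set so both classes appear in each split.
train_texts, test_texts, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Fit the vectorizer on training text only, then reuse it for the test text.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"accuracy on held-out messages: {accuracy:.2f}")
```

Fitting the vectorizer on the training split alone keeps test-set vocabulary out of the features, so the evaluation is not flattered by leaked information.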
**Models**
* **feature_extraction.pkl**: Saved TF-IDF vectorizer for future use.
* **spam_detection_model.pkl**: Trained Logistic Regression model for spam classification.

**requirements.txt**: Lists necessary Python libraries for project execution.
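
The two pickle files in `models` can be reused without retraining. The sketch below assumes both were written with Python's `pickle` module and that the model's labels are whatever the notebook trained on, which this README does not specify. The helper names `load_artifacts` and `classify` are hypothetical.

```python
import pickle
from pathlib import Path


def load_artifacts(models_dir):
    """Load the saved TF-IDF vectorizer and classifier from models_dir."""
    models_dir = Path(models_dir)
    with open(models_dir / "feature_extraction.pkl", "rb") as f:
        vectorizer = pickle.load(f)
    with open(models_dir / "spam_detection_model.pkl", "rb") as f:
        model = pickle.load(f)
    return vectorizer, model


def classify(messages, vectorizer, model):
    """Vectorize raw message strings and return the predicted labels."""
    features = vectorizer.transform(messages)
    return list(model.predict(features))


# Example call, assuming it is run from the repository root:
# vectorizer, model = load_artifacts("models")
# print(classify(["Congratulations, you won a free prize!"], vectorizer, model))
```

Only unpickle files from a source you trust, and load them with a scikit-learn version compatible with the one used for training.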
**Installation**
1. Clone the repository:
```bash
git clone https://github.com/shanmukhsrisaivedullapalli/SMSSpamClassification.git
```
2. Create a virtual environment:
```bash
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```

**Usage**
1. Run the `SMSSpamClassification.ipynb` notebook to execute the entire project workflow.
2. The trained model and feature extractor are saved for potential future use.

**Contributing**
Contributions are welcome! You can enhance the project by:
* Implementing different NLP techniques or feature engineering methods.
* Experimenting with various classification algorithms.
* Improving model performance through hyperparameter tuning.
* Enhancing the project's documentation.
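
As a starting point for the hyperparameter tuning idea above, one option is a grid search over a TF-IDF plus Logistic Regression pipeline. This is only a sketch: the toy messages, the parameter grid, the 3-fold cross-validation, and the `f1_macro` metric are illustrative assumptions, not settings taken from the project.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Invented toy messages; swap in the real dataset for meaningful results.
texts = [
    "win a free prize now", "free cash call now", "claim your free reward today",
    "urgent you have won a voucher", "text win to claim cash", "free entry in a prize draw",
    "see you at lunch", "are we still on for dinner", "call me when you get home",
    "thanks for the lift today", "running late be there soon", "can you send the notes",
]
labels = ["spam"] * 6 + ["ham"] * 6

# Bundling both steps lets the search refit the vectorizer inside each fold.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Candidate settings, addressed as <step name>__<parameter name>.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0, 10.0],
}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="f1_macro")
search.fit(texts, labels)
best_params = search.best_params_
print(best_params)
```

Replacing the `clf` step with another scikit-learn classifier is one way to try the different algorithms mentioned in the list.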