Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/ksdkamesh99/spam-classifier

A Natural Language Processing with SMS Data to predict whether the SMS is Spam/Ham with various ML Algorithms like multinomial-naive-bayes,logistic regression,svm,decision trees to compare accuracy and using various data cleaning and processing techniques like PorterStemmer,CountVectorizer,TFIDF Vetorizer,WordnetLemmatizer. It is implemented using LSTM and Word Embeddings to gain accuracy of 97.84%.
https://github.com/ksdkamesh99/spam-classifier

bag-of-words count-vectorizer decision-tree-classifier embeddings logistic-regression lstm-neural-networks multinomial-naive-bayes naive-bayes-classifier porter-stemmer sms-spam-detection support-vector-machines tfidf-vectorizer wordnetlemmatizer

Last synced: about 2 months ago
JSON representation

A Natural Language Processing with SMS Data to predict whether the SMS is Spam/Ham with various ML Algorithms like multinomial-naive-bayes,logistic regression,svm,decision trees to compare accuracy and using various data cleaning and processing techniques like PorterStemmer,CountVectorizer,TFIDF Vetorizer,WordnetLemmatizer. It is implemented using LSTM and Word Embeddings to gain accuracy of 97.84%.

Awesome Lists containing this project

README

        

# Spam-Classifier
[![forthebadge](https://forthebadge.com/images/badges/built-with-love.svg)](https://forthebadge.com)
[![forthebadge](https://forthebadge.com/images/badges/made-with-python.svg)](https://forthebadge.com)[![forthebadge](https://forthebadge.com/images/badges/its-not-a-lie-if-you-believe-it.svg)](https://forthebadge.com)
[![forthebadge](https://forthebadge.com/images/badges/built-by-developers.svg)](https://forthebadge.com)



Logo

## πŸ“Œ Introduction:-

A Natural Language Processing with SMS Data to predict whether the SMS is Spam/Ham with various ML Algorithms like multinomial-naive-bayes,logistic regression,svm,decision trees to compare accuracy and using various data cleaning and processing techniques like PorterStemmer,CountVectorizer,TFIDF Vetorizer,WordnetLemmatizer.
It is implemented using LSTM and Word Embeddings to gain accuracy of 97.84%.

## βœ”βŒAccuracy βŒβœ”:-
| Text Preprocessing Type | Logistic Regression | Multinomial NB | Support Vector Machine | Decision Tree |
|--------------------------------------|---------------------|----------------|-------------------------|---------------|
| TFIDF Vectorizer + PorterStemmer | 96.68% | 97.30% | 98.47% | 96.68% |
| CountVectorizer + PorterStemmer | 98.65% | 98.56% | 98.74% | 97.84% |
| CountVectorizer + WordnetLemmatizer | 98.56% | 98.29% | 98.38% | 97.75% |
| TFIDF Vectorizer + WordnetLemmatizer | 96.41% | 97.48% | 98.47% | 96.86% |

## WorkFlow:-
![Workflow of SMS spam Classifer](workflow.gif)

## 🏁 Datasets Used:-
* The dataset used is SMS Spam Dataset created by UCI Machine Learning.This dataset is downloaded in kaggle.You can download it [here](https://www.kaggle.com/uciml/sms-spam-collection-dataset/download).
* Reference for this dataset can be found [here](http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/)
## πŸ“§Contact:-
For any kind of suggesstions/ help in models code Please mail me at [email protected].

## πŸ“œ LICENSE
[MIT](https://github.com/ksdkamesh99/Spam-Classifier/blob/master/LICENSE)