https://github.com/ksdkamesh99/spam-classifier
A Natural Language Processing project that predicts whether an SMS message is spam or ham. It compares the accuracy of several ML algorithms (Multinomial Naive Bayes, logistic regression, SVM, decision trees) across data cleaning and processing techniques such as PorterStemmer, CountVectorizer, TFIDF Vectorizer, and WordnetLemmatizer. An LSTM with word embeddings is also implemented, reaching an accuracy of 97.84%.
bag-of-words count-vectorizer decision-tree-classifier embeddings logistic-regression lstm-neural-networks multinomial-naive-bayes naive-bayes-classifier porter-stemmer sms-spam-detection support-vector-machines tfidf-vectorizer wordnetlemmatizer
- Host: GitHub
- URL: https://github.com/ksdkamesh99/spam-classifier
- Owner: ksdkamesh99
- License: mit
- Created: 2020-05-26T18:12:20.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-12-25T03:18:36.000Z (over 4 years ago)
- Last Synced: 2025-04-20T13:36:56.660Z (2 months ago)
- Topics: bag-of-words, count-vectorizer, decision-tree-classifier, embeddings, logistic-regression, lstm-neural-networks, multinomial-naive-bayes, naive-bayes-classifier, porter-stemmer, sms-spam-detection, support-vector-machines, tfidf-vectorizer, wordnetlemmatizer
- Language: Jupyter Notebook
- Homepage:
- Size: 510 KB
- Stars: 15
- Watchers: 2
- Forks: 11
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Spam-Classifier
## 📌 Introduction:-

A Natural Language Processing project that predicts whether an SMS message is spam or ham. It compares the accuracy of several ML algorithms (Multinomial Naive Bayes, logistic regression, SVM, decision trees) across data cleaning and processing techniques such as PorterStemmer, CountVectorizer, TFIDF Vectorizer, and WordnetLemmatizer.
An LSTM with word embeddings is also implemented, reaching an accuracy of 97.84%.

## ✔❌Accuracy ❌✔:-
| Text Preprocessing Type | Logistic Regression | Multinomial NB | Support Vector Machine | Decision Tree |
|--------------------------------------|---------------------|----------------|-------------------------|---------------|
| TFIDF Vectorizer + PorterStemmer | 96.68% | 97.30% | 98.47% | 96.68% |
| CountVectorizer + PorterStemmer | 98.65% | 98.56% | 98.74% | 97.84% |
| CountVectorizer + WordnetLemmatizer | 98.56% | 98.29% | 98.38% | 97.75% |
| TFIDF Vectorizer + WordnetLemmatizer | 96.41% | 97.48% | 98.47% | 96.86% |

## WorkFlow:-
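
As a rough illustration of what one cell of the accuracy table involves (word counts fed to Multinomial Naive Bayes), here is a from-scratch sketch using only the Python standard library. It is not the repository's notebook code: the class name `TinyMultinomialNB`, the `tokenize` helper, and the four toy messages are hypothetical, and stemming or lemmatization is left out to keep the example short.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (a stand-in for real cleaning)."""
    return re.findall(r"[a-z]+", text.lower())


class TinyMultinomialNB:
    """Multinomial naive Bayes over raw word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        # Bag-of-words counts per class: word -> number of occurrences.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            # Start from the log prior of the class.
            score = math.log(self.doc_counts[c] / total_docs)
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # skip words never seen during training
                smoothed = (self.word_counts[c][word] + 1) / (total_words + len(self.vocab))
                score += math.log(smoothed)
            scores[c] = score
        return max(scores, key=scores.get)


# Toy training data (made up for this sketch, not taken from the dataset).
train_texts = [
    "Win a free prize now",
    "Free entry, claim your cash prize",
    "Are we still meeting for lunch",
    "I will call you later tonight",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = TinyMultinomialNB().fit(train_texts, train_labels)
prediction_spam = model.predict("claim your free prize")
prediction_ham = model.predict("call me after lunch")
print(prediction_spam, prediction_ham)
```

In practice a library vectorizer and classifier would replace both helpers; the sketch only shows the counting and scoring steps that the table's "CountVectorizer" and "Multinomial NB" labels refer to.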
## 🏁 Datasets Used:-
* The dataset used is the SMS Spam Collection dataset from UCI Machine Learning, downloaded from Kaggle. You can download it [here](https://www.kaggle.com/uciml/sms-spam-collection-dataset/download).
* Reference for this dataset can be found [here](http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/)
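
A minimal sketch of reading the downloaded file into parallel lists of texts and labels, using only the standard library. The column names `v1` (label) and `v2` (message text), the file name `spam.csv`, and the `latin-1` encoding are assumptions about the Kaggle download, so check them against your copy. The two sample rows are made up, and `load_messages` is a hypothetical helper.

```python
import csv
import io

# Two made-up rows in the layout the Kaggle CSV is assumed to use.
sample_csv = 'v1,v2\nham,"Ok, see you at 5"\nspam,Free prize! Reply WIN to claim\n'


def load_messages(handle):
    """Return (texts, labels) with labels encoded as 1 for spam and 0 for ham."""
    texts, labels = [], []
    for row in csv.DictReader(handle):
        texts.append(row["v2"])
        labels.append(1 if row["v1"] == "spam" else 0)
    return texts, labels


texts, labels = load_messages(io.StringIO(sample_csv))
print(texts, labels)

# With the real file (name and encoding assumed):
# with open("spam.csv", encoding="latin-1", newline="") as f:
#     texts, labels = load_messages(f)
```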
## 📧Contact:-
For suggestions or help with the model code, please mail me at [email protected].

## 📜 LICENSE
[MIT](https://github.com/ksdkamesh99/Spam-Classifier/blob/master/LICENSE)