Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ksdkamesh99/spam-classifier
An NLP project that predicts whether an SMS message is spam or ham. It compares the accuracy of several ML algorithms (multinomial naive Bayes, logistic regression, SVM, and decision trees) under different data cleaning and processing techniques: PorterStemmer, CountVectorizer, TfidfVectorizer, and WordNetLemmatizer. An LSTM model with word embeddings is also implemented and reaches an accuracy of 97.84%.
bag-of-words count-vectorizer decision-tree-classifier embeddings logistic-regression lstm-neural-networks multinomial-naive-bayes naive-bayes-classifier porter-stemmer sms-spam-detection support-vector-machines tfidf-vectorizer wordnetlemmatizer
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/ksdkamesh99/spam-classifier
- Owner: ksdkamesh99
- License: mit
- Created: 2020-05-26T18:12:20.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2020-12-25T03:18:36.000Z (about 4 years ago)
- Last Synced: 2024-11-02T12:33:58.432Z (2 months ago)
- Topics: bag-of-words, count-vectorizer, decision-tree-classifier, embeddings, logistic-regression, lstm-neural-networks, multinomial-naive-bayes, naive-bayes-classifier, porter-stemmer, sms-spam-detection, support-vector-machines, tfidf-vectorizer, wordnetlemmatizer
- Language: Jupyter Notebook
- Homepage:
- Size: 510 KB
- Stars: 15
- Watchers: 2
- Forks: 9
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Spam-Classifier
[![forthebadge](https://forthebadge.com/images/badges/built-with-love.svg)](https://forthebadge.com)
[![forthebadge](https://forthebadge.com/images/badges/made-with-python.svg)](https://forthebadge.com)[![forthebadge](https://forthebadge.com/images/badges/its-not-a-lie-if-you-believe-it.svg)](https://forthebadge.com)
[![forthebadge](https://forthebadge.com/images/badges/built-by-developers.svg)](https://forthebadge.com)

## Introduction:-
This is a natural language processing project on SMS data that predicts whether a message is spam or ham. It compares the accuracy of several ML algorithms (multinomial naive Bayes, logistic regression, SVM, and decision trees) under different data cleaning and processing techniques: PorterStemmer, CountVectorizer, TfidfVectorizer, and WordNetLemmatizer.
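The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn and NLTK are installed, uses `LinearSVC` as a stand-in for the SVM (the README does not say which SVM variant is used), and the helper names `clean` and `compare` as well as the eight sample messages are hypothetical.

```python
import re

from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

stemmer = PorterStemmer()


def clean(text):
    """Keep letters only, lowercase, and stem each token (hypothetical helper)."""
    tokens = re.sub(r"[^a-zA-Z]", " ", text).lower().split()
    return " ".join(stemmer.stem(token) for token in tokens)


def compare(messages, labels):
    """Fit each classifier on bag-of-words counts; return test accuracy per model."""
    cleaned = [clean(m) for m in messages]
    x_train, x_test, y_train, y_test = train_test_split(
        cleaned, labels, test_size=0.25, stratify=labels, random_state=0
    )
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Multinomial NB": MultinomialNB(),
        "Support Vector Machine": LinearSVC(),  # assumption: linear SVM
        "Decision Tree": DecisionTreeClassifier(random_state=0),
    }
    scores = {}
    for name, model in models.items():
        # Vectorizer and classifier are fitted together on the training split only.
        pipeline = make_pipeline(CountVectorizer(), model)
        pipeline.fit(x_train, y_train)
        scores[name] = accuracy_score(y_test, pipeline.predict(x_test))
    return scores


# Made-up messages for illustration only; not taken from the real dataset.
messages = [
    "WINNER! Claim your free prize now",
    "Free entry in a weekly cash draw, text WIN",
    "Urgent! Your account won a cash reward, call now",
    "Claim free ringtones, reply YES to win",
    "Are we still meeting for lunch today?",
    "I will call you when I get home",
    "Can you send me the notes from class?",
    "See you at the station at five",
]
labels = ["spam"] * 4 + ["ham"] * 4
scores = compare(messages, labels)
```

Swapping `CountVectorizer()` for `TfidfVectorizer()` gives the TF-IDF variant of the same comparison. Scores on a toy corpus like this one are not meaningful; the split and scoring only show the mechanics.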
An LSTM model with word embeddings is also implemented and reaches an accuracy of 97.84%.

## Accuracy:-
| Text Preprocessing Type | Logistic Regression | Multinomial NB | Support Vector Machine | Decision Tree |
|--------------------------------------|---------------------|----------------|-------------------------|---------------|
| TFIDF Vectorizer + PorterStemmer | 96.68% | 97.30% | 98.47% | 96.68% |
| CountVectorizer + PorterStemmer | 98.65% | 98.56% | 98.74% | 97.84% |
| CountVectorizer + WordnetLemmatizer | 98.56% | 98.29% | 98.38% | 97.75% |
| TFIDF Vectorizer + WordnetLemmatizer | 96.41%              | 97.48%         | 98.47%                  | 96.86%        |

## Workflow:-
![Workflow of the SMS spam classifier](workflow.gif)

## Datasets Used:-
* The dataset used is the SMS Spam Collection dataset published by UCI Machine Learning on Kaggle. You can download it [here](https://www.kaggle.com/uciml/sms-spam-collection-dataset/download).
* A reference for this dataset can be found [here](http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/).
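
Reading the downloaded file could look something like the sketch below. It assumes the Kaggle download is a CSV with a header row, the label in the first column and the message text in the second; the file name `spam.csv` and the `latin-1` encoding in the usage note are likewise assumptions to check against your copy. `read_sms` is a hypothetical helper, and the two sample rows are made up.

```python
import csv
import io


def read_sms(handle):
    """Return (label, text) pairs from a CSV file object with a header row."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    # Ignore any trailing empty columns; keep only label and message text.
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


# In-memory stand-in for the real file, with made-up rows.
sample = io.StringIO(
    "v1,v2,,,\n"
    'ham,"Ok, see you at 5",,,\n'
    'spam,"WINNER! Claim your prize now",,,\n'
)
rows = read_sms(sample)

# Share of spam messages, useful for checking class balance before training.
spam_share = sum(label == "spam" for label, _ in rows) / len(rows)
```

With the real file, the same function would be called as `read_sms(open("spam.csv", encoding="latin-1", newline=""))`, adjusting the name and encoding if your download differs.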
## Contact:-
For suggestions or help with the model code, please mail me at [email protected].

## LICENSE
[MIT](https://github.com/ksdkamesh99/Spam-Classifier/blob/master/LICENSE)