https://github.com/gehad-ahmed30/natural-language-processing

This repository showcases a collection of practical NLP projects, ranging from sentiment analysis to spam detection. The implementations leverage both Machine Learning (ML) and Deep Learning (DL) approaches to explore various natural language processing tasks and techniques.

deep-learning lstm machine-learning naive-bayes nlp nltk preprocessing stopwords tokenization




# Restaurant Reviews – Sentiment Analysis (NLP Case Study)

This project performs sentiment analysis on restaurant reviews using Natural Language Processing (NLP) and machine learning. The goal is to classify each review as **positive (Liked = 1)** or **negative (Liked = 0)**.

---

## 🧠 Project Workflow

1. **Importing Data & Libraries**
2. **Text Preprocessing (NLTK):**
   - Lowercasing
   - Removing punctuation
   - Removing stopwords
   - Stemming (Porter Stemmer)
3. **Vectorization:**
   - Using `CountVectorizer`
4. **Model Building:**
   - Multinomial Naive Bayes Classifier
5. **Model Evaluation:**
   - Accuracy, Confusion Matrix, Classification Report
6. **Model Saving:**
   - Exported using `joblib`
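
The workflow above can be sketched end to end as follows. This is a minimal illustration, not the repository's actual code: the toy reviews, the `preprocess` helper, the fallback stopword set, and the output file name `sentiment_nb.joblib` are all assumptions made for the example. Metrics are computed on the toy training data only to show the calls; a real run would evaluate on a held-out split.

```python
import os
import string
import tempfile

import joblib
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.naive_bayes import MultinomialNB

try:
    STOPWORDS = set(stopwords.words("english"))
except LookupError:
    # Assumption: tiny stand-in list used only when the NLTK stopwords
    # corpus has not been downloaded (nltk.download("stopwords")).
    STOPWORDS = {"the", "a", "an", "is", "was", "it", "this", "and", "to", "of"}

STEMMER = PorterStemmer()
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> str:
    """Lowercase, strip punctuation, drop stopwords, then Porter-stem each token."""
    text = text.lower().translate(PUNCT_TABLE)
    tokens = [STEMMER.stem(tok) for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)


# Invented toy data standing in for the Review / Liked columns.
reviews = [
    "Loved the food, great service!",
    "Terrible food and rude staff.",
    "Great place, amazing taste",
    "Awful experience, terrible taste",
]
labels = [1, 0, 1, 0]

# Vectorization: bag-of-words counts over the cleaned text.
cleaned = [preprocess(r) for r in reviews]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(cleaned)

# Model building: Multinomial Naive Bayes on the count features.
model = MultinomialNB().fit(X, labels)

# Model evaluation (on the toy training data, for illustration only).
predictions = model.predict(X)
accuracy = accuracy_score(labels, predictions)
print("Accuracy:", accuracy)
print(confusion_matrix(labels, predictions))
print(classification_report(labels, predictions))

# Model saving: persist the vectorizer together with the classifier,
# since new text must be transformed with the same vocabulary.
model_path = os.path.join(tempfile.gettempdir(), "sentiment_nb.joblib")  # hypothetical name
joblib.dump({"vectorizer": vectorizer, "model": model}, model_path)

restored = joblib.load(model_path)
new_features = restored["vectorizer"].transform([preprocess("great food")])
new_pred = restored["model"].predict(new_features)
```

Saving the vectorizer alongside the model is a design choice of this sketch; the source only states that the model is exported with `joblib`.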

---

## 📊 Dataset

- Source: [Kaggle – Restaurant Reviews (TSV)](https://www.kaggle.com/datasets/maher3id/restaurant-reviewstsv)
- Format: `.tsv` file
- Records: 1000 reviews
- Columns:
  - `Review` (text)
  - `Liked` (binary label)
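
A tab-separated file with these two columns could be loaded roughly as shown below. The rows here are invented stand-ins read from an in-memory string so the snippet runs without the download; with the real file you would pass its path instead. Using `csv.QUOTE_NONE` is an assumption of this sketch, chosen because free-text reviews often contain quote characters.

```python
import csv
import io

import pandas as pd

# Invented sample in the same Review / Liked layout as the dataset.
sample_tsv = (
    "Review\tLiked\n"
    "The pasta was excellent.\t1\n"
    "Service was slow and the soup was cold.\t0\n"
)

# sep="\t" selects tab-separated parsing; QUOTE_NONE treats quotes as plain text.
df = pd.read_csv(io.StringIO(sample_tsv), sep="\t", quoting=csv.QUOTE_NONE)

print(df.shape)
print(df["Liked"].value_counts())
```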

---
# 📧 Spam Detection – Deep Learning (NLP Case Study)

This project detects whether a given SMS message is **Spam** or **Not Spam (Ham)** using Deep Learning and NLP.
It applies preprocessing, tokenization, and an LSTM-based model to classify messages.
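
As a rough idea of what an LSTM-based binary classifier looks like in Keras, consider the sketch below. The source does not give the architecture, so the layer sizes, `VOCAB_SIZE`, and `MAX_LEN` are assumptions, and the random integer batch only stands in for tokenized, padded messages.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 5000  # assumption: number of distinct token ids (0 reserved for padding)
MAX_LEN = 50       # assumption: padded sequence length

model = tf.keras.Sequential([
    # Maps each token id to a dense vector.
    tf.keras.layers.Embedding(input_dim=VOCAB_SIZE, output_dim=32),
    # Reads the sequence and returns its final hidden state.
    tf.keras.layers.LSTM(64),
    # Single sigmoid unit: estimated probability that the message is spam.
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Random token ids standing in for a batch of 4 padded messages.
dummy_batch = np.random.randint(1, VOCAB_SIZE, size=(4, MAX_LEN))
probs = model.predict(dummy_batch, verbose=0)
print(probs.shape)
```

The model is untrained here, so the probabilities are meaningless; training would call `model.fit` on padded sequences and 0/1 labels.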

---

## 🧠 Project Workflow

### 1. 📥 Importing Data & Libraries
- Pandas, NumPy
- NLTK for text preprocessing
- TensorFlow / Keras for DL model
- Matplotlib / Seaborn for visualization

---

### 2. 🧹 Data Cleaning & Preprocessing
Steps applied to the text:
- Lowercasing
- Removing punctuation & numbers
- Removing stopwords
- Tokenization
- Padding sequences
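
To make these steps concrete, here is a dependency-free sketch of cleaning, integer tokenization, and padding. It is not the repository's code, which presumably relies on NLTK and Keras utilities; the helper names, the small stopword set, and the sample messages are all invented. The padding helper pads and truncates at the front, which matches the default behavior of Keras `pad_sequences`.

```python
import re
from collections import Counter

# Assumption: tiny stand-in stopword list; the project uses NLTK's list.
STOPWORDS = {"a", "an", "the", "to", "is", "you", "for", "your", "and", "of"}


def clean(text):
    """Lowercase, replace anything that is not a letter or whitespace, drop stopwords."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOPWORDS]


def build_vocab(token_lists):
    """Assign ids by descending frequency, starting at 1 (0 is kept for padding)."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    return {word: idx for idx, (word, _) in enumerate(counts.most_common(), start=1)}


def to_sequences(token_lists, vocab):
    """Replace each known token with its integer id."""
    return [[vocab[tok] for tok in tokens if tok in vocab] for tokens in token_lists]


def pad(sequences, maxlen, value=0):
    """Left-pad short sequences and keep the last `maxlen` ids of long ones."""
    padded = []
    for seq in sequences:
        seq = seq[-maxlen:]
        padded.append([value] * (maxlen - len(seq)) + seq)
    return padded


# Invented messages for illustration.
messages = [
    "WINNER!! You have won a 1000 cash prize, call now",
    "Are you coming to lunch today?",
]

tokens = [clean(m) for m in messages]
vocab = build_vocab(tokens)
sequences = to_sequences(tokens, vocab)
padded = pad(sequences, maxlen=6)
print(padded)
```

Every row of `padded` has the same length, which is what lets the messages be stacked into one array and fed to a sequence model.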

---

### 3. 🗃️ Dataset Info
- **Source**: [Kaggle – SMS Spam Collection Dataset](https://www.kaggle.com/datasets/uciml/sms-spam-collection-dataset)
- **Size**: 5572 messages
- **Classes**:
  - `ham` → Not Spam
  - `spam` → Spam
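
Before training, the two class names are typically converted to numeric targets. A minimal sketch, assuming hypothetical column names `label` and `message` and invented rows:

```python
import pandas as pd

# Invented rows; the column names are assumptions for this example.
df = pd.DataFrame(
    {
        "label": ["ham", "spam", "ham"],
        "message": [
            "See you at the station at six",
            "Claim your free prize now",
            "Running late, start without me",
        ],
    }
)

# Encode the classes as integers: ham -> 0 (Not Spam), spam -> 1 (Spam).
df["target"] = df["label"].map({"ham": 0, "spam": 1})

# Class balance matters when interpreting accuracy, so it is worth inspecting.
print(df["target"].value_counts())
```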

---