https://github.com/gurpreet0022/nlp_exploration

# NLP Techniques and Sentiment Analysis with NLTK

This notebook explores various Natural Language Processing (NLP) techniques using the NLTK library in Python. It demonstrates these techniques on a sample dataset and performs sentiment analysis on movie reviews.

## Tasks Performed

### NLP Techniques

1. **Data Loading and Preprocessing:** Loads a sample dataset for demonstrating NLP techniques.
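2. **Tokenization:** Splits text into individual tokens (words and punctuation marks).
3. **Stop Word Removal:** Removes common words, such as "the" and "is", that carry little meaning for the analysis.
4. **Stemming:** Reduces words to a root form by stripping suffixes; the result is not always a dictionary word.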
5. **Part-of-Speech Tagging:** Assigns grammatical tags to words.
6. **Named Entity Recognition (NER):** Identifies named entities in the text.
7. **Lemmatization:** Like stemming, but maps each word to its dictionary form (lemma).
8. **Corpora Exploration:** Accesses and explores text collections from NLTK's corpora.
9. **WordNet Exploration:** Analyzes word relationships using WordNet.
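
The tokenization, stop word removal, and stemming steps listed above might look roughly like the sketch below. It is not taken from the notebook: the sample sentence and the small `STOP_WORDS` set are assumptions chosen so the example runs without downloading any NLTK data, whereas the notebook presumably relies on NLTK's own stop word corpus. `wordpunct_tokenize` and `PorterStemmer` are standard NLTK components, but the notebook may use different ones.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

# Hypothetical stand-in for NLTK's English stop word list, which would
# otherwise require nltk.download("stopwords").
STOP_WORDS = {"the", "are", "in", "a", "an", "is", "of", "and"}


def preprocess(text):
    """Tokenize, drop punctuation and stop words, then stem what is left."""
    tokens = wordpunct_tokenize(text)
    content = [t for t in tokens if t.isalpha() and t.lower() not in STOP_WORDS]
    stemmer = PorterStemmer()
    stems = [stemmer.stem(t) for t in content]
    return tokens, content, stems


tokens, content, stems = preprocess("The cats are running in the garden.")
print(tokens)   # raw tokens, punctuation included
print(content)  # tokens kept after filtering
print(stems)    # stemmed forms of the kept tokens
```

Here the stemmer turns `cats` into `cat` and `running` into `run`. A lemmatizer would need the WordNet data and, for verbs, a part-of-speech hint to produce comparable results.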

### Sentiment Analysis

1. **Movie Reviews Dataset:** Uses the NLTK movie reviews dataset for sentiment analysis.
2. **Feature Extraction:** Extracts relevant features (words) from the reviews.
3. **Naive Bayes Classifier:** Trains a Naive Bayes classifier to predict sentiment (positive/negative).
4. **Accuracy Evaluation:** Evaluates the classifier's accuracy on a test set.
5. **Saving the Classifier:** Saves the trained classifier using Pickle for later use.
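
As a rough illustration of the workflow above, the sketch below extracts word-presence features, trains NLTK's `NaiveBayesClassifier`, measures accuracy, and round-trips the model through Pickle. The toy reviews, the `bag_of_words` helper, and the file name `naive_bayes.pickle` are all assumptions made for the example; the notebook itself works with the NLTK movie reviews dataset and may extract features differently.

```python
import os
import pickle
import tempfile

from nltk.classify import NaiveBayesClassifier, accuracy


def bag_of_words(text):
    """Represent a review as a dict of word-presence features."""
    return {word: True for word in text.lower().split()}


# Toy reviews invented for illustration; not the NLTK movie reviews corpus.
train_reviews = [
    ("a great and wonderful film", "pos"),
    ("great acting wonderful story", "pos"),
    ("a terrible and boring film", "neg"),
    ("boring plot terrible acting", "neg"),
]
test_reviews = [
    ("wonderful great movie", "pos"),
    ("terrible boring movie", "neg"),
]

train_set = [(bag_of_words(text), label) for text, label in train_reviews]
test_set = [(bag_of_words(text), label) for text, label in test_reviews]

# Train on the labeled feature sets and score on the held-out ones.
classifier = NaiveBayesClassifier.train(train_set)
test_accuracy = accuracy(classifier, test_set)
print("accuracy:", test_accuracy)

# Save the classifier and load it back; the file name is hypothetical.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "naive_bayes.pickle")
    with open(path, "wb") as f:
        pickle.dump(classifier, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

prediction = restored.classify(bag_of_words("a wonderful story"))
print("prediction:", prediction)
```

On a real corpus the train/test split should be shuffled first, and `classifier.show_most_informative_features()` is a quick way to see which words drive the predictions.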

## Libraries Used

- NLTK
- Pandas
- NumPy
- Pickle (the `pickle` module from the Python standard library)

## Datasets

- A sample dataset for demonstrating NLP techniques.
- The NLTK movie reviews dataset for sentiment analysis.

## Usage

1. Install the required third-party libraries (NLTK, Pandas, NumPy); Pickle ships with Python.
2. If you run the notebook in Google Colab, upload the sample dataset to the Colab environment.
3. Run the notebook cells sequentially.
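
Several of the tasks above depend on NLTK data packages that are downloaded separately from the library. The helper below is a minimal sketch of fetching them up front. The README does not name the packages, so the identifiers in `RESOURCES` are assumptions inferred from the task list, and newer NLTK releases may ask for differently named packages (for example `punkt_tab`). The `download_resources` function is hypothetical and not part of the notebook.

```python
import nltk

# Assumed resource identifiers, inferred from the tasks listed in this README.
RESOURCES = [
    "punkt",                       # tokenization
    "stopwords",                   # stop word removal
    "averaged_perceptron_tagger",  # part-of-speech tagging
    "maxent_ne_chunker",           # named entity recognition
    "words",                       # word list used by the NE chunker
    "wordnet",                     # lemmatization and WordNet exploration
    "movie_reviews",               # sentiment analysis dataset
]


def download_resources(resources=RESOURCES):
    """Download each resource and return the names that failed."""
    return [name for name in resources if not nltk.download(name, quiet=True)]
```

Calling `download_resources()` once in the first notebook cell, and checking that it returns an empty list, avoids a `LookupError` partway through a later cell.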