Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/nishant2018/disneyland-reviews-nlp-sentiment-analysis-

Sentiment analysis, also known as opinion mining, is a process that involves analyzing text to determine the sentiment expressed, such as positive, negative, or neutral.
https://github.com/nishant2018/disneyland-reviews-nlp-sentiment-analysis-

nlp nlp-library nlp-machine-learning sentiment-analysis

Last synced: 3 days ago
JSON representation

Sentiment analysis, also known as opinion mining, is a process that involves analyzing text to determine the sentiment expressed, such as positive, negative, or neutral.

Host: GitHub
URL: https://github.com/nishant2018/disneyland-reviews-nlp-sentiment-analysis-
Owner: Nishant2018
Created: 2024-06-13T06:51:28.000Z (5 months ago)
Default Branch: main
Last Pushed: 2024-06-13T06:55:36.000Z (5 months ago)
Last Synced: 2024-06-13T09:57:35.637Z (5 months ago)
Topics: nlp, nlp-library, nlp-machine-learning, sentiment-analysis
Language: Jupyter Notebook
Homepage:
Size: 1010 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

        # Sentiment Analysis

### Introduction

Sentiment analysis, also known as opinion mining, is the process of determining the sentiment or emotion expressed in a piece of text. It is widely used to analyze customer feedback, social media posts, reviews, and more to gauge public opinion.

### Key Concepts

### Text Preprocessing

Text preprocessing involves cleaning and preparing text data for analysis. Common steps include:

- **Tokenization**: Splitting text into individual words or tokens.

- **Lowercasing**: Converting all text to lowercase.

- **Removing Punctuation and Stopwords**: Filtering out non-essential words and punctuation.

- **Lemmatization/Stemming**: Reducing words to their base or root form.

### Tokenization Example

```python

from nltk.tokenize import word_tokenize

text = "I love this product! It's amazing."

tokens = word_tokenize(text.lower())

print(tokens)

```

### Removing Stopwords Example

```python

from nltk.corpus import stopwords

tokens = ["i", "love", "this", "product", "it's", "amazing"]

stop_words = set(stopwords.words('english'))

filtered_tokens = [word for word in tokens if word not in stop_words]

print(filtered_tokens)

```

### Feature Extraction

Converting text into numerical representations that can be used by machine learning algorithms. Common techniques include:

- **Bag of Words (BoW)**: Representing text as a set of word frequencies.

- **TF-IDF**: Weighing words by their importance in the document and corpus.

#### TF-IDF Example

```python

from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ["I love this product.", "I hate this product.", "This product is okay."]

vectorizer = TfidfVectorizer()

X = vectorizer.fit_transform(corpus)

print(X.toarray())

```

### Model Training

Train a machine learning model on preprocessed and vectorized text data. Popular algorithms include:

- Support Vector Machines (SVM)

- Naive Bayes

- Logistic Regression

- Neural Networks

### Naive Bayes Example

```python

from sklearn.model_selection import train_test_split

from sklearn.naive_bayes import MultinomialNB

from sklearn.metrics import accuracy_score

# Example corpus and labels

corpus = ["I love this product.", "I hate this product.", "This product is okay."]

labels = ['positive', 'negative', 'neutral']

# TF-IDF Vectorization

vectorizer = TfidfVectorizer()

X = vectorizer.fit_transform(corpus)

# Train-test split

X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.25, random_state=42)

# Training Naive Bayes model

model = MultinomialNB()

model.fit(X_train, y_train)

# Predictions

y_pred = model.predict(X_test)

# Evaluation

accuracy = accuracy_score(y_test, y_pred)

print(f'Accuracy: {accuracy}')

```

## Applications

- **Customer Feedback Analysis**: Understanding customer opinions and sentiments from reviews and feedback.

- **Social Media Monitoring**: Analyzing public sentiment on social media platforms.

- **Market Research**: Gauging consumer sentiment towards products or services.

- **Brand Monitoring**: Tracking brand reputation and public perception.

## Conclusion

Sentiment analysis using NLP and machine learning is a powerful tool for extracting insights from textual data. By leveraging these techniques, it is possible to build effective models that can accurately classify sentiments, providing valuable information for various applications.

This Markdown content provides a clear and concise overview of sentiment analysis, including key concepts, preprocessing steps, feature extraction techniques, model training with examples, and applications.