Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nishant2018/disneyland-reviews-nlp-sentiment-analysis-
Sentiment analysis, also known as opinion mining, is a process that involves analyzing text to determine the sentiment expressed, such as positive, negative, or neutral.
- Host: GitHub
- URL: https://github.com/nishant2018/disneyland-reviews-nlp-sentiment-analysis-
- Owner: Nishant2018
- Created: 2024-06-13T06:51:28.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-06-13T06:55:36.000Z (5 months ago)
- Last Synced: 2024-06-13T09:57:35.637Z (5 months ago)
- Topics: nlp, nlp-library, nlp-machine-learning, sentiment-analysis
- Language: Jupyter Notebook
- Homepage:
- Size: 1010 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Sentiment Analysis
### Introduction
Sentiment analysis, also known as opinion mining, is the process of determining the sentiment or emotion expressed in a piece of text. It is widely used to analyze customer feedback, social media posts, reviews, and more to gauge public opinion.
### Key Concepts
### Text Preprocessing
Text preprocessing involves cleaning and preparing text data for analysis. Common steps include:
- **Tokenization**: Splitting text into individual words or tokens.
- **Lowercasing**: Converting all text to lowercase.
- **Removing Punctuation and Stopwords**: Filtering out non-essential words and punctuation.
- **Lemmatization/Stemming**: Reducing words to their base or root form.

### Tokenization Example
```python
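# Requires the NLTK tokenizer data; download it once with nltk.download('punkt')
# (recent NLTK releases name this resource 'punkt_tab').
from nltk.tokenize import word_tokenize

text = "I love this product! It's amazing."
tokens = word_tokenize(text.lower())  # lowercase first, then split into tokens
print(tokens)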
```

### Removing Stopwords Example
```python
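# Requires the NLTK stopword list; download it once with nltk.download('stopwords').
from nltk.corpus import stopwords

tokens = ["i", "love", "this", "product", "it's", "amazing"]
stop_words = set(stopwords.words('english'))  # a set gives fast membership tests
# Keep only the tokens that are not stopwords
filtered_tokens = [word for word in tokens if word not in stop_words]
print(filtered_tokens)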
```

### Feature Extraction
Feature extraction converts text into numerical representations that machine learning algorithms can use. Common techniques include:
- **Bag of Words (BoW)**: Representing each document as a vector of word counts, ignoring word order.
- **TF-IDF**: Weighting each word by how often it appears in a document, discounted by how common it is across the corpus.

#### TF-IDF Example
```python
from sklearn.feature_extraction.text import TfidfVectorizer
corpus = ["I love this product.", "I hate this product.", "This product is okay."]
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)
print(X.toarray())
```

### Model Training
Train a machine learning model on preprocessed and vectorized text data. Popular algorithms include:
- Support Vector Machines (SVM)
- Naive Bayes
- Logistic Regression
- Neural Networks

### Naive Bayes Example
```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Example corpus and labels
corpus = ["I love this product.", "I hate this product.", "This product is okay."]
labels = ['positive', 'negative', 'neutral']

# TF-IDF vectorization
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.25, random_state=42)

# Train the Naive Bayes model
model = MultinomialNB()
model.fit(X_train, y_train)

# Predictions
y_pred = model.predict(X_test)
# Evaluation (with only three samples the score is not meaningful; use a larger labeled dataset in practice)
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```

## Applications
- **Customer Feedback Analysis**: Understanding customer opinions and sentiments from reviews and feedback.
- **Social Media Monitoring**: Analyzing public sentiment on social media platforms.
- **Market Research**: Gauging consumer sentiment towards products or services.
- **Brand Monitoring**: Tracking brand reputation and public perception.

## Conclusion
Sentiment analysis using NLP and machine learning is a practical way to extract insights from textual data. Combining text preprocessing, feature extraction such as TF-IDF, and a classifier such as Naive Bayes makes it possible to build models that label text as positive, negative, or neutral, which supports applications such as customer feedback analysis and brand monitoring.
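
The steps covered in this README (preprocessing, TF-IDF feature extraction, and model training) can be chained into a single scikit-learn pipeline. The following is a minimal sketch under stated assumptions, not the code from the repository's notebook: the review texts, labels, and variable names are hypothetical and invented purely for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy reviews and labels, invented for illustration only.
train_texts = [
    "I love this park, the rides were amazing",
    "Wonderful day, friendly staff and great food",
    "Amazing fireworks, we loved every minute",
    "Terrible queues and rude staff",
    "I hate the crowds, awful experience",
    "Awful food and terrible service",
]
train_labels = ["positive", "positive", "positive",
                "negative", "negative", "negative"]

# TfidfVectorizer lowercases and tokenizes the text by default;
# stop_words="english" additionally drops common English stopwords.
sentiment_model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    MultinomialNB(),
)
sentiment_model.fit(train_texts, train_labels)

# Classify reviews the model has not seen during training.
new_reviews = ["Amazing rides, wonderful day", "Terrible service, awful queues"]
predictions = sentiment_model.predict(new_reviews)
for review, label in zip(new_reviews, predictions):
    print(f"{label}: {review}")
```

Wrapping the vectorizer and classifier in one pipeline means raw strings can be passed to both `fit` and `predict`, so the same preprocessing is applied at training and prediction time. A real project would train on a much larger labeled dataset and evaluate on a held-out split.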