Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tanyakuznetsova/amazon-handmade-reviews-23-sentiment-and-ner
Comparison of AWS Comprehend and SpaCy on a subset of the Amazon Handmade reviews for sentiment analysis and NER
- Host: GitHub
- URL: https://github.com/tanyakuznetsova/amazon-handmade-reviews-23-sentiment-and-ner
- Owner: tanyakuznetsova
- Created: 2024-05-30T13:23:51.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-05-30T17:31:48.000Z (8 months ago)
- Last Synced: 2024-12-18T18:12:59.793Z (about 2 months ago)
- Topics: amazon-api, amazon-reviews, amazon-reviews-sentiment-analysis, aws-boto3, aws-comprehend, aws-comprehend-nlp, named-entity-recognition, natural-language-processing, ner, sentiment-analysis, spacy, spacy-nlp, spacy-nlp-ner
- Language: Jupyter Notebook
- Homepage:
- Size: 1.04 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Amazon-Handmade-Reviews-23-Sentiment-and-NER
## Overview
This project compares the performance of two NLP systems, AWS Comprehend and SpaCy, on a subset of the Amazon Handmade reviews dataset. The tasks include sentiment analysis and named entity recognition (NER).
## Data
The dataset used is a subset of the Amazon Reviews 2023 dataset collected by Professor Julian McAuley and his team at UCSD, containing 664,162 reviews of Amazon Handmade items.
- Source: [Amazon Reviews 2023](https://amazon-reviews-2023.github.io/)
- Subset: 664,162 reviews of Amazon Handmade items
## NLP Tools
- **AWS Comprehend**: Amazon's proprietary NLP service
- **SpaCy**: Open-source NLP library
  - eng_spacysentiment: SpaCy-based sentiment analysis extension
## Key Findings
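To make the comparison concrete, here is a minimal sketch of how a single review might be scored by both tools and the two outputs put on a common label scale. It is an illustration, not the notebook's actual code. The helper names (`aws_sentiment`, `spacy_sentiment`, `top_label`, `agree`) are hypothetical. It assumes a boto3 Comprehend client whose `detect_sentiment` response carries a `Sentiment` field, and a loaded SpaCy pipeline whose `doc.cats` holds per-label sentiment scores (assumed here for eng_spacysentiment).

```python
from typing import Any, Callable, Dict


def aws_sentiment(client: Any, text: str) -> str:
    """Ask AWS Comprehend for the overall sentiment of one review.

    `client` is expected to be a boto3 Comprehend client, e.g.
    boto3.client("comprehend"); AWS credentials are required for real calls.
    """
    response = client.detect_sentiment(Text=text, LanguageCode="en")
    return response["Sentiment"]  # POSITIVE, NEGATIVE, NEUTRAL or MIXED


def top_label(cats: Dict[str, float]) -> str:
    """Return the highest-scoring category, upper-cased to match AWS labels."""
    if not cats:
        raise ValueError("no category scores to choose from")
    return max(cats, key=cats.get).upper()


def spacy_sentiment(nlp: Callable[[str], Any], text: str) -> str:
    """Score one review with a loaded SpaCy pipeline.

    Assumption: the pipeline stores sentiment scores in doc.cats,
    keyed by label name (e.g. "positive", "negative", "neutral").
    """
    return top_label(nlp(text).cats)


def agree(aws_label: str, spacy_label: str) -> bool:
    """True when both tools assign the same sentiment label."""
    return aws_label.upper() == spacy_label.upper()
```

Passing the client and pipeline in as arguments keeps the helpers independent of credentials and installed models, so they can be exercised with stand-ins. Note that AWS Comprehend can return MIXED, which a three-label classifier has no counterpart for, so such reviews always count as disagreements under this scheme.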
- **Sentiment Analysis**:
  - AWS Comprehend generally captures sentiment nuances better, especially in short reviews.
  - SpaCy tends to misclassify short reviews, particularly those of only one or two words.
- **NER**:
  - SpaCy provides more detailed entity categorization (e.g., distinguishing between CARDINAL, ORDINAL, and MONEY), whereas AWS Comprehend uses broader categories (e.g., QUANTITY for any numeric expression).
## Key Visualizations
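Before entities from the two tools can be counted or plotted side by side, their label sets need a common footing. One option is to collapse SpaCy's finer-grained labels onto AWS Comprehend's broader types. The sketch below shows that idea. The mapping table and the helper names (`SPACY_TO_AWS`, `to_aws_type`, `count_types`) are illustrative assumptions, not the project's own mapping, and several of the pairings (for example FAC to LOCATION) are judgment calls.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical mapping from SpaCy's English NER labels to the broader
# AWS Comprehend entity types. Chosen for illustration only.
SPACY_TO_AWS = {
    "CARDINAL": "QUANTITY",
    "ORDINAL": "QUANTITY",
    "MONEY": "QUANTITY",
    "PERCENT": "QUANTITY",
    "QUANTITY": "QUANTITY",
    "DATE": "DATE",
    "TIME": "DATE",
    "PERSON": "PERSON",
    "ORG": "ORGANIZATION",
    "GPE": "LOCATION",
    "LOC": "LOCATION",
    "FAC": "LOCATION",
    "PRODUCT": "COMMERCIAL_ITEM",
    "EVENT": "EVENT",
    "WORK_OF_ART": "TITLE",
}


def to_aws_type(spacy_label: str) -> str:
    """Map a SpaCy label to a Comprehend-style type, defaulting to OTHER."""
    return SPACY_TO_AWS.get(spacy_label, "OTHER")


def count_types(entities: Iterable[Tuple[str, str]]) -> Counter:
    """Count entities per Comprehend-style type.

    `entities` is an iterable of (text, spacy_label) pairs, such as
    [(ent.text, ent.label_) for ent in doc.ents] from a SpaCy Doc.
    """
    return Counter(to_aws_type(label) for _text, label in entities)
```

With this mapping, the CARDINAL, ORDINAL, and MONEY entities that SpaCy separates all land in the single QUANTITY bucket, which mirrors the coarser granularity described in the NER finding.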
![AWS and SpaCy Sentiment Breakdown](AWS_and_SpaCy_sentiment.png)
![AWS and SpaCy Named Entity Recognition](AWS_and_SpaCy_wordcloud.png)
## Contact
For any questions, please [get in touch](mailto:[email protected]).