https://github.com/anthonyray/sentiment-analysis
Sentiment Analysis Algorithms for tweets
- Host: GitHub
- URL: https://github.com/anthonyray/sentiment-analysis
- Owner: anthonyray
- Created: 2015-06-14T17:30:00.000Z (almost 11 years ago)
- Default Branch: master
- Last Pushed: 2017-05-05T09:07:50.000Z (about 9 years ago)
- Last Synced: 2024-04-15T09:15:25.416Z (about 2 years ago)
- Language: Python
- Homepage:
- Size: 12.7 KB
- Stars: 6
- Watchers: 2
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
# Tweets Sentiment Analysis
Sentiment analysis of Twitter data has attracted much attention recently. This program is a simple implementation of a sentiment classifier for tweets.
Based on the *SentiWordNet* corpus, the classifier determines whether a tweet is **Positive**, **Negative**, or **Objective**.
## Usage
- To see the answers to the TP (lab assignment): `python tp.py`
- To see the classification results for the first version of the algorithm: `python tpv1.py`
- To see the classification results for the second version of the algorithm: `python tpv2.py`
- To see the classification results for the third version of the algorithm: `python tpv3.py`
## Requirements
The following libraries are required:
- scikit-learn (imported as `sklearn`)
- nltk
- numpy
## Processing pipeline
Every tweet goes through the following pipeline, which depends on the version of the algorithm:
### Algorithm v1
- Tweet preprocessing: removal of Twitter-specific markers (mentions: `@`, hashtags: `#`, retweets: `RT`)
- POS tagging: assign every word its part-of-speech (POS) tag
- SentiSynset: for adjectives, verbs, nouns, and adverbs, find the first SentiSynset in the SentiWordNet corpus
- Assign a score for the tweet by aggregating every individual score from the sentisynsets
- Classify the tweet
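
The v1 steps above can be sketched roughly as follows. This is a minimal illustration, not the code from `tpv1.py`: `TOY_LEXICON`, the function names, and the `threshold` rule are all assumptions made for the example, and POS tagging is left out so that the snippet runs without any NLTK data. In a real run the per-word scores would come from SentiWordNet (for instance through NLTK's `nltk.pos_tag` and `nltk.corpus.sentiwordnet.senti_synsets`).

```python
import re

# Hypothetical toy lexicon standing in for SentiWordNet:
# word -> (positive score, negative score)
TOY_LEXICON = {
    "good": (0.75, 0.0),
    "great": (0.75, 0.0),
    "love": (0.625, 0.0),
    "bad": (0.0, 0.625),
    "terrible": (0.0, 0.75),
}


def preprocess(tweet):
    """Remove retweet markers and mentions, and drop the '#' of hashtags.

    Assumption: the whole mention is removed, while the hashtag word is kept.
    """
    tweet = re.sub(r"\bRT\b:?", " ", tweet)  # retweet marker
    tweet = re.sub(r"@\w+:?", " ", tweet)    # mentions such as "@bob:"
    tweet = tweet.replace("#", " ")          # keep the hashtag word
    return " ".join(tweet.split())           # collapse whitespace


def score_tweet(tweet, lexicon=TOY_LEXICON):
    """Sum the positive and negative scores of every known word."""
    pos = neg = 0.0
    for token in re.findall(r"[a-z']+", preprocess(tweet).lower()):
        p, n = lexicon.get(token, (0.0, 0.0))
        pos += p
        neg += n
    return pos, neg


def classify(pos, neg, threshold=0.1):
    """Assumed decision rule: close scores mean the tweet is Objective."""
    if abs(pos - neg) < threshold:
        return "Objective"
    return "Positive" if pos > neg else "Negative"


if __name__ == "__main__":
    tweet = "RT @bob: I love this #great phone"
    print(preprocess(tweet))
    print(classify(*score_tweet(tweet)))
```

Passing the lexicon as a parameter keeps the aggregation logic independent of where the scores come from, so the toy dictionary can be swapped for real SentiWordNet lookups.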
### Algorithm v2: taking negation and modifiers into account
- Tweet preprocessing: removal of Twitter-specific markers (mentions: `@`, hashtags: `#`, retweets: `RT`)
- POS tagging: assign every word its part-of-speech (POS) tag
- SentiSynset: for adjectives, verbs, nouns, and adverbs, find the first SentiSynset in the SentiWordNet corpus
- Assign a score for the tweet by aggregating every individual score from the sentisynsets
- Adjust the score to account for negations and modifiers
- Classify the tweet
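
One possible way to handle negations and modifiers is sketched below. The word lists, the weights, the toy lexicon, and the rule that a negation or modifier applies to the next scored word are all assumptions chosen for illustration. The actual rules used in `tpv2.py` may differ.

```python
# Hypothetical word lists and weights, chosen only for this example.
NEGATIONS = {"not", "no", "never"}
MODIFIERS = {"very": 1.5, "really": 1.5, "slightly": 0.5}

# Hypothetical toy lexicon standing in for SentiWordNet:
# word -> (positive score, negative score)
TOY_LEXICON = {
    "good": (0.75, 0.0),
    "happy": (0.875, 0.0),
    "bad": (0.0, 0.625),
}


def score_with_context(tokens, lexicon=TOY_LEXICON):
    """Aggregate word scores, letting negations and modifiers
    act on the next word that carries a sentiment score."""
    pos = neg = 0.0
    negate = False  # True once a negation has been seen
    weight = 1.0    # product of the pending modifier weights
    for token in tokens:
        if token in NEGATIONS or token.endswith("n't"):
            negate = True
            continue
        if token in MODIFIERS:
            weight *= MODIFIERS[token]
            continue
        p, n = lexicon.get(token, (0.0, 0.0))
        if p or n:
            if negate:
                p, n = n, p  # a negation swaps the polarity
            pos += weight * p
            neg += weight * n
            negate, weight = False, 1.0  # reset after a scored word
    return pos, neg


if __name__ == "__main__":
    print(score_with_context("this is not good".split()))
    print(score_with_context("a very good day".split()))
```

Swapping the positive and negative scores is only one way to model negation. Damping the score instead of flipping it is a common alternative.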
### Algorithm v3: taking emojis into account
- Tweet preprocessing: removal of Twitter-specific markers (mentions: `@`, hashtags: `#`, retweets: `RT`)
- POS tagging: assign every word its part-of-speech (POS) tag
- SentiSynset: for adjectives, verbs, nouns, and adverbs, find the first SentiSynset in the SentiWordNet corpus
- Assign a score for the tweet by aggregating every individual score from the sentisynsets
- Adjust the score to account for negations and modifiers
- Increase the positive or negative score according to the emojis found in the tweet
- Classify the tweet
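
The emoji step could look something like the sketch below. The emoji sets, the inclusion of text emoticons such as `:)`, and the fixed `boost` value are assumptions for illustration and are not taken from `tpv3.py`. Only whitespace-separated emojis are detected.

```python
# Hypothetical emoji sets; text emoticons are included as an assumption.
POSITIVE_EMOJIS = {":)", ":-)", ":D", "😀", "😍"}
NEGATIVE_EMOJIS = {":(", ":-(", "😢", "😡"}


def apply_emojis(tweet, pos, neg, boost=0.5):
    """Add a fixed boost to the positive or negative score
    for every known emoji found in the raw tweet."""
    for token in tweet.split():
        if token in POSITIVE_EMOJIS:
            pos += boost
        elif token in NEGATIVE_EMOJIS:
            neg += boost
    return pos, neg


if __name__ == "__main__":
    # The scores passed in would come from the earlier pipeline steps.
    print(apply_emojis("great day :)", 0.75, 0.0))
```

The function works on the raw tweet rather than the preprocessed text, so that emoticons are not lost if punctuation is stripped earlier in the pipeline.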
## Classify tweets
```
python tpv1.py
```
```
python tpv2.py
```
```
python tpv3.py
```