https://github.com/vidursatija/quora-sentiment-analysis
sentiment analysis on quora answers
https://github.com/vidursatija/quora-sentiment-analysis
Last synced: 1 day ago
JSON representation
sentiment analysis on quora answers
- Host: GitHub
- URL: https://github.com/vidursatija/quora-sentiment-analysis
- Owner: vidursatija
- Created: 2017-08-05T05:14:09.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2017-08-05T05:17:47.000Z (about 8 years ago)
- Last Synced: 2025-10-07T10:49:10.661Z (1 day ago)
- Language: Python
- Homepage: http://nishchith.me/quora-sentiment-analysis/
- Size: 175 KB
- Stars: 0
- Watchers: 1
- Forks: 2
- Open Issues: 0
-
Metadata Files:
- Readme: readme.md
Awesome Lists containing this project
README
# Quora-Sentiment-Analysis
------------------------------------------Used AFINN-111 to do sentiment-analysis on Quora answers and blogposts .
### Here's what i did
------------------------------------------
* Scraped answers and blogposts from [Quora](https://www.quora.com) and conveted them into .txt format .
* Ignored
1. [Stopwords](https://pythonspot.com/en/nltk-stop-words/)
2. Non-Alpha characters [ # , % , * , ! .. ]
3. Words which have no sentiment score defined [words , text , answer ... ]
* Used the sentimental [score-carrying words](./about_AFINN.txt) in [AFINN-111.txt](./AFINN-111.txt) to deduce the overall sentiment of an answer or a blogpost .### Samples
------------------------------------------* [What are some good habits to follow?- Nelson Wang](https://www.quora.com/What-are-some-good-habits-to-follow)
* [Build a solid career in tech without a CS major - Aman Goels](https://amangoel.quora.com/Build-a-solid-career-in-tech-without-a-CS-major)
### Usage
------------------------------------------```bash
python3 main.py [parameter]Parameters:
pick_answer - pick answer directly
pick_profile - pick answers from profile
```* Install requirements
```bash
pip install -r requirements.txt
```### Known Issues
------------------------------------------- [ ] No priority assigned to words occuring multiple times in the text . ( didn't get weights to assign :/ )
- [ ] Text format contains words[upvotes , views , Sitemap] which should be part of an answer
- [ ] Ignored words for which scores aren't defined under AFINN-111.txt .* Open for PR's and Issues .
### Enhancements
------------------------------------------* Some of the words which aren't included under AFINN-111.txt are ignored ( assigned a score of 0 ) in this case, however we can deduce score for those words . for example , if the word `hit` always appears in the proximity of words like sad or unhappy , then we can conclude that the word `hit` carries a negative sentiment . [supporting paper](http://www.cs.cmu.edu/%7Enasmith/papers/oconnor+balasubramanyan+routledge+smith.icwsm10.pdf)
### Note
------------------------------------------* The answers referenced under samples are picked up randomly .