https://github.com/pratishtha-abrol/sentimentanalysis
Logistic Regression: A sentiment analysis case study
logistic-regression nltk-python scikit-learn sentiment-analysis
- Host: GitHub
- URL: https://github.com/pratishtha-abrol/sentimentanalysis
- Owner: pratishtha-abrol
- Created: 2020-06-01T15:32:15.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2020-06-01T17:05:35.000Z (over 4 years ago)
- Last Synced: 2024-10-24T20:07:12.490Z (2 months ago)
- Topics: logistic-regression, nltk-python, scikit-learn, sentiment-analysis
- Language: Jupyter Notebook
- Homepage:
- Size: 24.7 MB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# SentimentAnalysis
Logistic Regression: A sentiment analysis case study

## Key Concepts
* Build and employ a logistic regression classifier using scikit-learn
* Clean and pre-process data
* Perform feature extraction with nltk
* Tune model hyperparameters and evaluate model accuracy

## Dataset
IMDB movie reviews dataset
http://ai.stanford.edu/~amaas/data/sentiment
Contains 25000 positive and 25000 negative reviews
Contains a capped number of reviews per movie
At least 7 stars out of 10 : positive (label = 1)
At most 4 stars out of 10 : negative (label = 0)
50/50 train/test split
Evaluation metric: accuracy

## Features: bag of 1-grams with TF-IDF values:
Extremely sparse feature matrix: close to 97% of the entries are zeros

## Model: Logistic regression
$p(y = 1 \mid x) = \sigma(w^{T} x)$
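
As a minimal sketch of the formula above, the predicted probability can be computed in plain Python. The function names, weights, and feature values here are hypothetical and chosen only for illustration; none of them come from the repository.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps a real-valued score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w: list[float], x: list[float]) -> float:
    """Return p(y=1|x) as the sigmoid of the dot product of w and x."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return sigmoid(score)


# Hypothetical weights and TF-IDF features for a three-word vocabulary.
w = [1.5, -2.0, 0.0]
x = [0.6, 0.1, 0.3]
p = predict_proba(w, x)
print(f"p(y=1|x) = {p:.3f}")
```

A score of zero maps to a probability of 0.5, so the decision boundary is the set of inputs where the weighted sum is zero.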
Linear classification model
Can handle sparse data
Fast to train
Weights can be interpreted