https://github.com/krik8235/ml-project-sentiment-analysis
Build a machine learning model for sentiment analysis using NLTK.
kaggle kaggle-dataset machine-learning machine-learning-algorithms nltk nltk-library numpy pandas python3 seaborn sklearn
- Host: GitHub
- URL: https://github.com/krik8235/ml-project-sentiment-analysis
- Owner: krik8235
- Created: 2024-12-09T15:01:25.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-12-09T15:18:16.000Z (10 months ago)
- Last Synced: 2025-02-05T14:42:43.884Z (8 months ago)
- Topics: kaggle, kaggle-dataset, machine-learning, machine-learning-algorithms, nltk, nltk-library, numpy, pandas, python3, seaborn, sklearn
- Language: Python
- Homepage: https://medium.com/@kuriko-iwai/88bb17583358
- Size: 148 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Overview
Build a machine learning model for sentiment analysis using Twitter data.
- [Medium story](https://medium.com/@kuriko-iwai/88bb17583358?source=friends_link&sk=122d74d93dbacd7906330184b1254b80)
- Dataframe
![]()
## Usage
1. Download [sample dataset](https://www.kaggle.com/datasets/kazanova/sentiment140?resource=download) using [kagglehub](https://github.com/Kaggle/kagglehub)
2. Pre-process the text data:
- **Lowercasing**: Convert all text to lowercase for consistency.
- **Stop word removal**: Eliminate common words like "the," "a," and "is" that don't contribute to sentiment analysis.
- **Punctuation mark removal**: Remove punctuation marks like commas, periods, and exclamation points.
- **Stemming**: Reduce each word to its stem by stripping suffixes (for example, "running" becomes "run").
- **Lemmatizing**: Reduce each word to its dictionary form (lemma).
3. Build and train a model using `Bernoulli Naive Bayes Classifier`.
4. Visualize the result.
5. Evaluate the performance using `accuracy score`, `confusion matrix`, and `ROC-AUC curve`.
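
The first three pre-processing bullets in step 2 can be sketched with the standard library alone. This is a minimal illustration, not the code in `main.py`: the function name `preprocess` and the small `STOP_WORDS` set are assumptions made for this example, and the project itself relies on NLTK, whose stop word list is much larger.

```python
import string

# Tiny illustrative stop word set (an assumption for this sketch);
# NLTK's stopwords corpus is far more complete.
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "and", "of", "in", "it"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> list[str]:
    """Lowercase the text, strip punctuation, and drop stop words."""
    lowered = text.lower()
    no_punct = lowered.translate(_PUNCT_TABLE)
    return [tok for tok in no_punct.split() if tok not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("The movie is GREAT, and the cast is amazing!"))
```

Stemming and lemmatizing would then be applied to the returned tokens, for instance with NLTK's `PorterStemmer` and `WordNetLemmatizer`. They are left out here because the lemmatizer needs the WordNet corpus to be downloaded first.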
## Setup
1. Install dependencies
```
pipenv shell
pip install -r requirements.txt
```
2. Run the app
```
pipenv shell
python main.py
```
## Results
![]()
![]()
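
For reference, the metrics named in step 5 of Usage could be computed along the following lines with scikit-learn. This is a sketch under stated assumptions rather than the repository's implementation: the toy sentences and labels are made up for illustration, and `roc_auc_score` returns the area under the ROC curve rather than plotting the curve itself.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.naive_bayes import BernoulliNB

# Hypothetical toy data standing in for the tweets; 1 = positive, 0 = negative.
train_texts = ["love this phone", "great happy day",
               "hate this traffic", "awful sad news"]
train_labels = [1, 1, 0, 0]
test_texts = ["happy great phone", "sad awful traffic"]
test_labels = [1, 0]

# binary=True records only whether a word is present,
# which matches the Bernoulli event model.
vectorizer = CountVectorizer(binary=True)
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

model = BernoulliNB()
model.fit(X_train, train_labels)

pred = model.predict(X_test)
# Probability of the positive class, used as the ranking score for ROC-AUC.
proba_pos = model.predict_proba(X_test)[:, 1]

accuracy = accuracy_score(test_labels, pred)
cm = confusion_matrix(test_labels, pred, labels=[0, 1])
auc = roc_auc_score(test_labels, proba_pos)

if __name__ == "__main__":
    print("accuracy:", accuracy)
    print("confusion matrix:")
    print(cm)
    print("ROC-AUC:", auc)
```

On the real dataset, the texts would first pass through the pre-processing described in Usage, and the split into training and test sets would come from something like `train_test_split` instead of hand-written lists.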