# Sentiment-classification-with-Naive-Bayes-classifier
This repository implements a sentiment analysis of the [Amazon Reviews](https://www.kaggle.com/datasets/danielihenacho/amazon-reviews-dataset) dataset, which contains Amazon product reviews classified into three categories: positive, neutral, and negative. A Naive Bayes classifier is implemented in Python to perform the classification task.

## Installation
Clone this repository:
```bash
git clone git@github.com:daniel-lima-lopez/Sentiment-classification-with-Naive-Bayes-classifier.git
```

Move into the repository directory:
```bash
cd Sentiment-classification-with-Naive-Bayes-classifier
```

## Method description
The Naive Bayes classifier is implemented in [NB.py](NB.py), following its conventional definition. The basic idea of this classifier is to compute, for each class, the conditional probability $P(c|d)$, that is, the probability that an instance $d$ belongs to class $c$ given the characteristics of $d$; the predicted class is the one that maximizes this probability. Applying Bayes' rule, the predicted class $\hat{c}$ is calculated as follows:
$$\hat{c} = \underset{c\in C}{\operatorname{argmax}}\; P(c|d) = \underset{c\in C}{\operatorname{argmax}}\; \frac{P(d|c)P(c)}{P(d)}$$

where $P(d)$ can be ignored, since it is independent of the class, while $P(d|c)$ and $P(c)$ are estimated from the word occurrences in the training data.
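To make the formula concrete, the following is a minimal, self-contained sketch of a multinomial Naive Bayes text classifier. It illustrates the idea rather than reproducing the implementation in [NB.py](NB.py); in particular, the whitespace tokenization and the Laplace (add-one) smoothing are assumptions made here for the sketch.

```python
# Minimal illustrative sketch of multinomial Naive Bayes (not NB.py itself).
# The tokenization and add-one smoothing below are assumptions.
import math
from collections import Counter

class SketchNB:
    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        n = len(labels)
        self.priors = {c: labels.count(c) / n for c in self.classes}  # P(c)
        self.counts = {c: Counter() for c in self.classes}            # word counts per class
        for doc, c in zip(docs, labels):
            self.counts[c].update(doc.lower().split())
        self.vocab = {w for counter in self.counts.values() for w in counter}
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}

    def predict(self, doc):
        # argmax over classes of log P(c) + sum_w log P(w|c); P(d) is constant and dropped
        best, best_score = None, float("-inf")
        for c in self.classes:
            score = math.log(self.priors[c])
            for w in doc.lower().split():
                # add-one smoothing avoids log(0) for unseen words
                p = (self.counts[c][w] + 1) / (self.totals[c] + len(self.vocab))
                score += math.log(p)
            if score > best_score:
                best, best_score = c, score
        return best

docs = ["great product loved it", "terrible waste of money", "it is okay"]
labels = ["positive", "negative", "neutral"]
model = SketchNB()
model.fit(docs, labels)
print(model.predict("loved this great buy"))  # most likely "positive"
```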

## Experiments
The following experiments can be executed in the notebook [experiments.ipynb](experiments.ipynb).

First, the following pie chart shows the distribution of classes in the dataset:


*(figure: class distribution pie chart)*

Since the classes are imbalanced, the F1 score is used as the performance metric. Experiments were performed with the Naive Bayes classifier and its binary-counting variant, which eliminates repeated words within each instance; this option can be enabled by setting `binary_count=True` when instantiating the implemented class (`NaiveBayes()`). Both classifiers were evaluated with the F1 score under the same training-test split, as sketched below.
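The snippet below is a hedged sketch of that comparison. The `NaiveBayes` class and its `binary_count` flag come from this repository, but the `fit`/`predict` method names, the CSV file and column names, and the F1 averaging mode are assumptions; see [experiments.ipynb](experiments.ipynb) for the actual setup.

```python
# Hypothetical usage sketch: NaiveBayes and binary_count are from this
# repository, but fit/predict, the file/column names, and the F1
# averaging mode are assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from NB import NaiveBayes

df = pd.read_csv("amazon_reviews.csv")  # hypothetical file and column names
X_train, X_test, y_train, y_test = train_test_split(
    df["review"], df["sentiment"], test_size=0.2, random_state=0
)

for name, model in [("standard", NaiveBayes()),
                    ("binary", NaiveBayes(binary_count=True))]:
    model.fit(X_train, y_train)   # same split for both classifiers
    preds = model.predict(X_test)
    print(name, f1_score(y_test, preds, average="weighted"))
```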

The F1 scores obtained are 0.6313 for the conventional Naive Bayes classifier and 0.6062 for its binary-counting variant. In addition, the confusion matrices are shown below:


**Naive Bayes**

*(figure: confusion matrix for the conventional Naive Bayes classifier)*

**Binary-counting Naive Bayes**

*(figure: confusion matrix for the binary-counting variant)*
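For reference, matrices like these can be reproduced from the predictions of either classifier; the sketch below uses scikit-learn's display helper and assumes the `y_test` and `preds` variables from the earlier snippet.

```python
# Possible way to reproduce a confusion matrix figure; assumes y_test
# and preds from the evaluation sketch above.
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

ConfusionMatrixDisplay.from_predictions(y_test, preds)
plt.title("Naive Bayes confusion matrix")
plt.show()
```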

The F1 scores show that the conventional Naive Bayes classifier performs better than its binary-counting variant. Furthermore, the confusion matrices show that the conventional approach produces fewer misclassifications on the classes with fewer instances (negative and neutral), which are the hardest to classify given the class imbalance in the data.