https://github.com/5hraddha/sentiment-analysis

An innovative system for filtering and categorizing movie reviews




Host: GitHub
URL: https://github.com/5hraddha/sentiment-analysis
Owner: 5hraddha
Created: 2024-07-25T14:38:02.000Z (9 months ago)
Default Branch: main
Last Pushed: 2024-07-25T14:51:35.000Z (9 months ago)
Last Synced: 2025-02-10T23:51:12.311Z (3 months ago)
Topics: countvectorizer, dummyclassifier, lgbmclassifier, logisticregression, matplotlib, minmaxscaler, nltk, nltk-stopwords, nltk-tokenizer, numpy, pandas, seaborn, spacy, tfidfvectorizer, torch, tqdm, transformers
Language: Jupyter Notebook
Homepage:
Size: 23.8 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


README

# Sentiment Analysis for Movie Reviews

## Introduction

In the age of digital content and streaming services, movie reviews play a pivotal role in helping viewers make informed choices about what to watch. **The "Film Junky Union" project, an exciting venture for classic movie enthusiasts, seeks to revolutionize this experience by developing an innovative system for filtering and categorizing movie reviews**. By leveraging the power of machine learning, our primary objective is to create a model capable of automatically detecting negative movie reviews, helping film aficionados avoid cinematic disappointments.

To achieve this, we will use a comprehensive dataset of IMDb movie reviews with polarity labels indicating whether each review is positive or negative. Through data preprocessing, exploratory data analysis (EDA), model training, and rigorous testing, **we aim to construct a robust classifier capable of achieving an F1 score of at least 0.85**. The project's findings will not only provide valuable insights into sentiment analysis within the film industry but also empower movie enthusiasts to make more informed viewing decisions.

## Project Goal

Our main goals are:

1. **Data Preprocessing**: Clean and preprocess the IMDb movie review dataset, including handling missing values, text cleaning, and tokenization.

2. **Exploratory Data Analysis (EDA)**: Perform EDA to gain insights into the data distribution, class balance, and other characteristics of the dataset.

3. **Sentiment Analysis Model**: Develop a sentiment analysis model that can classify movie reviews as positive or negative based on their text content.

4. **F1 Score of 0.85**: Achieve an F1 score of at least 0.85 so that the model detects negative reviews reliably.
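
The goals above can be illustrated with a minimal sketch. It is not the notebook's actual code: the `clean_text` helper, the toy reviews, and the label encoding (1 = negative, 0 = positive) are all assumptions made for illustration, and TF-IDF with logistic regression is only one of the approaches suggested by the repository topics. It cleans the text, fits a classifier, and scores it with F1 on the negative class.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline


def clean_text(text: str) -> str:
    """Lowercase the text and keep only letters, apostrophes, and single spaces."""
    text = re.sub(r"[^a-z']", " ", text.lower())
    return " ".join(text.split())


# Hypothetical toy reviews standing in for the IMDb data.
# Assumed label encoding for this sketch: 1 = negative, 0 = positive.
train_texts = [
    "A dull, lifeless film. I was bored the whole time.",
    "Terrible acting and a boring plot.",
    "Awful script, I hated every minute.",
    "A wonderful, moving story with great performances.",
    "Loved it! Brilliant direction and a great cast.",
    "Great fun from start to finish, a wonderful film.",
]
train_labels = [1, 1, 1, 0, 0, 0]

test_texts = ["Boring and awful.", "A great, wonderful movie!"]
test_labels = [1, 0]

# TF-IDF features feed a logistic regression classifier; the custom
# preprocessor replaces the vectorizer's default lowercasing step.
model = make_pipeline(
    TfidfVectorizer(preprocessor=clean_text),
    LogisticRegression(),
)
model.fit(train_texts, train_labels)
predictions = model.predict(test_texts)

# F1 computed for the negative class (label 1 under the assumed encoding).
score = f1_score(test_labels, predictions, pos_label=1)
print(f"F1 on the toy test set: {score:.2f}")
```

On the real dataset, the same F1 computation on a held-out test split is what would be compared against the 0.85 target; the toy score here says nothing about that.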
