https://github.com/5hraddha/sentiment-analysis

An innovative system for filtering and categorizing movie reviews

countvectorizer dummyclassifier lgbmclassifier logisticregression matplotlib minmaxscaler nltk nltk-stopwords nltk-tokenizer numpy pandas seaborn spacy tfidfvectorizer torch tqdm transformers

# Sentiment Analysis for Movie Reviews

## Introduction

In the age of digital content and streaming services, movie reviews play a pivotal role in helping viewers decide what to watch. **The "Film Junky Union" project, a venture for classic movie enthusiasts, is developing a system for filtering and categorizing movie reviews**. Our primary objective is to train a machine learning model that automatically detects negative movie reviews, helping film aficionados avoid cinematic disappointments.

To achieve this, we use a dataset of IMDb movie reviews with polarity labels indicating whether each review is positive or negative. Through data preprocessing, exploratory data analysis (EDA), model training, and testing, **we aim to build a classifier that achieves an F1 score of at least 0.85**. The project's findings should offer insight into sentiment analysis of film reviews and help movie enthusiasts make more informed viewing decisions.

## Project Goal

Our main goals are:

1. **Data Preprocessing**: Clean and preprocess the IMDb movie review dataset, including handling missing values, text cleaning, and tokenization.

2. **Exploratory Data Analysis (EDA)**: Perform EDA to gain insights into the data distribution, class balance, and other characteristics of the dataset.

3. **Sentiment Analysis Model**: Develop a sentiment analysis model that can classify movie reviews as positive or negative based on their text content.

4. **F1 Score of 0.85**: Achieve an F1 score of at least 0.85, so that the model balances precision and recall when detecting negative reviews.
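
The text-cleaning and evaluation goals above can be illustrated with a minimal, standard-library sketch. It is not the project's actual pipeline: the helper names `clean_text` and `f1_score`, the label encoding (`0` = negative, `1` = positive), and the toy labels are all assumptions made for illustration.

```python
from __future__ import annotations

import re

TARGET_F1 = 0.85  # threshold stated in the project goals


def clean_text(review: str) -> list[str]:
    """Lowercase a review, replace anything that is not a letter or
    apostrophe with a space, and split the result on whitespace."""
    lowered = review.lower()
    letters_only = re.sub(r"[^a-z']", " ", lowered)
    return letters_only.split()


def f1_score(y_true: list[int], y_pred: list[int], positive_label: int = 1) -> float:
    """Harmonic mean of precision and recall for `positive_label`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive_label and p == positive_label)
    fp = sum(1 for t, p in pairs if t != positive_label and p == positive_label)
    fn = sum(1 for t, p in pairs if t == positive_label and p != positive_label)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Tokenize one made-up review.
tokens = clean_text("Great film!! 10/10, would watch again.")

# Made-up labels (assumed encoding: 0 = negative, 1 = positive).
y_true = [0, 0, 0, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 1, 1, 0, 1, 0]

# Score the negative class, since detecting negative reviews is the stated aim.
score = f1_score(y_true, y_pred, positive_label=0)
print(tokens)
print(f"F1 = {score:.3f}, meets target: {score >= TARGET_F1}")
```

Which class counts as "positive" when computing F1 is a design choice; the sketch exposes it as a parameter rather than assuming how the project scores its models.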