Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/5hraddha/sentiment-analysis
An innovative system for filtering and categorizing movie reviews
- Host: GitHub
- URL: https://github.com/5hraddha/sentiment-analysis
- Owner: 5hraddha
- Created: 2024-07-25T14:38:02.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-07-25T14:51:35.000Z (3 months ago)
- Last Synced: 2024-10-10T11:43:11.003Z (26 days ago)
- Topics: countvectorizer, dummyclassifier, lgbmclassifier, logisticregression, matplotlib, minmaxscaler, nltk, nltk-stopwords, nltk-tokenizer, numpy, pandas, seaborn, spacy, tfidfvectorizer, torch, tqdm, transformers
- Language: Jupyter Notebook
- Size: 23.8 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Sentiment Analysis for Movie Reviews
## Introduction
In the age of digital content and streaming services, movie reviews help viewers decide what to watch. **The "Film Junky Union" project, a venture for classic movie enthusiasts, aims to improve this experience by developing a system for filtering and categorizing movie reviews.** The primary objective is to train a machine learning model that automatically detects negative movie reviews, helping film fans avoid disappointing picks.

To achieve this, we use a dataset of IMDb movie reviews with polarity labels indicating whether each review is positive or negative. Through data preprocessing, exploratory data analysis (EDA), model training, and testing, **we aim to build a robust classifier that achieves an F1 score of at least 0.85**. The results should offer insight into sentiment analysis of film reviews and help movie enthusiasts make better-informed viewing decisions.
## Project Goal
Our main goals are:
1. **Data Preprocessing**: Clean and preprocess the IMDb movie review dataset, including handling missing values, text cleaning, and tokenization.
2. **Exploratory Data Analysis (EDA)**: Perform EDA to gain insights into the data distribution, class balance, and other characteristics of the dataset.
3. **Sentiment Analysis Model**: Develop a sentiment analysis model that can classify movie reviews as positive or negative based on their text content.
4. **F1 Score of 0.85**: Reach an F1 score of at least 0.85, so that the model balances precision and recall when detecting negative reviews.
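
As a rough illustration of goals 1 and 4, the sketch below shows one possible way to clean and tokenize a review and to compute an F1 score by hand. It is not the notebook's actual code. The function names `clean_and_tokenize` and `f1_score`, the label convention (`0` = negative, `1` = positive), and the sample labels are all assumptions made for this example.

```python
import re


def clean_and_tokenize(text):
    """Lowercase a review, drop HTML-like tags and non-letter characters,
    then split on whitespace. (Hypothetical helper for illustration.)"""
    text = re.sub(r"<[^>]+>", " ", text.lower())  # remove tags such as <br />
    text = re.sub(r"[^a-z']+", " ", text)         # keep letters and apostrophes
    return text.split()


def f1_score(y_true, y_pred, target_label=0):
    """F1 for one class: harmonic mean of precision and recall.
    target_label=0 assumes 0 marks a negative review (an assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target_label and p == target_label)
    fp = sum(1 for t, p in pairs if t != target_label and p == target_label)
    fn = sum(1 for t, p in pairs if t == target_label and p != target_label)
    if tp == 0:
        return 0.0  # no correct detections: precision/recall are 0 or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy usage with made-up labels (not results from the project).
tokens = clean_and_tokenize("Great movie!<br />Loved it.")
score = f1_score([0, 0, 1, 1, 0, 1], [0, 1, 1, 1, 0, 0], target_label=0)
meets_goal = score >= 0.85  # compare against the project's 0.85 target
```

In practice, a library implementation of the metric and a proper tokenizer would likely replace both helpers. The hand-written version only makes the definition of the 0.85 target explicit.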