https://github.com/shibin08/sentiment-analysis-movie-reviews
A sentiment analysis project on IMDb movie reviews using Natural Language Processing (NLP) techniques. Text data is cleaned, vectorized using TF-IDF, and classified using machine learning models such as Logistic Regression and Random Forest. The models reach roughly 84-87% accuracy in distinguishing positive from negative reviews.
logistic-regression machine-learning movie-reviews natural-language-processing random-forest scikit-learn sentiment-analysis text-classification tf-idf
- Host: GitHub
- URL: https://github.com/shibin08/sentiment-analysis-movie-reviews
- Owner: Shibin08
- License: mit
- Created: 2025-06-25T16:50:04.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-06-27T19:05:05.000Z (4 months ago)
- Last Synced: 2025-06-27T19:45:33.813Z (4 months ago)
- Topics: logistic-regression, machine-learning, movie-reviews, natural-language-processing, random-forest, scikit-learn, sentiment-analysis, text-classification, tf-idf
- Language: Jupyter Notebook
- Homepage:
- Size: 28.7 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Sentiment-Analysis-Movie-Reviews
This project trains machine learning models to analyze the sentiment of IMDb movie reviews. The objective is to classify reviews as **positive** or **negative** using **TF-IDF vectorization** and **machine learning models** such as Logistic Regression and Random Forest.

# Objective
To build a text classification model that identifies sentiment from movie reviews using classical machine learning techniques.

# Dataset
https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews

# Tools and Libraries
- Scikit-learn
- Python
- SpaCy
- Jupyter Notebook
- Pandas, NumPy
- TF-IDF Vectorizer
- Matplotlib / Seaborn

# Results
- **Logistic Regression Accuracy:** ~87%
- **Random Forest Accuracy:** ~84%
- **SVC Accuracy:** ~85%
- Evaluation done using: Accuracy Score, Confusion Matrix, and F1-Score

# Team members
**Group No. 32**
- Santwana Behara (Team Leader)
- Mohammad Rakshanda
- Majji Vivek
- Shibin Malakot
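
# Example pipeline (sketch)
The approach described in this README (TF-IDF vectorization followed by Logistic Regression, Random Forest, and SVC, evaluated with accuracy, F1-score, and a confusion matrix) could look roughly like the sketch below. It is not the notebook's actual code. The names `REVIEWS`, `LABELS`, `build_models`, and `evaluate` are hypothetical, and so are the tiny stand-in corpus, the hyperparameters, and the linear kernel for SVC. Text cleaning is reduced to the vectorizer's built-in lowercasing and English stop-word removal instead of a SpaCy step.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus so the sketch runs without downloading anything.
# The project itself trains on the 50k-review IMDb dataset from Kaggle.
REVIEWS = [
    "A wonderful film with brilliant acting and a moving story",
    "Terrible plot, awful dialogue, and boring from start to finish",
    "I loved every minute, a brilliant and touching movie",
    "Boring and predictable, a complete waste of time",
    "Great performances and a wonderful, heartfelt script",
    "Awful pacing and terrible acting ruined this movie",
    "An excellent story, beautifully shot and superbly acted",
    "Dull, lifeless, and painfully bad",
    "Brilliant direction and a wonderful cast",
    "A boring, awful mess with a terrible ending",
    "Loved the story and the great acting",
    "Bad script, dull characters, waste of time",
]
LABELS = ["positive", "negative"] * 6  # labels alternate with the reviews above


def build_models(random_state=0):
    """Return one TF-IDF + classifier pipeline per model named in the README."""

    def tfidf():
        # Assumed settings: lowercase text and drop common English stop words.
        return TfidfVectorizer(lowercase=True, stop_words="english")

    return {
        "logistic_regression": make_pipeline(
            tfidf(), LogisticRegression(max_iter=1000)
        ),
        "random_forest": make_pipeline(
            tfidf(),
            RandomForestClassifier(n_estimators=100, random_state=random_state),
        ),
        # The README only says "SVC"; a linear kernel is an assumption.
        "svc": make_pipeline(tfidf(), SVC(kernel="linear")),
    }


def evaluate(model, texts, labels):
    """Compute the three metrics listed under Results for a fitted model."""
    predictions = model.predict(texts)
    return {
        "accuracy": accuracy_score(labels, predictions),
        "f1": f1_score(labels, predictions, pos_label="positive"),
        "confusion_matrix": confusion_matrix(
            labels, predictions, labels=["negative", "positive"]
        ),
    }


# Simple hold-out split: first 8 reviews for training, last 4 for testing.
train_texts, test_texts = REVIEWS[:8], REVIEWS[8:]
train_labels, test_labels = LABELS[:8], LABELS[8:]

results = {}
for name, model in build_models().items():
    model.fit(train_texts, train_labels)
    results[name] = evaluate(model, test_texts, test_labels)
    print(name, "accuracy:", results[name]["accuracy"])
```

Wrapping the vectorizer and the classifier in a single pipeline means the TF-IDF vocabulary is learned from the training split only, which avoids leaking test data into the features. Scores on a corpus this small say nothing about the accuracies reported under Results. To approximate those, the same functions would need to be run on the full Kaggle dataset.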