Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/sminerport/imdbsentimentclassifier
A sentiment analysis model trained on the IMDb Movie Reviews Dataset to classify reviews as positive or negative. This project uses a Bidirectional LSTM with GloVe embeddings, batch normalization, and regularization to improve accuracy and generalization. Includes data preprocessing, model training, and evaluation.
Last synced: 21 days ago
- Host: GitHub
- URL: https://github.com/sminerport/imdbsentimentclassifier
- Owner: sminerport
- License: mit
- Created: 2024-11-07T08:02:08.000Z (3 months ago)
- Default Branch: master
- Last Pushed: 2024-11-08T20:08:42.000Z (3 months ago)
- Last Synced: 2024-11-08T21:19:25.076Z (3 months ago)
- Topics: bidirectional-lstm, deep-learnning, git-lfs, glove, imdb-dataset, keras, lstm, machine-learning, movie-review-app, natural-language-processing, nlp, opinion-mining, python, sentimental-analysis, tensorflow, text-classification
- Language: Python
- Homepage: https://scottminer.netlify.app
- Size: 25 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# IMDb Sentiment Classifier
This project is a sentiment classifier for IMDb movie reviews. It uses a pre-trained GloVe word embedding model and a Bidirectional LSTM network to classify reviews as positive or negative.
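As a rough illustration of how pre-trained GloVe vectors usually feed a model like this, the sketch below parses the GloVe text format (one token per line followed by its float components) and builds an embedding matrix indexed by a tokenizer vocabulary. It is not taken from this repository: the function names, the `word_index` mapping, and the tiny 3-dimensional vectors are all assumptions made for the example.

```python
import io


def load_glove(lines):
    """Parse GloVe text format: a token followed by space-separated floats."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Row i holds the vector for the word with index i.

    Row 0 (commonly reserved for padding) and words missing from the
    GloVe vocabulary are left as all zeros.
    """
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = vec
    return matrix


# Made-up 3-dimensional vectors standing in for a real GloVe file.
sample = io.StringIO("good 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n")
glove = load_glove(sample)

# Hypothetical tokenizer vocabulary; "meh" has no GloVe vector.
word_index = {"good": 1, "bad": 2, "meh": 3}
embedding_matrix = build_embedding_matrix(word_index, glove, dim=3)
```

A matrix built this way is typically passed as the initial weights of an embedding layer placed in front of the Bidirectional LSTM.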
## Features
- Loads IMDb movie reviews for training, validation, and testing.
- Uses [GloVe embeddings](https://nlp.stanford.edu/projects/glove/) for enhanced text representation.
- Trains a [Bidirectional LSTM (Long Short-Term Memory)](https://colah.github.io/posts/2015-08-Understanding-LSTMs/) model to classify reviews as positive or negative.
- Achieves high accuracy on both validation and test sets.
## Setup
1. Clone this repository:
```bash
git clone https://github.com/sminerport/IMDbSentimentClassifier.git
cd IMDbSentimentClassifier
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Run the model:
```bash
python src/main.py
```
## Data
The model uses IMDb review data split into training, validation, and test sets. These files are stored in the `data/` directory and are managed with **Git Large File Storage (Git LFS)** to optimize storage and download efficiency.
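Files tracked by Git LFS show up as small text pointer files until LFS fetches the real content, and a pointer file begins with the line `version https://git-lfs.github.com/spec/v1`. The sketch below checks for that prefix so a missing download can be caught before training. The helper name `is_lfs_pointer` and the file names in the demonstration are hypothetical; the actual file names under `data/` are not assumed here.

```python
import os
import tempfile

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file looks like a Git LFS pointer, not real data."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


# Demonstration with temporary stand-in files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    pointer_path = os.path.join(tmp, "pointer.csv")
    real_path = os.path.join(tmp, "real.csv")

    with open(pointer_path, "wb") as f:
        f.write(LFS_POINTER_PREFIX + b"\noid sha256:" + b"0" * 64 + b"\nsize 12345\n")
    with open(real_path, "wb") as f:
        f.write(b"review,label\nA fine film.,1\n")

    pointer_result = is_lfs_pointer(pointer_path)
    real_result = is_lfs_pointer(real_path)
```

If a data file is reported as a pointer, the LFS content has not been fetched yet.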
To ensure access to the data files, please install **Git LFS** if you haven’t already. You can download Git LFS [here](https://git-lfs.github.com/).
```bash
# Install Git LFS
git lfs install
```
Then, clone the repository as usual:
```bash
git clone https://github.com/sminerport/IMDbSentimentClassifier.git
cd IMDbSentimentClassifier
```
If you’ve already cloned the repository without Git LFS, run the following command to pull the LFS files:
```bash
git lfs pull
```
## Usage
To train the model:
```bash
python src/main.py
```
When run, the script downloads the GloVe embeddings automatically and deletes them after evaluation to save space.
## Model Training Output
Below is a snapshot of the model's training and validation accuracy and loss across epochs:
![Model Training Output](images/model-output.png)
This image provides a visual summary of the training process. Each epoch displays the model's accuracy and loss on both the training and validation sets, showing the progression as the model improves over time.
## Cleanup
The script will delete the GloVe embeddings and the saved model (`best_model.keras`) after evaluation to conserve storage. If you'd like to keep these files, set the `cleanup` variable to `False` in the script.
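The behavior described above can be pictured with a small sketch of a cleanup flag guarding file deletion. This is an illustration under assumptions, not the code in `src/main.py`: the helper `remove_artifacts` and the `glove.txt` file name are hypothetical, and only `cleanup` and `best_model.keras` come from this README.

```python
import os
import tempfile


def remove_artifacts(paths, cleanup=True):
    """Delete each existing file when cleanup is True; return the removed paths."""
    removed = []
    if not cleanup:
        return removed  # keep everything on disk
    for path in paths:
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed


# Demonstration in a temporary directory with empty stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "best_model.keras")
    glove_path = os.path.join(tmp, "glove.txt")  # hypothetical file name
    for p in (model_path, glove_path):
        open(p, "w").close()

    # cleanup=False leaves both files in place.
    kept = remove_artifacts([model_path, glove_path], cleanup=False)
    files_after_keep = sorted(os.listdir(tmp))

    # cleanup=True removes them.
    removed = remove_artifacts([model_path, glove_path], cleanup=True)
    files_after_cleanup = os.listdir(tmp)
```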
## Notes
- To adjust storage usage, toggle the `cleanup` variable in the script.
- `requirements.txt` is generated by running `pip freeze > requirements.txt` in a Colab environment or your local environment.
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.