{"id":16392136,"url":"https://github.com/imdeepmind/rateprediction","last_synced_at":"2026-02-16T17:35:49.480Z","repository":{"id":39736567,"uuid":"177084879","full_name":"imdeepmind/RatePrediction","owner":"imdeepmind","description":"Rate Prediction using Amazon Review Dataset and Deep Learning","archived":false,"fork":false,"pushed_at":"2022-11-21T21:31:53.000Z","size":551,"stargazers_count":4,"open_issues_count":8,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-06-24T22:47:04.892Z","etag":null,"topics":["amazon-review-dataset","deep-learning","keras","lstm","machine-learning","nlp","python","rate-prediction","recurrent-neural-network","recurrent-neural-networks","sentiment-analysis"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/imdeepmind.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-03-22T06:35:33.000Z","updated_at":"2024-02-21T23:10:29.000Z","dependencies_parsed_at":"2023-01-22T10:00:33.084Z","dependency_job_id":null,"html_url":"https://github.com/imdeepmind/RatePrediction","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/imdeepmind/RatePrediction","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imdeepmind%2FRatePrediction","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imdeepmind%2FRatePrediction/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imdeepmind%2FRatePrediction/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imdeepmind
%2FRatePrediction/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/imdeepmind","download_url":"https://codeload.github.com/imdeepmind/RatePrediction/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/imdeepmind%2FRatePrediction/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29514008,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-16T09:05:14.864Z","status":"ssl_error","status_checked_at":"2026-02-16T08:55:59.364Z","response_time":115,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["amazon-review-dataset","deep-learning","keras","lstm","machine-learning","nlp","python","rate-prediction","recurrent-neural-network","recurrent-neural-networks","sentiment-analysis"],"created_at":"2024-10-11T04:48:45.236Z","updated_at":"2026-02-16T17:35:49.463Z","avatar_url":"https://github.com/imdeepmind.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Rate Prediction using Amazon Review Dataset\n\nPredicting star ratings using Amazon Review Dataset and LSTM Recurrent Neural Network.\n\n## Table of contents:\n- [Introduction](#introduction)\n- [Dataset](#dataset)\n- [Model](#model)\n- [Dependencies](#dependencies)\n- [File Structure](#file-structure)\n- [Future Improvements](#future-improvements)\n- [Acknowledgments](#acknowledgments)\n\n## 
Introduction\n\nMillions of people use Amazon to buy products. For every product on Amazon, people can leave a star rating and write a review. A good product tends to get positive reviews and a higher star rating; a bad product tends to get negative reviews and a lower star rating. My aim in this project is to predict the star rating automatically from the product review.\n\nOn Amazon, star ratings range from 1 to 5. A negative review usually comes with a low star rating (possibly 1 or 2), an average one with a medium star rating (possibly 3), and a positive one with a high star rating (possibly 4 or 5).\n\n**This project aims to build a system that automatically predicts the star rating based on the review.**\n\nThis task is similar to Sentiment Analysis, but instead of predicting positive or negative sentiment (and sometimes neutral), here I need to predict the star rating. \n\n\n## Dataset\n\nFor this project, I'm using the [Amazon Review Dataset](https://s3.amazonaws.com/amazon-reviews-pds/readme.html). The Amazon Review Dataset is a gigantic collection of product reviews and their star ratings. It contains more than 40 million reviews (I don't know the exact number). \n\n\u003e Downloading instructions and other information about the dataset can be found on the dataset website.  \n\nI'm using just a tiny fraction of the dataset, specifically the following files:\n- `amazon_reviews_us_Musical_Instruments_v1_00.tsv`\n- `amazon_reviews_us_Office_Products_v1_00.tsv`\n- `amazon_reviews_us_Music_v1_00.tsv`\n\nThe entire dataset is in `.tsv` format.\n\n## Model\n\n### Step 1: Balancing the dataset - \nThe dataset is unbalanced: some classes have many more samples than others. An unbalanced dataset can bias the model toward the most frequent classes. To solve this problem, we need to balance the dataset. 
\n\nTo balance the data, I'll use `RandomUnderSampler` from imbalanced-learn. To learn about `RandomUnderSampler`, click [here](https://imbalanced-learn.readthedocs.io/en/stable/generated/imblearn.under_sampling.RandomUnderSampler.html).\n\n![Under Sampler](https://user-images.githubusercontent.com/34741145/62822814-2f0f2e00-bba6-11e9-8f04-f4ffde718066.png)\n\nAfter applying `RandomUnderSampler` to the dataset, the class distribution is balanced: each class has an equal number of samples.\n\n### Step 2: Word Tokenizing - \nMachine Learning algorithms are math equations, so they can only work with numbers. In this project, however, the input is product reviews written in words. \n\nWord Tokenizing is the process of converting these text reviews into numbers.\n\nHere I'm using the Keras Word Tokenizer. To learn more about the Word Tokenizer, click [here](https://keras.io/preprocessing/text/).\n\n### Step 3: DL Model - \nFinally, let's talk about the Deep Learning model. The model I'm using here combines a CNN with an LSTM recurrent neural network.\n\n![Model](https://user-images.githubusercontent.com/34741145/62822813-2f0f2e00-bba6-11e9-8c03-36f2cabe3b45.png)\n\n## Dependencies\nThe project has the following dependencies:\n- Keras\n- Pandas\n- NumPy\n- imbalanced-learn\n\n## File Structure\nThe repository has two folders, `demo` and `machine-learning`. \n\nThe `demo` folder contains a demo app for this project. We can ignore it.\n\nThe `machine-learning` folder is the main part of the application. It contains two subfolders, `model` and `preprocessing`.\nThe `model` folder contains the main model for this project. The `preprocessing` folder contains all the code for preprocessing the dataset.\n\n## Future Improvements\nCurrently the model is only 50% accurate. 
My target is to increase the accuracy to 75%.\n\n## Acknowledgments\n- [Amazon Review Dataset](https://s3.amazonaws.com/amazon-reviews-pds/readme.html)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fimdeepmind%2Frateprediction","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fimdeepmind%2Frateprediction","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fimdeepmind%2Frateprediction/lists"}