{"id":25239715,"url":"https://github.com/singhxtushar/imdb-analysis","last_synced_at":"2026-02-16T17:03:21.810Z","repository":{"id":274758110,"uuid":"923970234","full_name":"SINGHxTUSHAR/IMDB-Analysis","owner":"SINGHxTUSHAR","description":"IMDB-Analysis is a sentiment Analysis project based on movie review, whether it is +ve or -ve. Model is design with a simple RNN architecture and embedded with word2vec. Deployed on streamlit web-app open cloud service.","archived":false,"fork":false,"pushed_at":"2025-02-09T17:16:22.000Z","size":14739,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-09T17:33:56.498Z","etag":null,"topics":["embedding-models","imdb","rnn","streamlit-webapp","tensorflow"],"latest_commit_sha":null,"homepage":"https://imdb-analysis-nuubf6y9psamjqz4ak3ide.streamlit.app/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SINGHxTUSHAR.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-29T06:59:17.000Z","updated_at":"2025-02-09T17:16:26.000Z","dependencies_parsed_at":null,"dependency_job_id":"fc2f6f5d-4f8e-465b-b004-63f3d32388c1","html_url":"https://github.com/SINGHxTUSHAR/IMDB-Analysis","commit_stats":null,"previous_names":["singhxtushar/imdb-analysis"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SINGHxTUSHAR%2FIMDB-Analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SINGHxTUSHAR%2FIMDB-Ana
lysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SINGHxTUSHAR%2FIMDB-Analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SINGHxTUSHAR%2FIMDB-Analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SINGHxTUSHAR","download_url":"https://codeload.github.com/SINGHxTUSHAR/IMDB-Analysis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":238353539,"owners_count":19457857,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["embedding-models","imdb","rnn","streamlit-webapp","tensorflow"],"created_at":"2025-02-11T18:59:57.414Z","updated_at":"2025-10-26T15:30:35.461Z","avatar_url":"https://github.com/SINGHxTUSHAR.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![GitHub license](https://img.shields.io/github/license/SINGHxTUSHAR/IMDB-Analysis.svg)](https://github.com/SINGHxTUSHAR/IMDB-Analysis/blob/master/LICENSE)\n[![GitHub contributors](https://img.shields.io/github/contributors/SINGHxTUSHAR/IMDB-Analysis.svg)](https://GitHub.com/SINGHxTUSHAR/IMDB-Analysis/graphs/contributors/)\n[![GitHub issues](https://img.shields.io/github/issues/SINGHxTUSHAR/IMDB-Analysis.svg)](https://GitHub.com/SINGHxTUSHAR/IMDB-Analysis/issues/)\n[![GitHub pull-requests](https://img.shields.io/github/issues-pr/SINGHxTUSHAR/IMDB-Analysis.svg)](https://GitHub.com/SINGHxTUSHAR/IMDB-Analysis/pulls/)\n[![PRs 
Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com)\n\n\n[![GitHub watchers](https://img.shields.io/github/watchers/SINGHxTUSHAR/IMDB-Analysis.svg?style=social\u0026label=Watch\u0026maxAge=2592000)](https://GitHub.com/SINGHxTUSHAR/IMDB-Analysis/watchers/)\n[![GitHub forks](https://img.shields.io/github/forks/SINGHxTUSHAR/IMDB-Analysis.svg?style=social\u0026label=Fork\u0026maxAge=2592000)](https://GitHub.com/SINGHxTUSHAR/IMDB-Analysis/network/)\n[![GitHub stars](https://img.shields.io/github/stars/SINGHxTUSHAR/IMDB-Analysis.svg?style=social\u0026label=Star\u0026maxAge=2592000)](https://GitHub.com/SINGHxTUSHAR/IMDB-Analysis/stargazers/)\n\n[![Open in Visual Studio Code](https://img.shields.io/static/v1?logo=visualstudiocode\u0026label=\u0026message=Open%20in%20Visual%20Studio%20Code\u0026labelColor=2c2c32\u0026color=007acc\u0026logoColor=007acc)](https://open.vscode.dev/SINGHxTUSHAR/IMDB-Analysis)\n\n\n# IMDB-Analysis\n\n![preview Image](https://github.com/SINGHxTUSHAR/IMDB-Analysis/blob/6640b0dd684a74116256dcf940602a660f32de8b/preview.png)\n\n\n#### Project Overview:\nSentiment analysis on the IMDB dataset is a popular Natural Language Processing (NLP) task where the goal is to classify movie reviews as positive or negative. 
In this analysis, we use a Simple Recurrent Neural Network (RNN) with an Embedding layer to capture the sequential nature of text and determine sentiment.\n\nLet me explain how this sentiment analysis system works:\n\n* `Data Preparation`:\n  - We use the IMDB dataset, which contains 50,000 movie reviews labeled as positive or negative.\n  - Each review is preprocessed into a sequence of word indices.\n  - The vocabulary is limited to the 10,000 most frequent words to manage complexity.\n  - Reviews are padded or truncated to a fixed length of 500 words to ensure uniform input size.\n\n* `Embedding Layer`:\n  - The first layer is an Embedding layer that converts word indices into dense vectors.\n  - Each word is mapped to a 32-dimensional vector.\n  - This allows the model to learn semantic relationships between words.\n  - Similar words end up closer together in this embedding space.\n\n* `Simple RNN Layer`:\n  - The SimpleRNN layer processes the sequence of word embeddings.\n  - It maintains a hidden state that captures information about previous words.\n  - At each time step, it combines the current input with its previous hidden state.\n  - This allows the model to take context and word order into account.\n  - We use 32 units in the RNN layer to capture different aspects of the sequence.\n\n* `Output Layer`:\n  - A Dense layer with sigmoid activation produces the final prediction.\n  - The output is a single number between 0 and 1.\n  - Values closer to 1 indicate positive sentiment; values closer to 0 indicate negative sentiment.\n\n* `Training Process`:\n  - The model is trained using binary cross-entropy loss.\n  - The Adam optimizer is used for efficient training.\n  - Training happens in batches of 32 reviews.\n  - The model trains for 5 epochs (complete passes through the training data).\n  - Validation data helps monitor for overfitting.\n\n\nThis model typically achieves an accuracy of around 85% on the test set, which is a solid result for a simple RNN architecture. 
However, there are some limitations:\n\n- Simple RNNs can struggle with long-term dependencies due to the vanishing gradient problem.\n- The fixed sequence length might truncate longer reviews.\n- The limited vocabulary might miss some important but rare words.\n\n\n\n## Requirements 💻:\n\nEnsure you have the following dependencies installed:\n\n- Python (version 3.11.x or 3.12.x)\n- IDE: VS Code or Google Colab\n- Virtual environment (venv)\n- Other dependencies (refer to requirement.txt)\n\nYou can install the required Python packages using:\n\n```bash\npip install -r requirement.txt\n```\n\n\n## Setup 💿:\n\n- Clone the repository:\n```bash\ngit clone https://github.com/SINGHxTUSHAR/IMDB-Analysis.git\ncd IMDB-Analysis\n```\n- Create a virtual environment (optional but recommended):\n```bash\npython -m venv venv\n```\n- Activate the virtual environment:\n  - On Windows:\n    ```bash\n    venv\\Scripts\\activate\n    ```\n  - On macOS/Linux:\n    ```bash\n    source venv/bin/activate\n    ```\n\n\n## Contributing 📌:\nIf you'd like to contribute to this project, please follow the standard GitHub fork and pull request process. Contributions, issues, and feature requests are welcome!\n\n## Suggestion 🚀:\nIf you have any suggestions related to this project, feel free to contact me at tusharsinghrawat.delhi@gmail.com or on \u003ca href=\"https://www.linkedin.com/in/singhxtushar/\"\u003eLinkedIn\u003c/a\u003e.\n\n## License 📝:\nThis project is licensed under the \u003ca href=\"https://github.com/SINGHxTUSHAR/IMDB-Analysis/blob/main/LICENSE\"\u003eMIT License\u003c/a\u003e - see the LICENSE file for details.\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsinghxtushar%2Fimdb-analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsinghxtushar%2Fimdb-analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsinghxtushar%2Fimdb-analysis/lists"}