{"id":29311555,"url":"https://github.com/hridya2001/cinema-on-cloud","last_synced_at":"2025-10-17T09:32:43.370Z","repository":{"id":299382000,"uuid":"1002836257","full_name":"Hridya2001/Cinema-on-Cloud","owner":"Hridya2001","description":"Cloud-powered SQL project using IMDb data – cleaned locally, stored in AWS RDS, and explored via DBeaver.","archived":false,"fork":false,"pushed_at":"2025-06-16T09:28:57.000Z","size":7,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-16T09:34:23.769Z","etag":null,"topics":["aws-ec2-intances","aws-rds","dbeaver","mysql-database"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Hridya2001.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-16T08:10:33.000Z","updated_at":"2025-06-16T09:29:01.000Z","dependencies_parsed_at":"2025-06-16T09:37:16.451Z","dependency_job_id":"b9701c57-d8e1-481d-b9c6-d43b85f73db4","html_url":"https://github.com/Hridya2001/Cinema-on-Cloud","commit_stats":null,"previous_names":["hridya2001/cinema-on-cloud"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Hridya2001/Cinema-on-Cloud","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hridya2001%2FCinema-on-Cloud","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hridya2001%2FCinema-on-Cloud/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hridya2001%2FCinema-on-Cloud/rel
eases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hridya2001%2FCinema-on-Cloud/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Hridya2001","download_url":"https://codeload.github.com/Hridya2001/Cinema-on-Cloud/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hridya2001%2FCinema-on-Cloud/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264040974,"owners_count":23548077,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws-ec2-intances","aws-rds","dbeaver","mysql-database"],"created_at":"2025-07-07T08:15:09.595Z","updated_at":"2025-10-17T09:32:43.365Z","avatar_url":"https://github.com/Hridya2001.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# IMDB to Cloud: A Casual SQL Journey ...\n\nEver wondered what the most loved movies of the last decade are?\n\nIn this mini project, I took the massive IMDb dataset from [Kaggle](https://www.kaggle.com/datasets/ashirwadsangwan/imdb-dataset), filtered out the noise, ran it through a cloud-powered MySQL setup, and surfaced the highest-voted titles from the last 10 years. 
All using a mix of SQL, AWS, and some trial-and-error magic.\n\n---\n\n## Architecture Overview\n\nHere’s a high-level view of how everything is wired together:\n\n![Architecture Diagram](Images/image.png)\n\n---\n\n## What’s This Project About?\n\n- Downloaded the IMDb dataset from Kaggle (a huge one with lots of different tables).\n- Loaded it into **local MySQL**, then cleaned and modified it.\n- Focused only on two specific tables:\n  - `movie_title` (which I customized),\n  - and `title_rating`.\n- Joined these two to create a new table called `high_rated_titles`.\n\nAnd then I thought... why keep it local?\n\n---\n\n## Moving to the Cloud\n\nTo make things more \"cloudy\":\n- Set up an **EC2 instance** and an **RDS MySQL** database inside a **custom VPC with subnets**.\n- Transferred the local database to RDS.\n- Created an **SSH tunnel** so I could connect DBeaver to RDS.\n\n\u003e Honestly though, I later realized I could’ve done everything through my terminal too. So, the DBeaver setup was more for convenience.\n\n---\n\n## Tech Stack\n\n- **Language:** SQL\n- **Tools:** MySQL, AWS RDS, EC2, VPC, Linux terminal, DBeaver\n- **Data Source:** [IMDb Dataset on Kaggle](https://www.kaggle.com/datasets/ashirwadsangwan/imdb-dataset)\n\n---\n\n## Final Output\n\nUsing SQL queries on the cloud-hosted DB, I extracted:\n\n\u003e **Movie Titles**,  \n\u003e **Release Year**,  \n\u003e **Number of Votes**\n\n…for the most-voted titles released during the **last decade** (2015–2024).\n\n---\n\n## All Commands \u0026 Queries\n\nCurious about the exact steps and SQL magic behind this project?\n\nCheck out [`commands_and_queries.md`](Code)\n\nThis file includes:\n\n- All **MySQL queries** (table creation, joins, filters, etc.)\n- Commands to **set up AWS services** – RDS, EC2, and VPC\n- Steps to **create an SSH tunnel** from local to RDS\n- How I installed and used **DBeaver** on my Ubuntu machine\n\n---\n\n\n## DBeaver Exploration\n\nConnected the cloud-hosted MySQL DB to 
DBeaver using an SSH tunnel. Ran the final query and viewed the results in a nice tabular format.\n\nHere’s a peek:\n\n![Shot](Images/DBeaver.png)\n\n---\n\n\n## Why I Did This?\n\nJust wanted to get hands-on with:\n- Real-world SQL on large datasets.\n- Transferring MySQL databases from local to cloud (EC2 + RDS).\n- Using DBeaver for database inspection and query writing.\n- And of course, understanding **how to filter meaningful insights from massive data**.\n\n---\n\n## Learnings \u0026 Reflections\n\n- SQL can be powerful and fun when you know what you're digging for.\n- DBeaver is cool, but not a must. The terminal works just fine too.\n- Moving DBs to the cloud is easier than I thought—but getting all the VPC, subnet, and security group configs right takes some trial and error!\n\n---\n\n## Next Steps?\n\nMaybe try visualizing the results using tools like Power BI, Superset, or even Python dashboards. Or build a Streamlit app on top of it… who knows!\n\n\nThanks for reading!\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhridya2001%2Fcinema-on-cloud","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhridya2001%2Fcinema-on-cloud","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhridya2001%2Fcinema-on-cloud/lists"}