{"id":24486215,"url":"https://github.com/05afreen/movie-data-analysis","last_synced_at":"2026-04-18T13:34:41.219Z","repository":{"id":263268196,"uuid":"886651725","full_name":"05afreen/Movie-Data-Analysis","owner":"05afreen","description":"This project explores trends in movie data, focusing on ratings, genres, and review sentiment. Using BeautifulSoup and Selenium, data is scraped from IMDb. The analysis includes data cleaning, normalization, and visualization using Matplotlib, Seaborn, and Power BI. Key insights include genre popularity, rating distribution","archived":false,"fork":false,"pushed_at":"2025-01-20T05:05:35.000Z","size":9745,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-20T06:20:38.538Z","etag":null,"topics":["beautifulsoup","matplotlib","pandas","powerbi","python","seaborn","selenium"],"latest_commit_sha":null,"homepage":"https://drive.google.com/file/d/1HutEowmrAsiW6UGYep_oV2sNL6NIr-5h/view?usp=sharing","language":"Jupyter 
Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/05afreen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-11T11:09:26.000Z","updated_at":"2025-01-20T05:06:52.000Z","dependencies_parsed_at":null,"dependency_job_id":"91c18668-3efb-4df8-93f0-52165331d5dc","html_url":"https://github.com/05afreen/Movie-Data-Analysis","commit_stats":null,"previous_names":["05afreen/movie-data-sentiment-analysis"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/05afreen%2FMovie-Data-Analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/05afreen%2FMovie-Data-Analysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/05afreen%2FMovie-Data-Analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/05afreen%2FMovie-Data-Analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/05afreen","download_url":"https://codeload.github.com/05afreen/Movie-Data-Analysis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243654496,"owners_count":20325913,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup","matplotlib","pandas","powerbi","python","seaborn","selenium"],"created_at":"2025-01-21T14:47:27.468Z","updated_at":"2025-03-14T22:26:25.176Z","avatar_url":"https://github.com/05afreen.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Movie-Data-Analysis\n\n\n### **Project Overview**:\n\nScrape movie data, such as ratings, genres, and reviews, and visualize trends in ratings, genre popularity, and review sentiment.\n\n### **Steps**:\n\n- **Web Scraping**:\n    - Scrape data from movie websites like **IMDb**, **Rotten Tomatoes**, or **TMDb** using **BeautifulSoup** or **Selenium**.\n    - Gather movie attributes such as ratings, genres, and user reviews.\n- **Data Cleaning**:\n    - Handle missing values in ratings and reviews, and clean the data for analysis (e.g., normalizing rating scales).\n- **Data Visualization**:\n    - **Matplotlib/Seaborn**: Create histograms for movie rating distributions, pie charts for genre popularity, and line graphs to show rating trends over time.\n    - **Power BI**: Build an interactive dashboard that lets users filter movies by genre, rating, and release year.\n\n### **Skills Covered**:\n\n- Web scraping (using **BeautifulSoup**, **Selenium**).\n- Data cleaning (normalizing rating scales, handling missing data).\n- Data visualization (histograms, pie charts, time series, Power BI).\n- Sentiment analysis of reviews.\n\n### **Insights**:\n\n1. **Genre Popularity**: **Action** and **Drama** are the most popular movie genres, with **Comedy** trailing.\n2. **Rating Distribution**: Most movies on platforms like **IMDb** are rated between **6 and 8**, indicating moderate viewer satisfaction.\n3. **Sentiment in Reviews**: Positive reviews highlight a movie's **storyline** and **acting**, while negative reviews often focus on **predictability** and **pacing**.\n4. 
**Trends in Movie Ratings**: A movie's rating tends to improve slightly over time as new ratings are added and older negative reviews fade.\n5. **Cultural Influence**: Ratings differ starkly between film industries (e.g., **Hollywood** vs. **Bollywood**), with **Hollywood** movies receiving higher average ratings globally.\n6. **Ratings vs Box Office**: Higher-rated movies tend to perform better at the box office, though this pattern does not always hold for niche genres.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F05afreen%2Fmovie-data-analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F05afreen%2Fmovie-data-analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F05afreen%2Fmovie-data-analysis/lists"}