{"id":28509770,"url":"https://github.com/shrunga92/netflix_data_analysis_sql","last_synced_at":"2026-02-24T21:31:47.370Z","repository":{"id":285884290,"uuid":"959653205","full_name":"shrunga92/Netflix_Data_Analysis_SQL","owner":"shrunga92","description":"This project involves a comprehensive analysis of Netflix's movies and TV shows data using SQL.","archived":false,"fork":false,"pushed_at":"2025-04-03T06:30:32.000Z","size":1574,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-02T13:13:33.939Z","etag":null,"topics":["sql"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shrunga92.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-04-03T06:16:35.000Z","updated_at":"2025-04-03T06:31:23.000Z","dependencies_parsed_at":"2025-04-03T07:42:34.023Z","dependency_job_id":null,"html_url":"https://github.com/shrunga92/Netflix_Data_Analysis_SQL","commit_stats":null,"previous_names":["shrunga92/netflix_data_analysis_sql"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/shrunga92/Netflix_Data_Analysis_SQL","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrunga92%2FNetflix_Data_Analysis_SQL","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrunga92%2FNetflix_Data_Analysis_SQL/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrunga92%2FNetflix_Data_Analysis_SQL/releases","manifests_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/shrunga92%2FNetflix_Data_Analysis_SQL/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shrunga92","download_url":"https://codeload.github.com/shrunga92/Netflix_Data_Analysis_SQL/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrunga92%2FNetflix_Data_Analysis_SQL/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29801021,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-24T21:02:39.706Z","status":"ssl_error","status_checked_at":"2026-02-24T21:02:21.834Z","response_time":75,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["sql"],"created_at":"2025-06-08T22:30:48.496Z","updated_at":"2026-02-24T21:31:47.354Z","avatar_url":"https://github.com/shrunga92.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"#     Netflix Movies and TV Shows Data Analysis using SQL\n\n![Netflix_Logo_RGB](https://github.com/user-attachments/assets/c86dfc7a-9fd2-4628-9035-053d72cf32e2)\n\n## Overview\nThis project involves a comprehensive analysis of Netflix's movies and TV shows data using SQL. The goal is to extract valuable insights and answer various business questions based on the dataset. 
\n\n## Objectives\n\n- Analyze the distribution of content types (movies vs TV shows).\n- Identify the most common ratings for movies and TV shows.\n- List and analyze content based on release years, countries, and durations.\n- Explore and categorize content based on specific criteria and keywords.\n\n## Schema\n\n```sql\nDROP TABLE IF EXISTS netflix;\nCREATE TABLE netflix\n(\n    show_id      VARCHAR(5),\n    type         VARCHAR(10),\n    title        VARCHAR(250),\n    director     VARCHAR(550),\n    casts        VARCHAR(1050),\n    country      VARCHAR(550),\n    date_added   VARCHAR(55),\n    release_year INT,\n    rating       VARCHAR(15),\n    duration     VARCHAR(15),\n    listed_in    VARCHAR(250),\n    description  VARCHAR(550)\n);\n```\n\n## Business Problems and Solutions\n\n### 1. Count the Number of Movies vs TV Shows\n```sql\nselect type, count(*)\nfrom netflix\ngroup by type;\n```\n### 2. Find the most common rating for movies and TV shows\n```sql\n-- Rank ratings within each type by frequency, then keep the top-ranked one.\nSELECT type, rating AS most_common_rating\nFROM (\n    SELECT type, rating, COUNT(*) AS rating_count,\n           RANK() OVER (PARTITION BY type ORDER BY COUNT(*) DESC) AS rnk\n    FROM netflix\n    GROUP BY type, rating\n) AS ranked\nWHERE rnk = 1;\n```\n### 3. List all movies released in a specific year (e.g., 2020)\n```sql\nselect * from netflix\nwhere type = 'Movie' and release_year = 2020;\n```\n### 4. Find the top 5 countries with the most content on Netflix\n```sql\nselect country, count(show_id)\nfrom netflix\ngroup by country\norder by 2 DESC\nLIMIT 5;\n```\n### 5. Find all movies and TV shows by director 'Rajiv Chilaka'\n```sql\nSELECT *\nFROM netflix\nWHERE \n\tdirector = 'Rajiv Chilaka';\n```\n### 6. Find each year and the average number of content releases by India on Netflix.
\n### Return the top 5 years with the highest average content release.\n```sql\n-- avg_release is each year's share (in percent) of all rows where country = 'India'.\nSELECT \n\tcountry,\n\trelease_year,\n\tCOUNT(show_id) AS total_release,\n\tROUND(\n\t\tCOUNT(show_id)::numeric /\n\t\t(SELECT COUNT(show_id) FROM netflix WHERE country = 'India')::numeric * 100,\n\t\t2\n\t) AS avg_release\nFROM netflix\nWHERE country = 'India'\nGROUP BY country, release_year\nORDER BY avg_release DESC\nLIMIT 5;\n```\n### 7. List all movies that are documentaries\n```sql\n-- listed_in holds a comma-separated genre list, so match the genre anywhere in it.\nSELECT * FROM netflix\nWHERE listed_in LIKE '%Documentaries%';\n```\n### 8. Find all content without a director\n```sql\nSELECT * FROM netflix\nWHERE director IS NULL;\n```\n### 9. Find the movies actor 'Salman Khan' appeared in over the last 10 years\n```sql\nSELECT * FROM netflix\nWHERE \n\tcasts LIKE '%Salman Khan%'\n\tAND \n\trelease_year \u003e EXTRACT(YEAR FROM CURRENT_DATE) - 10;\n```\n\n### 10. Find the top 10 actors who have appeared in the highest number of movies produced in India.\n```sql\nSELECT \n\tUNNEST(STRING_TO_ARRAY(casts, ',')) as actor,\n\tCOUNT(*)\nFROM netflix\nWHERE country = 'India'\nGROUP BY 1\nORDER BY 2 DESC\nLIMIT 10;\n```\n### 15. Categorize Content Based on the Presence of 'Kill' and 'Violence' Keywords and Count Each Category\n```sql\nSELECT \n    category,\n    type,\n    COUNT(*) AS content_count\nFROM (\n    SELECT \n        *,\n        CASE \n            WHEN description ILIKE '%kill%' OR description ILIKE '%violence%' THEN 'Bad'\n            ELSE 'Good'\n        END AS category\n    FROM netflix\n) AS categorized_content\nGROUP BY 1,2\nORDER BY 2;\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshrunga92%2Fnetflix_data_analysis_sql","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshrunga92%2Fnetflix_data_analysis_sql","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshrunga92%2Fnetflix_data_analysis_sql/lists"}