{"id":27219961,"url":"https://github.com/aakk23/netflix_sql_project","last_synced_at":"2025-04-10T06:50:10.825Z","repository":{"id":286101812,"uuid":"960369339","full_name":"aakk23/netflix_sql_project","owner":"aakk23","description":"This SQL project provides an analytical overview of Netflix's movies and TV shows dataset, uncovering key insights related to content types, ratings, release trends, and geographic distribution. It helps explore patterns in content availability, audience targeting, and regional preferences to support data-driven decisions.","archived":false,"fork":false,"pushed_at":"2025-04-04T10:36:43.000Z","size":1577,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-04T11:24:55.444Z","etag":null,"topics":["data-analysis","netflix-data-analysis","postgresql","sql"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aakk23.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-04-04T10:12:08.000Z","updated_at":"2025-04-04T10:39:29.000Z","dependencies_parsed_at":"2025-04-04T11:24:57.162Z","dependency_job_id":"de7bd422-eec6-417b-959c-df266e95837f","html_url":"https://github.com/aakk23/netflix_sql_project","commit_stats":null,"previous_names":["aakk23/netflix_sql_project"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aakk23%2Fnetflix_sql_project","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/aakk23%2Fnetflix_sql_project/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aakk23%2Fnetflix_sql_project/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aakk23%2Fnetflix_sql_project/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aakk23","download_url":"https://codeload.github.com/aakk23/netflix_sql_project/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248173852,"owners_count":21059595,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","netflix-data-analysis","postgresql","sql"],"created_at":"2025-04-10T06:50:08.714Z","updated_at":"2025-04-10T06:50:10.816Z","avatar_url":"https://github.com/aakk23.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# 📊 Netflix TV Shows and Movies Analysis with SQL\n## 🔍 Project Overview\nThis project presents an in-depth analysis of Netflix's collection of movies and TV shows using SQL. The primary objective is to uncover meaningful insights and address key business questions by exploring the dataset. This README outlines the project’s goals, data source, analytical approach, key findings, and overall conclusions.\n\n## 🎯 Project Objectives\n- Examine the distribution of content types (Movies vs. 
TV Shows).\n\n- Identify the most frequent ratings assigned to each type of content.\n\n- Analyze content by release year, country of origin, and duration.\n\n- Discover and classify shows and movies based on specific keywords and criteria.\n\n## 📁 Dataset Information\nThe dataset used in this project was obtained from Kaggle:\n- **Dataset Link:** [Movies Dataset](https://www.kaggle.com/datasets/shivamb/netflix-shows?resource=download)\n\n## Schema\n\n```sql\nCREATE TABLE netflix\n(\n    show_id      VARCHAR(5),\n    type         VARCHAR(10),\n    title        VARCHAR(250),\n    director     VARCHAR(550),\n    casts        VARCHAR(1050),\n    country      VARCHAR(550),\n    date_added   VARCHAR(55),\n    release_year INT,\n    rating       VARCHAR(15),\n    duration     VARCHAR(15),\n    listed_in    VARCHAR(250),\n    description  VARCHAR(550)\n);\n```\n\n## Business Problems and Solutions\n\n### 1. Count the Number of Movies vs TV Shows\n\n```sql\nSELECT type, count(*) AS total_content\nFROM netflix\nGROUP BY type;\n```\n\n**Objective:** Determine the distribution of content types on Netflix.\n\n### 2. Find the Most Common Rating for Movies and TV Shows\n\n```sql\nSELECT type,rating\nFROM (\nSELECT type,rating,count(*) as cnt,\nRANK() OVER(PARTITION BY type ORDER BY count(*) DESC) As Rank\nFROM netflix\nGROUP BY 1,2\n) as ranking\nWHERE rank=1;\n```\n\n**Objective:** Identify the most frequently occurring rating for each type of content.\n\n### 3. List All Movies Released in a Specific Year (e.g., 2020)\n\n```sql\nSELECT *\nFROM netflix\nWHERE release_year =2020 AND type = 'Movie';\n```\n\n**Objective:** Retrieve all movies released in a specific year.\n\n### 4. Find the Top 5 Countries with the Most Content on Netflix\n\n```sql\n-- country can hold a comma-separated list, so split it before counting\nSELECT TRIM(UNNEST(STRING_TO_ARRAY(country,','))) AS country,count(show_id) AS content_count\nFROM netflix\nWHERE country IS NOT NULL\nGROUP BY 1 ORDER BY 2 DESC LIMIT 5;\n```\n\n**Objective:** Identify the top 5 countries with the highest number of content items.\n\n### 5. 
Identify the Longest Movie\n\n```sql\n-- duration is stored as text (e.g. '90 min'), so compare its numeric part\nSELECT *\nFROM netflix\nWHERE type ='Movie'\n  AND SPLIT_PART(duration,' ',1)::INT =(SELECT MAX(SPLIT_PART(duration,' ',1)::INT) FROM netflix WHERE type ='Movie');\n```\n\n**Objective:** Find the movie with the longest duration.\n\n### 6. Find Content Added in the Last 5 Years\n\n```sql\nSELECT *\nFROM netflix\nWHERE TO_DATE(date_added,'MONTH DD,YYYY') \u003e= CURRENT_DATE - INTERVAL '5 years';\n```\n\n**Objective:** Retrieve content added to Netflix in the last 5 years.\n\n### 7. Find All Movies/TV Shows by Director 'Rajiv Chilaka'\n\n```sql\nSELECT *\nFROM netflix\nWHERE director LIKE '%Rajiv Chilaka%';\n```\n\n**Objective:** List all content directed by 'Rajiv Chilaka'.\n\n### 8. List All TV Shows with More Than 5 Seasons\n\n```sql\n-- compare the season count as a number, not as text\nSELECT *\nFROM netflix\nWHERE type='TV Show' AND SPLIT_PART(duration,' ',1)::INT \u003e 5;\n```\n\n**Objective:** Identify TV shows with more than 5 seasons.\n\n### 9. Count the Number of Content Items in Each Genre\n\n```sql\nSELECT\nTRIM(UNNEST(STRING_TO_ARRAY(listed_in,','))) as genre, count(show_id) as total_content\nFROM netflix\nGROUP BY 1 ORDER BY 2 DESC;\n```\n\n**Objective:** Count the number of content items in each genre.\n\n### 10. Find Each Year's Average Content Release in India and Return the Top 5 Years\n\n```sql\nSELECT \n\tEXTRACT(YEAR FROM TO_DATE(date_added,'Month DD,YYYY')) AS year, \n\tcount(*) as total_release,\n\tROUND(\n\tcount(*)::numeric/(SELECT count(*) FROM netflix WHERE country LIKE '%India%')::numeric*100\n\t,2) as avg_release\nFROM netflix\nWHERE country LIKE '%India%'\nGROUP BY 1 ORDER BY 3 DESC LIMIT 5;\n```\n\n**Objective:** Rank years (by date added) by their share of India's content, computed as each year's title count as a percentage of all India titles, and return the top 5.\n\n### 11. List All Movies that are Documentaries\n\n```sql\nSELECT *\nFROM netflix\nWHERE listed_in LIKE '%Documentaries%' AND type = 'Movie';\n```\n\n**Objective:** Retrieve all movies classified as documentaries.\n\n### 12. 
Find All Content Without a Director\n\n```sql\nSELECT *\nFROM netflix\nWHERE director IS NULL;\n```\n\n**Objective:** List content that does not have a director.\n\n### 13. Find How Many Movies Actor 'Salman Khan' Appeared in Over the Last 10 Years\n\n```sql\nSELECT count(*) AS movie_count\nFROM netflix\nWHERE type = 'Movie' AND casts LIKE '%Salman Khan%' AND release_year \u003e EXTRACT(YEAR FROM CURRENT_DATE) - 10;\n```\n\n**Objective:** Count the number of movies featuring 'Salman Khan' in the last 10 years.\n\n### 14. Find the Top 10 Actors Who Have Appeared in the Highest Number of Movies Produced in India\n\n```sql\nSELECT TRIM(UNNEST(STRING_TO_ARRAY(casts,','))) AS actor,count(show_id) AS movie_count\nFROM netflix\nWHERE country LIKE '%India%' AND type = 'Movie'\nGROUP BY 1 ORDER BY 2 DESC LIMIT 10;\n```\n\n**Objective:** Identify the top 10 actors with the most appearances in Indian-produced movies.\n\n### 15. Categorize Content Based on the Presence of 'Kill' and 'Violence' Keywords\n\n```sql\nWITH label AS(\nSELECT show_id,title, \n\tCASE\n\t\tWHEN\n\t\tdescription ILIKE '% kill %' OR \n\t\tdescription ILIKE '% violence %' THEN 'Bad_Content'\n\t\tELSE 'Good_Content'\n\tEND AS content_label\nFROM netflix\n)\n\nSELECT content_label, Count(*) as total_content\nFROM label\nGROUP BY 1;\n```\n\n**Objective:** Categorize content as 'Bad' if it contains 'kill' or 'violence' and 'Good' otherwise. 
Count the number of items in each category.\n## 📈 Findings \u0026 Conclusion\n- **Content Distribution:** The dataset showcases a broad mix of movies and TV shows, reflecting a wide variety of genres and content types available on Netflix.\n\n- **Popular Ratings:** Analysis of the most frequent ratings offers insights into the platform’s content positioning and its intended target audiences.\n\n- **Geographical Trends:** The dominance of countries like the United States and India in terms of content volume reveals regional preferences and Netflix’s global content strategy.\n\n- **Content Categorization:** Filtering content based on keywords helps identify trends and patterns in show themes, genres, and formats.\n\nOverall, this analysis offers a data-driven perspective on Netflix’s content library, which can support strategic decisions related to content planning, user targeting, and regional expansion.\n\n## 📌 Author: Aakkash Aswin\nThis project is a part of my data analytics portfolio and highlights my SQL proficiency relevant to data analyst roles.\n### Connect with me on [LinkedIn](http://www.linkedin.com/in/aakkash-aswin)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faakk23%2Fnetflix_sql_project","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faakk23%2Fnetflix_sql_project","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faakk23%2Fnetflix_sql_project/lists"}