https://github.com/jotstolu/netflix-sql-data-analysis-project
This project explores the Netflix dataset using SQL queries to uncover trends, patterns, and business insights that could help stakeholders understand content distribution, viewer preferences, and platform optimization
data-analysis sql sql-server tsql
Last synced: 6 months ago
- Host: GitHub
- URL: https://github.com/jotstolu/netflix-sql-data-analysis-project
- Owner: jotstolu
- Created: 2025-07-19T05:36:50.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-07-19T07:01:22.000Z (6 months ago)
- Last Synced: 2025-07-19T10:37:58.407Z (6 months ago)
- Topics: data-analysis, sql, sql-server, tsql
- Homepage:
- Size: 609 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# 🎬 Netflix SQL Data Analysis Project
This project explores the **Netflix dataset** using **SQL queries** to uncover trends, patterns, and business insights that could help stakeholders understand content distribution, viewer preferences, and platform optimization.
## 📊 Project Objective
The goal is to answer key business questions such as:
- What is the ratio of movies to TV shows?
- Which countries and genres dominate Netflix's catalog?
- Who are the most featured actors or directors?
- How has Netflix’s content evolved in recent years?
## 🧰 Tools & Technologies
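- **SQL Server / T-SQL** (the queries use `STRING_SPLIT` and `TRIM`, which require SQL Server 2016 and SQL Server 2017 or later, respectively)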
- **Netflix dataset** (in `.csv` format)
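
All queries in this README read from a single table, `netflix_tb`. A minimal sketch of a table that would support them is shown here. The column names are taken from the queries themselves, but the data types and lengths are assumptions, not the project's actual schema.

```sql
-- Hypothetical schema: column names come from the queries in this README,
-- data types and lengths are assumed.
CREATE TABLE netflix_tb (
    type         VARCHAR(10),      -- 'Movie' or 'TV Show'
    title        NVARCHAR(250),
    director     NVARCHAR(550),    -- NULL when no director is listed
    [cast]       NVARCHAR(1050),   -- comma-separated actor names; bracketed because CAST is also a function name
    country      NVARCHAR(550),    -- comma-separated country names
    date_added   DATE,
    release_year INT,
    rating       VARCHAR(15),
    duration     VARCHAR(15),      -- assumed text format, e.g. '90 min' or '2 Seasons'
    listed_in    NVARCHAR(250),    -- comma-separated genres
    description  NVARCHAR(550)
);
```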
## 🔍 Business Questions Answered
1. **Count the Number of Movies vs TV Shows**
2. **Find the Most Common Rating for Movies and TV Shows**
3. **List All Movies Released in the Year 2021**
4. **Top 5 Countries with the Most Content on Netflix**
5. **Identify the Longest Movie**
6. **Find Content Added in the Last 5 Years**
7. **Find All Movies/TV Shows by Director 'Rajiv Chilaka'**
8. **List All TV Shows with More Than 5 Seasons**
9. **Count the Number of Content Items in Each Genre**
10. **Average Yearly Content Released in India**
11. **List All Movies that are Documentaries**
12. **Find All Content Without a Director**
13. **Find How Many Movies Actor 'Salman Khan' Appeared In Over the Last 10 Years**
14. **Top 10 Actors with Most Movies Produced in India**
15. **Categorize Content Based on the Presence of 'Kill' and 'Violence' Keywords**
## SQL Queries
**1. Count the Number of Movies vs TV Shows**
```sql
SELECT type, COUNT(type) AS total_count
FROM netflix_tb
GROUP BY type;
```
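
Since the project objective asks for the ratio of movies to TV shows, the counts can also be expressed as shares of the catalog. One possible sketch uses a window aggregate over the grouped counts; the `pct_of_catalog` alias is illustrative and not part of the original project.

```sql
SELECT
    type,
    COUNT(*) AS total_count,
    -- SUM(COUNT(*)) OVER () adds up the per-type counts across all groups
    CAST(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER () AS DECIMAL(5, 2)) AS pct_of_catalog
FROM netflix_tb
GROUP BY type;
```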

**2. Most Common Rating for Movies and TV Shows**
```sql
WITH common_rating AS (
SELECT type, rating, COUNT(*) AS total_count,
RANK() OVER (PARTITION BY type ORDER BY COUNT(*) DESC) AS rank
FROM netflix_tb
GROUP BY type, rating
)
SELECT type, rating, total_count FROM common_rating WHERE rank = 1;
```

**3. All Movies Released in 2021**
```sql
SELECT title, type, release_year
FROM netflix_tb
WHERE type = 'Movie' AND release_year = 2021;
```

**4. Top 5 Countries with the Most Content**
```sql
SELECT TOP (5) TRIM(value) AS country, COUNT(*) AS count
FROM netflix_tb
CROSS APPLY STRING_SPLIT(country, ',')
GROUP BY TRIM(value)
ORDER BY count DESC;
```
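
To see why `TRIM` is applied to each piece, it may help to run `STRING_SPLIT` on a literal. This standalone sketch uses a made-up sample string rather than data from the table:

```sql
-- Sample value is hypothetical; it mimics a comma-separated country field.
SELECT value       AS raw_piece,   -- pieces after the first keep their leading space
       TRIM(value) AS country
FROM STRING_SPLIT('United States, India, Canada', ',');
```

Without the trim, `' India'` and `'India'` would fall into separate groups and be counted separately.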

**5. Identify the Longest Movie**
```sql
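-- duration is text (assumed format: '90 min'), so rank movies by its leading number;
-- MAX(duration) would compare strings and could also pick up TV show values
SELECT TOP (1) WITH TIES title, duration
FROM netflix_tb
WHERE type = 'Movie'
ORDER BY TRY_CAST(LEFT(duration, CHARINDEX(' ', duration + ' ') - 1) AS INT) DESC;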
```

**6. Content Added in the Last 5 Years**
```sql
SELECT title, date_added
FROM netflix_tb
WHERE date_added >= DATEADD(YEAR, -5, GETDATE());
```

**7. Movies/TV Shows by Director 'Rajiv Chilaka'**
```sql
SELECT type, title, director
FROM netflix_tb
WHERE director LIKE '%Rajiv Chilaka%';
```

**8. TV Shows with More Than 5 Seasons**
```sql
SELECT title, type, duration
FROM netflix_tb
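WHERE type = 'TV Show'
-- duration is text (e.g. '5 Seasons'); compare the leading number, not the string
AND TRY_CAST(LEFT(duration, CHARINDEX(' ', duration + ' ') - 1) AS INT) > 5;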
```
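
Because `duration` appears to be stored as text, comparing it directly against `'5 Seasons'` is a character-by-character comparison, under which `'10 Seasons'` sorts before `'5 Seasons'`. The following standalone sketch, using made-up sample values, contrasts the text comparison with a comparison on the parsed number:

```sql
-- Sample rows are hypothetical and not taken from the dataset.
SELECT
    d.duration,
    CASE WHEN d.duration > '5 Seasons' THEN 'yes' ELSE 'no' END AS text_compare,
    CASE WHEN TRY_CAST(LEFT(d.duration, CHARINDEX(' ', d.duration) - 1) AS INT) > 5
         THEN 'yes' ELSE 'no' END AS numeric_compare
FROM (VALUES ('4 Seasons'), ('6 Seasons'), ('10 Seasons')) AS d(duration);
```

For `'10 Seasons'`, the text comparison yields `'no'` while the numeric comparison yields `'yes'`, so only the numeric form finds shows with ten or more seasons.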

**9. Number of Content Items per Genre**
```sql
SELECT TRIM(value) AS genre, COUNT(*) AS total_content
FROM netflix_tb
CROSS APPLY STRING_SPLIT(listed_in, ',')
GROUP BY TRIM(value)
ORDER BY COUNT(*) DESC;
```

**10. Average Yearly Content Released in India**
```sql
SELECT
YEAR(date_added) AS year,
COUNT(*) AS total_content,
ROUND(
CAST(COUNT(*) AS NUMERIC) * 100.0 /
CAST((SELECT COUNT(*) FROM netflix_tb WHERE country = 'India') AS NUMERIC),
2
) AS avg_content_year
FROM netflix_tb
WHERE country = 'India'
GROUP BY YEAR(date_added)
ORDER BY COUNT(*) DESC;
```
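
The query above reports each year's percentage share of India's titles. If the intended metric is instead the mean number of titles added per year, one possible reading is sketched below. The `yearly` and `avg_titles_per_year` names and the choice to skip rows with no `date_added` are assumptions.

```sql
SELECT AVG(CAST(yearly.total_content AS DECIMAL(10, 2))) AS avg_titles_per_year
FROM (
    SELECT YEAR(date_added) AS year_added, COUNT(*) AS total_content
    FROM netflix_tb
    WHERE country = 'India'
      AND date_added IS NOT NULL   -- assumed: ignore rows without an add date
    GROUP BY YEAR(date_added)
) AS yearly;
```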

**11. All Movies that are Documentaries**
```sql
SELECT title, type, listed_in
FROM netflix_tb
WHERE type = 'Movie' AND listed_in LIKE '%Documentaries%';
```

**12. All Content Without a Director**
```sql
SELECT title, type
FROM netflix_tb
WHERE director IS NULL;
```

**13. Movies Featuring 'Salman Khan' in Last 10 Years**
```sql
SELECT cast, title, release_year
FROM netflix_tb
WHERE type = 'Movie'
AND cast LIKE '%Salman Khan%'
AND release_year > YEAR(GETDATE()) - 10;
```

**14. Top 10 Actors in Indian Movies**
```sql
SELECT TOP 10
TRIM(value) AS actor,
COUNT(*) AS appearances
FROM netflix_tb
CROSS APPLY STRING_SPLIT(cast, ',')
WHERE type = 'Movie' AND country = 'India' AND cast IS NOT NULL
GROUP BY TRIM(value)
ORDER BY COUNT(*) DESC;
```

**15. Content Categorized by 'Kill' or 'Violence' Keywords**
```sql
SELECT category, COUNT(*) AS content_count
FROM (
SELECT
CASE
WHEN description LIKE '%kill%' OR description LIKE '%violence%' THEN 'Bad'
ELSE 'Good'
END AS category
FROM netflix_tb
) AS categorized_content
GROUP BY category;
```

---