https://github.com/sahilmaurya28/youtube-data-analysis
YouTube Data Analysis using Python — uncovering trends, engagement patterns, and correlations between likes, comments, views, and categories to understand what drives content success.
analysis data-analysis data-visualization matplotlib-pyplot numpy pandas portfolio-project python seaborn youtube
- Host: GitHub
- URL: https://github.com/sahilmaurya28/youtube-data-analysis
- Owner: SahilMaurya28
- License: mit
- Created: 2025-10-27T16:59:06.000Z (23 days ago)
- Default Branch: main
- Last Pushed: 2025-10-27T17:33:08.000Z (23 days ago)
- Last Synced: 2025-10-27T19:08:19.433Z (23 days ago)
- Topics: analysis, data-analysis, data-visualization, matplotlib-pyplot, numpy, pandas, portfolio-project, python, seaborn, youtube
- Language: Jupyter Notebook
- Homepage:
- Size: 21.2 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# 🎥 YouTube Data Analysis using Python
## 📊 Project Overview
This project explores and analyzes YouTube video data to uncover key insights into audience engagement, content performance, and viewing trends.
By examining relationships between **likes, comments, views, dislikes**, and **categories**, this analysis aims to understand what factors contribute most to audience engagement.
---
## 🎯 Objectives
- Measure and compare audience **engagement rates** across videos and categories.
- Analyze the **correlation** between likes, views, comments, and dislikes.
- Identify **top-performing categories** and channels based on engagement.
- Visualize audience patterns and relationships through clear, data-driven charts.
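The README does not state how engagement rate is computed. The sketch below assumes one common definition, likes plus comments divided by views, and uses a small made-up sample rather than the project's data. Column names follow the Dataset section.

```python
import pandas as pd

# Tiny made-up sample, not the project's dataset.
videos = pd.DataFrame({
    "category_title": ["Education", "Education", "Music", "Music"],
    "views": [1_000, 4_000, 100_000, 50_000],
    "likes": [80, 300, 2_000, 1_500],
    "comment_count": [20, 100, 500, 500],
})

# Assumed definition: interactions (likes + comments) per view.
videos["engagement_rate"] = (
    videos["likes"] + videos["comment_count"]
) / videos["views"]

# Average engagement rate per category, highest first.
by_category = (
    videos.groupby("category_title")["engagement_rate"]
    .mean()
    .sort_values(ascending=False)
)
print(by_category)
```

Videos with zero views would produce infinite or undefined rates under this definition, so real data may need those rows filtered out first.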
---
## 🧠 Key Insights
- Engagement rate does **not always increase with view count** — smaller channels can have highly active audiences.
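- Likes and views are strongly **correlated**: videos with more views tend to collect more likes. This is consistent with engagement driving visibility, although correlation alone does not show which one causes the other.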
- Categories like *Education*, *Pets & Animals*, and *Science & Technology* often have **higher engagement rates** despite fewer overall views.
- Most videos maintain a **high like-to-dislike ratio**, showing overall positive viewer sentiment.
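The correlation and like-to-dislike observations above could be checked along these lines. This is a minimal sketch on invented numbers, not the notebook's actual code. `like_share` is a hypothetical helper column (likes as a fraction of all like/dislike votes), used here in place of a raw ratio so that videos with zero dislikes do not cause division by zero.

```python
import pandas as pd

# Invented sample values for illustration only.
videos = pd.DataFrame({
    "views": [1_000, 5_000, 20_000, 80_000, 150_000],
    "likes": [90, 400, 1_500, 5_000, 9_000],
    "dislikes": [3, 10, 60, 200, 300],
    "comment_count": [15, 40, 300, 700, 1_200],
})

metrics = ["views", "likes", "dislikes", "comment_count"]

# Pairwise Pearson correlation (the pandas default) between the metrics.
corr = videos[metrics].corr()
print(corr.round(2))

# Hypothetical column: share of likes among all like/dislike votes.
videos["like_share"] = videos["likes"] / (videos["likes"] + videos["dislikes"])
print(videos["like_share"].round(3))
```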
---
## 🧰 Tools and Libraries
- **Python** 🐍
- **Pandas** — Data cleaning & analysis
- **Matplotlib** — Data visualization
- **Seaborn** — Advanced charts & correlation heatmaps
- **WordCloud** *(optional)* — Keyword visualization from video titles
---
## 📈 Visualizations Included
- Correlation Heatmap between key engagement metrics
- Likes vs Views Scatterplot
- Views vs Engagement Rate (Bubble Chart)
- Average Engagement per Category
- Top Channels by Engagement
- *(Optional)* Word Cloud of Trending Video Titles
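As an illustration of the first chart in this list, a correlation heatmap can be drawn with Seaborn roughly as follows. The data, figure size, colour map, and title are all assumptions made for this sketch; the notebook's own styling may differ.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Invented sample values for illustration only.
videos = pd.DataFrame({
    "views": [1_000, 5_000, 20_000, 80_000, 150_000],
    "likes": [90, 400, 1_500, 5_000, 9_000],
    "dislikes": [3, 10, 60, 200, 300],
    "comment_count": [15, 40, 300, 700, 1_200],
})

corr = videos.corr()

fig, ax = plt.subplots(figsize=(5, 4))
# annot=True writes each coefficient into its cell; vmin/vmax pin the colour
# scale to the full correlation range so colours are comparable across plots.
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
ax.set_title("Correlation between engagement metrics")
fig.tight_layout()
```

In a Jupyter notebook the figure renders inline. In a plain script, `fig.savefig("correlation_heatmap.png")` (a hypothetical file name) would write it to disk.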
---
## 📂 Dataset
The dataset used contains YouTube video statistics such as:
- `video_id`, `title`, `channel_title`, `category_title`
- `views`, `likes`, `dislikes`, `comment_count`
- additional metadata such as publish date and trending date
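A loading step that checks for the columns listed above might look like the sketch below. `load_videos` is a hypothetical helper, not a function from this repository, and the inline CSV stands in for the real file, whose name is not given here. The date columns are left out because their exact names are not listed.

```python
import io

import pandas as pd

REQUIRED_COLUMNS = [
    "video_id", "title", "channel_title", "category_title",
    "views", "likes", "dislikes", "comment_count",
]
COUNT_COLUMNS = ["views", "likes", "dislikes", "comment_count"]


def load_videos(source):
    """Read a CSV (path or file-like object) and validate the expected columns."""
    df = pd.read_csv(source)
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Coerce counts to numbers; values that cannot be parsed become NaN
    # and those rows are dropped.
    df[COUNT_COLUMNS] = df[COUNT_COLUMNS].apply(pd.to_numeric, errors="coerce")
    return df.dropna(subset=COUNT_COLUMNS).reset_index(drop=True)


# Stand-in for the real dataset file.
sample_csv = io.StringIO(
    "video_id,title,channel_title,category_title,views,likes,dislikes,comment_count\n"
    "a1,Intro to pandas,DataChannel,Education,1000,80,2,20\n"
    "b2,Broken row,DataChannel,Education,unknown,5,0,1\n"
)
videos = load_videos(sample_csv)
print(videos[["video_id", "views"]])
```

Dropping rows with unparseable counts is one possible cleaning choice; depending on the analysis, filling or flagging them may be more appropriate.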
---
## 🚀 How to Run
1. Clone this repository
```bash
git clone https://github.com//YouTube-Data-Analysis.git
cd YouTube-Data-Analysis
```
---
## 📚 Learning Outcomes
- Practiced data cleaning and manipulation with Pandas
- Applied correlation analysis to real-world data
- Improved data visualization and storytelling using Python
- Learned how engagement metrics interact in digital content analytics
---
## 🏁 Conclusion
This analysis reveals that audience interaction and content category play crucial roles in YouTube success — not just raw view counts.
The project emphasizes how data storytelling can help creators and analysts make data-driven decisions for content strategy.