https://github.com/sahilmaurya28/youtube-data-analysis

YouTube Data Analysis using Python — uncovering trends, engagement patterns, and correlations between likes, comments, views, and categories to understand what drives content success.

# 🎥 YouTube Data Analysis using Python

## 📊 Project Overview
This project explores and analyzes YouTube video data to uncover key insights into audience engagement, content performance, and viewing trends.
By examining relationships between **likes, comments, views, dislikes**, and **categories**, this analysis aims to understand what factors contribute most to audience engagement.

---

## 🎯 Objectives
- Measure and compare audience **engagement rates** across videos and categories.
- Analyze the **correlation** between likes, views, comments, and dislikes.
- Identify **top-performing categories** and channels based on engagement.
- Visualize audience patterns and relationships through clear, data-driven charts.
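
The first and third objectives can be sketched roughly as follows. The README does not define "engagement rate", so the formula `(likes + comment_count) / views` is an assumption, as are the sample values and the helper name `add_engagement_rate`; only the column names come from the Dataset section.

```python
import pandas as pd

# Made-up sample rows; column names follow the Dataset section.
videos = pd.DataFrame({
    "video_id": ["a1", "b2", "c3", "d4"],
    "category_title": ["Education", "Music", "Education", "Music"],
    "views": [1_000, 50_000, 4_000, 200_000],
    "likes": [90, 1_500, 300, 4_000],
    "comment_count": [10, 500, 100, 2_000],
})


def add_engagement_rate(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with engagement_rate = (likes + comment_count) / views.

    This definition is an assumption, not the project's confirmed formula.
    """
    out = df.copy()
    # Mask zero-view rows so the division yields NaN instead of inf.
    views = out["views"].where(out["views"] > 0)
    out["engagement_rate"] = (out["likes"] + out["comment_count"]) / views
    return out


videos = add_engagement_rate(videos)

# Average engagement per category, highest first.
by_category = (
    videos.groupby("category_title")["engagement_rate"]
    .mean()
    .sort_values(ascending=False)
)
print(by_category)
```

Swapping `"category_title"` for `"channel_title"` in the `groupby` call would give the same ranking per channel.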

---

## 🧠 Key Insights
- Engagement rate does **not always increase with view count** — smaller channels can have highly active audiences.
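- A strong **correlation** between likes and views shows that the two rise together. This is consistent with engagement driving visibility, although correlation alone does not establish the direction of cause.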
- Categories like *Education*, *Pets & Animals*, and *Science & Technology* often have **higher engagement rates** despite fewer overall views.
- Most videos maintain a **high like-to-dislike ratio**, showing overall positive viewer sentiment.
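
A minimal sketch of how the correlation and like-to-dislike observations could be checked with Pandas. The numbers below are invented for illustration and are not the project's data; `like_share` is a hypothetical column name, and Pearson correlation is assumed because the README does not say which method was used.

```python
import pandas as pd

# Invented values for illustration only.
videos = pd.DataFrame({
    "views": [1_000, 5_000, 20_000, 80_000, 300_000],
    "likes": [80, 300, 1_100, 3_500, 9_000],
    "dislikes": [2, 10, 40, 90, 400],
    "comment_count": [15, 40, 200, 350, 1_200],
})

metrics = ["views", "likes", "dislikes", "comment_count"]

# Pairwise Pearson correlation between the engagement metrics.
corr = videos[metrics].corr(method="pearson")
print(corr.round(2))

# Share of reactions that are likes; NaN when a video has no reactions.
reactions = videos["likes"] + videos["dislikes"]
videos["like_share"] = videos["likes"] / reactions.where(reactions > 0)
print(videos["like_share"].round(3))
```

View counts are usually heavily skewed, so repeating the calculation with `method="spearman"` is a reasonable cross-check on the Pearson values.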

---

## 🧰 Tools and Libraries
- **Python** 🐍
- **Pandas** — Data cleaning & analysis
- **Matplotlib** — Data visualization
- **Seaborn** — Advanced charts & correlation heatmaps
- **WordCloud** *(optional)* — Keyword visualization from video titles

---

## 📈 Visualizations Included
- Correlation Heatmap between key engagement metrics
- Likes vs Views Scatterplot
- Views vs Engagement Rate (Bubble Chart)
- Average Engagement per Category
- Top Channels by Engagement
- *(Optional)* Word Cloud of Trending Video Titles
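
As one example from this list, the Views vs Engagement Rate bubble chart might be drawn along these lines with Matplotlib. The sample data, the function name `plot_views_vs_engagement`, the choice of comment count for bubble size, and the engagement formula `(likes + comment_count) / views` are all assumptions made for the sketch.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Invented values for illustration only.
videos = pd.DataFrame({
    "views": [1_000, 5_000, 20_000, 80_000, 300_000],
    "likes": [90, 300, 1_100, 3_500, 9_000],
    "comment_count": [10, 40, 200, 350, 1_200],
})
# Assumed definition of engagement rate.
videos["engagement_rate"] = (
    videos["likes"] + videos["comment_count"]
) / videos["views"]


def plot_views_vs_engagement(df: pd.DataFrame):
    """Draw a bubble chart and return the Matplotlib figure."""
    fig, ax = plt.subplots(figsize=(7, 4))
    ax.scatter(
        df["views"],
        df["engagement_rate"],
        s=df["comment_count"] / 2,  # bubble area grows with comment count
        alpha=0.6,
    )
    # A log axis keeps small and large channels readable on one chart.
    ax.set_xscale("log")
    ax.set_xlabel("Views (log scale)")
    ax.set_ylabel("Engagement rate")
    ax.set_title("Views vs Engagement Rate")
    fig.tight_layout()
    return fig


fig = plot_views_vs_engagement(videos)
```

Call `fig.savefig("views_vs_engagement.png")` to write the chart to disk, or `plt.show()` to display it interactively.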

---

## 📂 Dataset
The dataset used contains YouTube video statistics such as:
- `video_id`, `title`, `channel_title`, `category_title`,
- `views`, `likes`, `dislikes`, `comment_count`,
- and additional metadata like publish date and trending date.
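
Loading and cleaning a file with these columns could look something like the sketch below. The in-memory CSV stands in for the real dataset, and the `publish_date` column name, the sample rows, and the cleaning rules (drop duplicate `video_id` rows, drop rows without a view count, treat a missing comment count as zero) are assumptions rather than the project's documented steps.

```python
import io

import pandas as pd

# Stand-in for the real CSV file; rows are invented for illustration.
raw_csv = io.StringIO(
    "video_id,title,channel_title,category_title,"
    "views,likes,dislikes,comment_count,publish_date\n"
    "a1,Intro to pandas,DataChan,Education,1000,90,2,10,2023-01-05\n"
    "a1,Intro to pandas,DataChan,Education,1000,90,2,10,2023-01-05\n"
    "b2,Cat compilation,PetTube,Pets & Animals,50000,1500,30,,2023-02-11\n"
    "c3,Broken row,NoChan,Music,not_a_number,5,1,0,2023-03-01\n"
)

df = pd.read_csv(raw_csv, parse_dates=["publish_date"])

numeric_cols = ["views", "likes", "dislikes", "comment_count"]
# Turn unparseable entries into NaN instead of raising an error.
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")

clean = (
    df.drop_duplicates(subset="video_id")  # keep the first row per video
    .dropna(subset=["views"])              # a view count is required
    .fillna({"comment_count": 0})          # assumption: missing means zero
    .reset_index(drop=True)
)
print(clean[["video_id", "views", "comment_count", "publish_date"]])
```

For the real data, replace `raw_csv` with the path to the CSV file and adjust the date column name to match it.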

---

## 🚀 How to Run
1. Clone this repository
```bash
git clone https://github.com//YouTube-Data-Analysis.git
cd YouTube-Data-Analysis
```

---

## 📚 Learning Outcomes
- Practiced data cleaning and manipulation with Pandas
- Applied correlation analysis to real-world data
- Improved data visualization and storytelling using Python
- Learned how engagement metrics interact in digital content analytics

---

## 🏁 Conclusion
This analysis reveals that audience interaction and content category play crucial roles in YouTube success, not just raw view counts.
The project emphasizes how data storytelling can help creators and analysts make data-driven decisions for content strategy.