https://github.com/trannhatnguyen2/yotube-analytics-streams
An end-to-end data engineering pipeline that fetches real-time YouTube analytics and streams them through Kafka for processing with ksqlDB. The processed analytics data is then sent to Telegram for real-time notifications.
data-engineering kafka ksqldb python telegram-bot-api youtube-api-v3
- Host: GitHub
- URL: https://github.com/trannhatnguyen2/yotube-analytics-streams
- Owner: trannhatnguyen2
- Created: 2023-12-22T08:08:26.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-12-22T08:52:21.000Z (over 1 year ago)
- Last Synced: 2025-01-17T04:43:38.568Z (5 months ago)
- Topics: data-engineering, kafka, ksqldb, python, telegram-bot-api, youtube-api-v3
- Language: HTML
- Homepage:
- Size: 8.66 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Real-Time YouTube Analytics Streamed to Telegram
## Overview
An end-to-end data engineering pipeline that fetches real-time YouTube analytics and streams them through Kafka for processing with ksqlDB. The processed analytics data is then sent to Telegram for real-time notifications.
## System Architecture
[Figure: System Architecture diagram]
## Prerequisites
Before running the pipeline, make sure the following are installed or set up:
- Python 3.10
- Docker, Docker Compose
- YouTube Data API v3 key (Google Cloud)
- Kafka
- Confluent Containers (Zookeeper, Kafka, Schema Registry, Connect, ksqlDB, Control Center)
- Telegram Bot API (bot created via BotFather)
## Getting Started
1. **Clone the repository**:
```bash
git clone https://github.com/trannhatnguyen2/yotube-analytics-streams.git
```
2. **Install Python dependencies**:
```bash
pip install -r requirements.txt
```
3. **Run Docker Compose**:
```bash
docker compose -f docker-compose.yml up -d
```
This command downloads the necessary Docker images, creates the containers, and starts the services in detached mode.
4. **Access the Services**
- Kafka Control Center is accessible at `http://localhost:9021`.
## Configuration
1. Rename `config/config.local.example` to `config/config.local` (remove the `.example` suffix), then set the following:
- `YOUTUBE_API_KEY`: Your YouTube API key from Google Cloud
- `PLAYLIST_ID`: The YouTube playlist ID you want to track
2. Set your Kafka server address in the main script. By default, it is set to `localhost:9092`.
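As a rough illustration of how these settings might be read, the sketch below parses INI-style text with Python's `configparser`. The `[youtube]` section name and the `load_settings` helper are assumptions made for this example; the actual layout of `config/config.local` may differ.

```python
import configparser

# Default broker address mentioned in this README.
KAFKA_BOOTSTRAP_SERVERS = "localhost:9092"


def load_settings(text: str) -> dict:
    """Parse INI-style settings text and return the values the pipeline needs."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["youtube"]  # assumed section name, not confirmed by the repo
    return {
        "api_key": section["YOUTUBE_API_KEY"],
        "playlist_id": section["PLAYLIST_ID"],
        "bootstrap_servers": KAFKA_BOOTSTRAP_SERVERS,
    }


# Placeholder values; in practice the text would come from config/config.local.
EXAMPLE = """
[youtube]
YOUTUBE_API_KEY = replace-with-your-key
PLAYLIST_ID = replace-with-your-playlist-id
"""

settings = load_settings(EXAMPLE)
print(settings["playlist_id"])
```

Keeping the API key in a local, untracked file rather than in the script avoids committing credentials to the repository.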
## How it works
1. **Fetch data from the YouTube API using the given playlist ID**
```bash
python main.py
```
2. **Create a stream in ksqlDB to read and track the data from the Kafka topic**
3. **Add an `HttpSinkConnector` connector**
4. **Send the results to a Telegram bot for real-time notifications**
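The first step can be sketched roughly as follows. In the YouTube Data API v3, `playlistItems.list` returns the video IDs in a playlist and `videos.list` returns each video's statistics. The sketch builds a `videos.list` URL, flattens one response item, and hands it to a producer. It is not the repository's `main.py`: the helper names, the record fields, and the `youtube_videos` topic name are assumptions, and the producer is passed in as a plain callable so the example runs without a broker or network access.

```python
import json
from urllib.parse import urlencode

VIDEOS_ENDPOINT = "https://www.googleapis.com/youtube/v3/videos"


def build_videos_url(api_key: str, video_ids: list[str]) -> str:
    """Build a videos.list request URL for the given video IDs."""
    query = urlencode({
        "part": "snippet,statistics",
        "id": ",".join(video_ids),
        "key": api_key,
    })
    return f"{VIDEOS_ENDPOINT}?{query}"


def summarize_video(item: dict) -> dict:
    """Flatten one videos.list response item into a small record."""
    stats = item.get("statistics", {})
    return {
        "video_id": item["id"],
        "title": item["snippet"]["title"],
        # The API returns counts as strings; a count may be absent (e.g. hidden likes).
        "views": int(stats.get("viewCount", 0)),
        "likes": int(stats.get("likeCount", 0)),
        "comments": int(stats.get("commentCount", 0)),
    }


def publish(records, send, topic="youtube_videos"):
    """Send each record as JSON, keyed by video ID, via send(topic, key, value)."""
    for record in records:
        send(topic, record["video_id"], json.dumps(record).encode("utf-8"))


# Hand-written sample item standing in for a live API response.
sample_item = {
    "id": "abc123",
    "snippet": {"title": "Sample video"},
    "statistics": {"viewCount": "100", "likeCount": "7"},
}

sent = []  # stand-in for a Kafka producer: collects what would be sent
publish([summarize_video(sample_item)], lambda t, k, v: sent.append((t, k, v)))
print(sent[0][2].decode("utf-8"))
```

With the `confluent-kafka` package, `send` could wrap `Producer.produce(topic, key=key, value=value)`. Keying by video ID keeps all updates for one video in the same partition, which suits tracking changes per video in ksqlDB.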
---
© 2023 NhatNguyen