https://github.com/paulo-granthon/yt-comment-dumper
ETL + flask server for dumping comments from youtube videos
etl flask mongo youtool youtube-api
- Host: GitHub
- URL: https://github.com/paulo-granthon/yt-comment-dumper
- Owner: paulo-granthon
- Created: 2025-06-03T23:07:23.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-07-07T06:02:26.000Z (12 months ago)
- Last Synced: 2025-07-07T06:40:01.106Z (12 months ago)
- Topics: etl, flask, mongo, youtool, youtube-api
- Language: Python
- Homepage:
- Size: 22.5 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# yt-comment-dumper
ETL + Flask server for dumping comments from YouTube videos
https://github.com/user-attachments/assets/bc36bdcd-c918-4315-935c-e96dcb1c1dd9
```sh
# Quote the URL so the shell does not treat `?` as a glob character
make VIDEO="https://www.youtube.com/watch?v=IH3W4WrM8qc"
```
Projects:
## ETL
- ***Extract***s the comments from the YouTube video, along with data such as the number of likes, the `parent_id` (for replies to another comment), and the user's profile picture URL;
- ***Transform***s the data from the provider's structure into the document structure used in the database;
- ***Load***s the formatted data into a `MongoDB` collection.
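As a rough illustration of the transform step, the sketch below maps one raw comment into a flat document. It assumes the input follows the YouTube Data API v3 `comment` resource shape (`snippet.likeCount`, `snippet.parentId`, `snippet.authorProfileImageUrl`); the function name `transform_comment` and the output field names other than `parent_id` are hypothetical, and the project's actual document structure may differ.

```python
from typing import Any


def transform_comment(item: dict[str, Any], video_id: str) -> dict[str, Any]:
    """Map one raw comment (assumed YouTube Data API v3 shape) to a flat document."""
    snippet = item["snippet"]
    return {
        "_id": item["id"],                     # reuse the provider's comment id
        "video_id": video_id,
        "parent_id": snippet.get("parentId"),  # None for top-level comments
        "author": snippet.get("authorDisplayName"),
        "author_picture_url": snippet.get("authorProfileImageUrl"),
        "text": snippet.get("textDisplay"),
        "likes": snippet.get("likeCount", 0),
        "published_at": snippet.get("publishedAt"),
    }
```

A list of such documents could then be handed to the load step, for example through pymongo's `Collection.insert_many`.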
## Server
- Provides a tree view page with the comments present in the collection.
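A tree view like this can be derived from the flat documents by grouping replies under their `parent_id`. The following is a minimal sketch, not the server's actual code: the helper name `build_comment_tree`, the `_id` key, and the `replies` key are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Any


def build_comment_tree(docs: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Nest flat comment documents into a tree using their `parent_id`."""
    children: dict[Any, list[dict[str, Any]]] = defaultdict(list)
    for doc in docs:
        # Top-level comments have no parent and are grouped under None.
        children[doc.get("parent_id")].append(doc)

    def attach(node: dict[str, Any]) -> dict[str, Any]:
        # Copy the node and recursively attach its replies.
        replies = [attach(child) for child in children.get(node["_id"], [])]
        return {**node, "replies": replies}

    return [attach(root) for root in children.get(None, [])]
```

Replies whose parent is missing from the input are not reachable from any root, so this sketch silently leaves them out of the result.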
Environment:
#### `yt-comment-dumper` will only work if there's a properly configured `.env` file in the root of the repository directory.
- Copy the `.env.example` file to `.env` and set your YouTube API key:
> .env:
```sh
# shellcheck disable=2034,2148
# YouTube API keys
YOUTUBE_API_KEY=your_api_key_here
[...]
```
- If you are not using the pre-configured MongoDB container of this repository, you'll also need to change the environment variables prefixed with `DB_` in your `.env` file:
> .env:
```sh
[...]
# MongoDB configuration
DB_HOST=localhost
DB_PORT=27017
DB_NAME=yt_comment_dumper
DB_COLLECTION=comments
DB_USERNAME=mongo
DB_PASSWORD=secret
```
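To show how the `DB_` variables fit together, here is a hedged sketch that assembles a MongoDB connection URI from them. The function name `build_mongo_uri` is hypothetical, and the sketch assumes the variables are already present in the process environment (for instance, loaded from `.env` by a tool such as python-dotenv); the project may build its connection differently.

```python
import os
from collections.abc import Mapping
from urllib.parse import quote_plus


def build_mongo_uri(env: Mapping[str, str] = os.environ) -> str:
    """Build a MongoDB URI from the DB_* variables (hypothetical helper)."""
    # Percent-encode credentials so characters like '@' or ':' stay valid in a URI.
    user = quote_plus(env["DB_USERNAME"])
    password = quote_plus(env["DB_PASSWORD"])
    host = env.get("DB_HOST", "localhost")
    port = env.get("DB_PORT", "27017")
    return f"mongodb://{user}:{password}@{host}:{port}/"
```

The resulting string can be passed to `pymongo.MongoClient`, with `DB_NAME` and `DB_COLLECTION` then selecting the database and collection.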
Usage:
## Makefile:
It's possible to run everything using the Makefile commands.
```sh
make VIDEO=<video_url> # Runs everything (db, etl, server)
```
Default commands:
```sh
make db-up # starts the mongodb container
make prep # installs poetry dependencies
make etl # runs the ETL
make serve # runs the server
```
End of session:
```sh
make db-down # stops the mongodb container
```
## poetry:
Install project dependencies:
```sh
poetry lock
poetry install
```
Run projects:
```sh
poetry run python src/etl/main.py VIDEO=<video_url> # runs the ETL
poetry run python src/server/app.py # runs server
```
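Since the ETL is given a video URL, it presumably needs the video ID at some point when talking to the YouTube API. The sketch below shows one way to extract it with the standard library; `extract_video_id` is a hypothetical helper, and whether the project parses the URL itself or delegates this to a library is not stated here.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if it cannot be found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Watch links carry the ID in the `v` query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```

For example, `extract_video_id("https://www.youtube.com/watch?v=IH3W4WrM8qc")` returns `"IH3W4WrM8qc"`.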
## docker-compose:
MongoDB container management (or use your own MongoDB instance):
```sh
docker-compose up # start mongodb container
docker-compose down # stop mongodb container
```