{"id":18035551,"url":"https://github.com/austincullar/astro","last_synced_at":"2026-04-18T00:31:27.098Z","repository":{"id":257992979,"uuid":"861022873","full_name":"AustinCullar/Astro","owner":"AustinCullar","description":"Automated data collection from YouTube.","archived":false,"fork":false,"pushed_at":"2024-10-20T18:59:28.000Z","size":68,"stargazers_count":0,"open_issues_count":5,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-10-20T22:57:38.644Z","etag":null,"topics":["progress-bar-python","python-logging","python3","sentiment-analysis","sqlite3-database","youtube","youtube-api"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AustinCullar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-21T19:40:52.000Z","updated_at":"2024-10-20T18:53:29.000Z","dependencies_parsed_at":"2024-10-20T21:57:34.211Z","dependency_job_id":null,"html_url":"https://github.com/AustinCullar/Astro","commit_stats":null,"previous_names":["austincullar/astro"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AustinCullar%2FAstro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AustinCullar%2FAstro/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AustinCullar%2FAstro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AustinCullar%2FAstro/manifests","owner_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/owners/AustinCullar","download_url":"https://codeload.github.com/AustinCullar/Astro/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247263614,"owners_count":20910438,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["progress-bar-python","python-logging","python3","sentiment-analysis","sqlite3-database","youtube","youtube-api"],"created_at":"2024-10-30T12:08:38.720Z","updated_at":"2026-04-18T00:31:26.978Z","avatar_url":"https://github.com/AustinCullar.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![CI build](https://github.com/AustinCullar/Astro/actions/workflows/astro-testing.yml/badge.svg) ![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)\n\n# Astro\nAutomated data collection from YouTube.\n\n## Overview\nThis project is focused on collecting data from YouTube via the Data API.\nCurrently, this tool will collect comment info from the provided YouTube video\nURL and store the data into an sqlite database file.\n\nThe comment data collected includes:\n- author\n- text\n- timestamp\n\nSentiment analysis is performed on the comment text using the 'nltk' (Natural\nLanguage Toolkit) python library. This data is added to the database entries of\neach comment.\n\n## Setup Instructions\n1. Install `python3`\n2. Create your python virtual environment with `python3 -m venv \u003cenv name\u003e`\n3. Install the packages in the `requirements.txt` file to your virtual\n   environment with `pip install -r requirements.txt`.\n4. 
Run `pip install -e .` to install the project packages. For developers, run\n   `pip install -e '.[dev]'` to install all test dependencies as\n   well.\n5. Create a file in the `/src` directory called `.env`. This file should contain\n   the following values:\n    ```\n    API_KEY=\u003ckey\u003e # your YouTube Data API key\n    DB_FILE=\u003cfilename\u003e # the database file to which collected data will be written\n    LOG_LEVEL=[debug|info|warn|error]\n    ```\n    For information about how to create an API key, see [here](https://blog.hubspot.com/website/how-to-get-youtube-api-key).\n\n    Alternatively, these values can be passed to Astro on the command line. See\n    the help menu below for more information.\n    ```\n    (astro) $ python astro.py --help\n    Usage: astro.py [-h] [-l {debug,info,warn,error}] [--api-key API_KEY] [--db-file DB_FILE] [--log-file LOG_FILE]\n                    [-j | --log-json | --no-log-json]\n                    youtube_url\n\n    A tool for YouTube data collection.\n\n    Positional Arguments:\n      youtube_url           youtube video URL\n\n    Options:\n      -h, --help            show this help message and exit\n      -l, --log {debug,info,warn,error}\n                            Set the logging level (default: info)\n      --api-key API_KEY     YouTube Data API key (default: None)\n      --db-file DB_FILE     database filename (default: astro.db)\n      --log-file LOG_FILE   log output to specified file (default: astro_log.txt)\n      -j, --log-json, --no-log-json\n                            log json API responses (default: False)\n    ```\n6. Run the tool with `python astro.py \u003cYouTube video URL\u003e` to start collecting\n   data. 
You can see output from an example run in the next section.\n\n## Example\nThis output below was generated using a YouTube video URL from user 'hbomberguy'.\n```\n(astro) $ python astro.py 'https://www.youtube.com/watch?v=0twDETh6QaI'\n[10/26/24 11:56:52] INFO                 video_id: 0twDETh6QaI                                                       log.py:119\n                    INFO              video_title: ROBLOX_OOF.mp3                                                    log.py:119\n                    INFO               channel_id: UClt01z1wHHT7c5lKcU8pxRQ                                          log.py:119\n                    INFO            channel_title: hbomberguy                                                        log.py:119\n                    INFO               view_count: 13852769                                                          log.py:119\n                    INFO               like_count: 398474                                                            log.py:119\n                    INFO            comment_count: 46258                                                             log.py:119\n                    INFO        comments_disabled: False                                                             log.py:119\nDownloading comments 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 46258/46258 • 0:03:02 • 0:00:00\nCalculating comment sentiment 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39643/39643 • 0:00:53 • 0:00:00\n[10/26/24 12:00:48] INFO     Video has filtered 14.30% of comments                                                  astro.py:91\n                                                     Comment data preview\n                                                            ...\n```\n\nBy default, Astro will log output to a file named `./astro_log.txt` unless\notherwise specified by the `--log-file` option or the `LOG_FILE` environment\nvariable.\n\n## Background\nYouTube has been a primary source of information and 
entertainment in my house\nfor years. I've found that when reading comments on YouTube videos, I'm often\nconfused by the content there. Wanting to understand this behavior, whether it\nwas the product of real users or bots, I started researching social media usage.\nThis project is my attempt to gather data from YouTube videos and their comments\nin order to analyze trends in the data, if any, in an effort to better\nunderstand YouTube commenting behavior and its impact on video performance.\n\nThe name 'Astro' was chosen as a short form of 'Astroturf', a term used to\ndescribe artificial social movements, since I was initially working toward\nidentifying bot campaigns. I've since decided to restrict the scope of the\nproject (at least for now), since that goal will require much more research.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faustincullar%2Fastro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faustincullar%2Fastro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faustincullar%2Fastro/lists"}