{"id":22433599,"url":"https://github.com/nlitsme/youtube_tool","last_synced_at":"2025-04-13T04:09:04.549Z","repository":{"id":65822595,"uuid":"267970208","full_name":"nlitsme/youtube_tool","owner":"nlitsme","description":"Tool for extracting comments or subtitles from youtube video's","archived":false,"fork":false,"pushed_at":"2022-03-06T18:35:28.000Z","size":160,"stargazers_count":142,"open_issues_count":3,"forks_count":24,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-04-13T04:09:00.595Z","etag":null,"topics":["comments","python","subtitles","tool","youtube"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nlitsme.png","metadata":{"files":{"readme":"README.md","changelog":"Changelog.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-05-29T22:53:10.000Z","updated_at":"2025-02-06T01:02:26.000Z","dependencies_parsed_at":"2023-02-12T11:25:11.033Z","dependency_job_id":null,"html_url":"https://github.com/nlitsme/youtube_tool","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlitsme%2Fyoutube_tool","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlitsme%2Fyoutube_tool/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlitsme%2Fyoutube_tool/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nlitsme%2Fyoutube_tool/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nlitsme","download_url":"https://codeload.github.com/nlitsme/youtube_tool/tar.gz/refs/heads/master","hos
t":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248661704,"owners_count":21141450,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["comments","python","subtitles","tool","youtube"],"created_at":"2024-12-05T22:15:25.229Z","updated_at":"2025-04-13T04:09:04.527Z","avatar_url":"https://github.com/nlitsme.png","language":"Python","funding_links":[],"categories":["[](#table-of-contents) Table of contents"],"sub_categories":["[](#youtube)YouTube"],"readme":"# yttool\n\nA tool for extracting info from youtube:\n * print all comments for a video\n * print a video's description + info\n * print all subtitles for a video\n * print out an entire livechat replay.\n * list all items in a playlist\n * list all videos for a channel or user\n * list all video's matching a query\n\n# install\n\nYou can install this from the official python repository using `pip`:\n\n    pip3 install youtube-tool\n\nThis will add a command `yttool` to your python binaries directory,\nand probably also to your search path. 
So you can run this like:\n\n    yttool ....arguments....\n\nNote: depending on your local python installation(s), you may have to type\none of `pip`, `pip3`, or maybe even: `pip3.8`.\n\n\nYou can also 'install' this by executing the `yttool.py` file directly from\nthe source directory:\n\n    python3 yttool.py  ....arguments...\n\n\n# requirements\n\nThis script needs python 3.8 or later to run.\nThe python3.8 specific feature I am using is the new `:=` walrus operator.\n\n\n# usage\n\n## list all subtitles attached to a video.\n\nThis will output the subtitles in all available languages.\n\n    yttool --subtitles https://www.youtube.com/watch?v=bJOuzqu3MUQ\n\nOr list the subtitles prefixed with timestamps\n\n    yttool -v --subtitles https://www.youtube.com/watch?v=bJOuzqu3MUQ\n\n\nYou can also extract the subtitles in a format suitable for\ncreating `.srt` subtitle files:\n\n    yttool --srt --subtitles https://www.youtube.com/watch?v=bJOuzqu3MUQ\n\n\nOr you can filter by language, for example only output the english subtitles:\n\n    yttool --language en --subtitles https://www.youtube.com/watch?v=0xY06PT5JDE\n\nOr only output the automatically generated subtitles:\n\n    yttool --language asr --subtitles https://www.youtube.com/watch?v=0xY06PT5JDE\n\n\n## comments\n\nList all the comments for this Numberphile video:\n\n    yttool --comments https://www.youtube.com/watch?v=bJOuzqu3MUQ\n\n\n## livechat replay\n\nPrint out an entire livechat replay:\n\n    yttool --replay https://www.youtube.com/watch?v=lE0u_jIDh0E\n\n## follow an active livechat\n\nNote: this does not yet work!\n\nPrint messages from a livechat as they come:\n\n    yttool --livechat https://www.youtube.com/watch?v=EEIk7gwjgIM\n\n\n## list a playlist contents.\n\nList all the video's contained in this System of a Down playlist:\n\n    yttool --playlist https://www.youtube.com/playlist?list=PLSKnqXUHTaSdXuK8Z2d-hXLFtJbRZwPtJ\n\nThe output will look like this:\n\n    CSvFpBOe8eY - System Of A Down 
- Chop Suey! (Official Video)\n    zUzd9KyIDrM - System Of A Down - B.Y.O.B. (Official Video)\n    L-iepu3EtyE - System Of A Down - Aerials (Official Video)\n    iywaBOMvYLI - System Of A Down - Toxicity (Official Video)\n    DnGdoEa1tPg - System Of A Down - Lonely Day (Official Video)\n    LoheCz4t2xc - System Of A Down - Hypnotize (Official Video)\n    5vBGOrI6yBk - System Of A Down - Sugar (Official Video)\n    SqZNMvIEHhs - System Of A Down - Spiders (Official Video)\n    ENBv2i88g6Y - System Of A Down - Question! (Official Video)\n    bE2r7r7VVic - System Of A Down - Boom! (Official Video)\n    F46r-_jPPHY - System Of A Down - War? (Official Video)\n\nThe first 11 characters are the video id, you can load the corresponding video\nby typing: `https://www.youtube.com/watch?v=5vBGOrI6yBk` in your browser's URL bar.\n\n\nOr list all video's from a channel:\n\n    yttool -l https://www.youtube.com/channel/UCoxcjq-8xIDTYp3uz647V5A\n\nOr when you don't know the channelid, you can get the same with the username:\n\n    yttool -l https://www.youtube.com/user/numberphile\n\n\n## list query results\n\nThis:\n\n    yttool -q somequery\n\nWill list first couple of the video's matching that query.\n\n## Just the id's\n\nYou can also call yttool with only the video id as an argument:\n\n    yttool --info CSvFpBOe8eY\n\n\n# How to use with a proxy?\n\nFor example if you would like to use TOR, you would do this:\n\n    yttool --proxy socks5://localhost:9050 --info https://www.youtube.com/watch?v=Ll-_LV9U1tA\n\nNote that setting a socks proxy via the `https_proxy` environment variable does NOT work very well with python's urllib library.\n\n\n# How does it work?\n\nThis script does not use the official youtube API, instead, it uses youtube's internal api, which is\nwhat is used on the youtube website itself. This does mean there is no guarantee that this script\nwill keep working without maintenance. 
Youtube will keep changing the way it works internally.\nSo I will need to keep updating this script.\n\nThe advantage of using the internal API, is that there are apparently no limits to how many requests you\ncan do. And you don't have to bother with any kind of registration.\n\n\nThese are the main internal api urls I am using:\n\n - comments: `https://www.youtube.com/comment_service_ajax`\n - livechat: `https://www.youtube.com/live_chat_replay/get_live_chat_replay`\n - search: `https://www.youtube.com/youtubei/v1/search`\n - playlists: `https://www.youtube.com/browse_ajax`\n\nAlso, you can get youtube to respond with json instead of html by adding a `\u0026pbj=1` argument to most urls,\nand add http headers: `x-youtube-client-name: 1` and `x-youtube-client-version: 2.20200603.01.00` to your request.\nAlso the user-agent header needs to be of the right format, see my script for a working example.\n\nThen, for search you need to add an `innertubeapikey`, which I have currently hardcoded in my script, as I did with the client-version.\nA future improvement would be to automatically extract these from the current youtube front page.\n\n\n# Note about the structure of youtube video id's\n\nYoutube's id's are structured in several ways:\n\nA videoid is 11 characters long, when decoded using base64, this results in exactly 8 bytes.\nThe last character of a videoid can only be: `048AEIMQUYcgkosw`  --\u003e 10x6+4 = 64 bits\n\nA playlist id is either 24 or 34 characters long, and has the following format:\n\n### id's containing a 'playlist' id.\n\n * \"PL\u003cplaylistid\u003e\" or \"EC\u003cplaylistid\u003e\" -- custom playlist, or educational playlist.\n * \"BP\u003cplaylistid\u003e\" and \"SP\u003cplaylistid\u003e\"  also seem to have some kind of function.\n * playlistid can be:\n   * either 32 base64 characters --\u003e 32x6 = 192 bits\n   * or 16 hex characters --\u003e 16x4 = 64 bits\n * www.youtube.com/playlist?list=PL\u003cplaylistid\u003e\n 
* www.youtube.com/course?list=EC\u003cplaylistid\u003e\n   * no longer works very well, the layout of the `course` page is broken,\n     with lots of overlapping text.\n\n### id's containing a channel id\n\nA channel-id is 22 base64 characters, with the last character one of: `AQgw`, so this decodes to 21x6+2 = 128 bits\n\n * \"UC\u003cchannelid\u003e\"  -- user channel\n   * www.youtube.com/channel/UC\u003cchannelid\u003e\n * \"PU\u003cchannelid\u003e\"  -- popular uploads playlist\n   * quick way to load: www.youtube.com/watch?v=xxxxxxxxxxx\u0026list=PU\u003cchannelid\u003e\n * \"UU\u003cchannelid\u003e\"  -- user uploads playlist\n   * quick way to load: www.youtube.com/watch?v=xxxxxxxxxxx\u0026list=UU\u003cchannelid\u003e\n * \"LL\u003cchannelid\u003e\"  -- liked video's for user\n   * quick way to load: www.youtube.com/watch?v=xxxxxxxxxxx\u0026list=LL\u003cchannelid\u003e\n   * or www.youtube.com/playlist?list=LL\u003cchannelid\u003e\n * \"FL\u003cchannelid\u003e\"     -- favorites\n   * www.youtube.com/watch?v=xxxxxxxxxxx\u0026list=FL\u003cchannelid\u003e\n * \"RDCMUC\u003cchannelid\u003e\" -- mix for channel\n   * www.youtube.com/watch?v=xxxxxxxxxxx\u0026list=RDCMUC\u003cchannelid\u003e\n\n * prefixes CL, EL, MQ, TT, WL also seem to have a special meaning\n\n### Other playlist types\n\nThese take \n * \"TLGG\u003c22chars\u003e\"  -- temporary list - redir from `watch_videos`\n    * When decoded, the last 8 bytes are digits for the \"ddmmyyyy\" date.\n * \"RDEM\u003c22chars\u003e\" -- radio channel\n   * 22chars is NOT a channel-id\n   * www.youtube.com/watch?v=xxxxxxxxxxx\u0026list=RDEM\u003c22chars\u003e\n * \"RD\u003cvideoid\u003e\"  -- mix for a specific video.\n * \"OLAK5uy_\u003c33chars\u003e\"   -- album playlist.\n   * id's start with: `klmn`  : 0b1001xx\n   * id's ends with: `AEIMQUYcgkosw048`  --\u003e 2 + 31x6 + 4 = 192 bits\n   * www.youtube.com/playlist?list=OLAK5uy_\u003c33chars\u003e\n * \"WL\"           -- 'watch later'\n   * 
www.youtube.com/playlist?list=WL\n   * www.youtube.com/watch?v=xxxxxxxxxxx\u0026list=WL\n * \"UL\"        -- channel video mix\n   * www.youtube.com/watch?v=\u003c11charsvidid\u003e\u0026list=ULxxxxxxxxxxx\n   * This works only when there are exactly 11 characters after 'UL'\n * \"LM\"        -- music.youtube likes\n * \"RDMM\"      -- music.youtube your mix\n * \"RDAMVM\u003cvideoid\u003e\"      -- music.youtube band mix\n * \"RDAO\u003c22chars\u003e\"\n * \"RDAMPL\" + prefix+playlistid\n * \"RDCLAK5uy_\" + 33chars\n * \"RDTMAK5uy_\" + 33chars\n\n * prefixes EL, CL also seem to have a special meaning.\n\n\n### post id's\n\n * 26 characters: Ug\u003c17chars\u003e4AaABCQ\n   * id's start with [wxyz]  : 0b1100xx\n   * id's end with [BFJNRVZdhlptx159]  : 0bxxxx01\n     -\u003e 2 + 15*6 + 4  = 96 bits\n\n# Youtube url's\n\nDomains:\n\n    youtu.be\n    youtube.com\n\nUrlPath:\n\n    /watch?v=\u003cvideoid\u003e\u0026t=123s\u0026list=\u003clistid\u003e\n    /v/\u003cvideoid\u003e\n    /embed/\u003cvideoid\u003e\n    /embed/videoseries?list=\u003cplaylistid\u003e\n    /watch/\u003cvideoid\u003e\n    /playlist?list=\u003cplaylistid\u003e\n    /channel/\u003cchannelid\u003e\n    /user/\u003cusername\u003e\n    /watch_videos?video_ids=\u003cvideoid\u003e,\u003cvideoid\u003e,...\n\n# protoc\n\nSome id's are base64 encoded protobuf packets, like: clickTrackingParams, continuation.\n\n\n# Research tool\n\nI added a tool: `ytdump.py`, which i use to investigate youtube json dictionaries.\n\n# TODO\n\n * DONE extract 'listid' from video links for playlist view.\n * DONE list a channel's video's\n * DONE list a user's video's\n * handle radio links\n * DONE extract live-chat comments\n * Filter out duplicates from the livechat replay dump.\n * DONE make my tool work with an actual live chat.\n * DONE youtube search results.\n * generalize the way continuations are used.\n * add upload date and duration in the video lists.\n * DONE automatically update the innertubeapikey and 
clientversion\n * get original filename from studio.youtube.com/video/\u003cvideoid\u003e/edit\n * playlist editor / organiser\n * community post listing\n * list all on video messages, like cards, etc.\n * list video markers, like in https://www.youtube.com/watch?v=i2KdE-cYMJk\n * list other videos from the same channel.\n * add time, likes to comments\n * repair the `--replay` option.\n\n\n# AUTHOR\n\nWillem Hengeveld \u003citsme@xs4all.nl\u003e\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnlitsme%2Fyoutube_tool","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnlitsme%2Fyoutube_tool","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnlitsme%2Fyoutube_tool/lists"}