{"id":13585583,"url":"https://github.com/jdepoix/youtube-transcript-api","last_synced_at":"2026-01-29T11:08:30.301Z","repository":{"id":37735159,"uuid":"130369089","full_name":"jdepoix/youtube-transcript-api","owner":"jdepoix","description":"This is a python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and it does not require an API key nor a headless browser, like other selenium based solutions do!","archived":false,"fork":false,"pushed_at":"2025-10-13T15:55:47.000Z","size":1870,"stargazers_count":6495,"open_issues_count":18,"forks_count":681,"subscribers_count":47,"default_branch":"master","last_synced_at":"2025-11-27T13:36:55.408Z","etag":null,"topics":["asr","captions","cli","python","subtitle","subtitles","transcript","transcripts","translating-transcripts","youtube","youtube-api","youtube-asr","youtube-captions","youtube-subtitles","youtube-transcript","youtube-transcripts","youtube-video"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jdepoix.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":"jdepoix","custom":"https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick\u0026hosted_button_id=BAENLEW8VUJ6G\u0026source=url"}},"created_at":"2018-04-20T13:55:04.000Z","updated_at":"2025-11-27T10:14:21.000Z","dependencies_parsed_at":"2022-07-17T00:46:14.888Z","dependency_job_
id":"3bd6c0e9-1a5a-4a3c-8ccf-2974e3212bbe","html_url":"https://github.com/jdepoix/youtube-transcript-api","commit_stats":{"total_commits":231,"total_committers":20,"mean_commits":11.55,"dds":0.6493506493506493,"last_synced_commit":"97522b7921a653e20276ca27c162e9de746f7838"},"previous_names":[],"tags_count":35,"template":false,"template_full_name":null,"purl":"pkg:github/jdepoix/youtube-transcript-api","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jdepoix%2Fyoutube-transcript-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jdepoix%2Fyoutube-transcript-api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jdepoix%2Fyoutube-transcript-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jdepoix%2Fyoutube-transcript-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jdepoix","download_url":"https://codeload.github.com/jdepoix/youtube-transcript-api/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jdepoix%2Fyoutube-transcript-api/sbom","scorecard":{"id":512433,"data":{"date":"2025-08-11","repo":{"name":"github.com/jdepoix/youtube-transcript-api","commit":"63eeec2604de4e6ae8c985212a7b01bd6b519790"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":4.7,"checks":[{"name":"Code-Review","score":2,"reason":"Found 3/11 approved changesets -- score normalized to 2","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is 
published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":10,"reason":"30 commit(s) and 29 issue activity found in the last 90 days -- score normalized to 10","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: .github/workflows/ci.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least 
privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:15: update your workflow using https://app.stepsecurity.io/secureworkflow/jdepoix/youtube-transcript-api/ci.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:17: update your workflow using https://app.stepsecurity.io/secureworkflow/jdepoix/youtube-transcript-api/ci.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:37: update your workflow using https://app.stepsecurity.io/secureworkflow/jdepoix/youtube-transcript-api/ci.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:39: update your workflow using https://app.stepsecurity.io/secureworkflow/jdepoix/youtube-transcript-api/ci.yml/master?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/ci.yml:50: update your workflow using https://app.stepsecurity.io/secureworkflow/jdepoix/youtube-transcript-api/ci.yml/master?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/ci.yml:63: update your workflow using https://app.stepsecurity.io/secureworkflow/jdepoix/youtube-transcript-api/ci.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:67: update your workflow using https://app.stepsecurity.io/secureworkflow/jdepoix/youtube-transcript-api/ci.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:69: update your workflow using https://app.stepsecurity.io/secureworkflow/jdepoix/youtube-transcript-api/ci.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: 
.github/workflows/ci.yml:85: update your workflow using https://app.stepsecurity.io/secureworkflow/jdepoix/youtube-transcript-api/ci.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/ci.yml:87: update your workflow using https://app.stepsecurity.io/secureworkflow/jdepoix/youtube-transcript-api/ci.yml/master?enable=pin","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:22","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:44","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:74","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:92","Info:   0 out of   8 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of   2 third-party GitHubAction dependencies pinned","Info:   0 out of   4 pipCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to 
analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":7,"reason":"3 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: GHSA-9hjg-9r4m-mvj7","Warn: Project is vulnerable to: GHSA-48p4-8xcf-vxj5","Warn: Project is vulnerable to: GHSA-pq67-6m6q-mj2v"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 29 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code 
analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-20T00:56:26.732Z","repository_id":37735159,"created_at":"2025-08-20T00:56:26.732Z","updated_at":"2025-08-20T00:56:26.732Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28876674,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-29T10:31:27.438Z","status":"ssl_error","status_checked_at":"2026-01-29T10:31:01.017Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["asr","captions","cli","python","subtitle","subtitles","transcript","transcripts","translating-transcripts","youtube","youtube-api","youtube-asr","youtube-captions","youtube-subtitles","youtube-transcript","youtube-transcripts","youtube-video"],"created_at":"2024-08-01T15:05:01.621Z","updated_at":"2026-01-29T11:08:30.291Z","avatar_url":"https://github.com/jdepoix.png","language":"Python","funding_links":["https://github.com/sponsors/jdepoix","https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick\u0026hosted_button_id=BAENLEW8VUJ6G\u0026source=url"],"categories":["cli","Python","[](#table-of-contents) Table of contents","python","Repos","网络信息服务","🌐 Web Development - Frontend"],"sub_categories":["[](#youtube)YouTube","资源传输下载"],"readme":"\u003ch1 align=\"center\"\u003e\r\n  ✨ YouTube 
Transcript API ✨\r\n\u003c/h1\u003e\r\n\r\n\u003cp align=\"center\"\u003e\r\n  \u003ca href=\"https://github.com/sponsors/jdepoix\"\u003e\r\n    \u003cimg src=\"https://img.shields.io/static/v1?label=Sponsor\u0026message=%E2%9D%A4\u0026logo=GitHub\u0026color=%23fe8e86\" alt=\"Sponsor\"\u003e\r\n  \u003c/a\u003e\r\n  \u003ca href=\"https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick\u0026hosted_button_id=BAENLEW8VUJ6G\u0026source=url\"\u003e\r\n    \u003cimg src=\"https://img.shields.io/badge/Donate-PayPal-green.svg\" alt=\"Donate\"\u003e\r\n  \u003c/a\u003e\r\n  \u003ca href=\"https://github.com/jdepoix/youtube-transcript-api/actions\"\u003e\r\n    \u003cimg src=\"https://github.com/jdepoix/youtube-transcript-api/actions/workflows/ci.yml/badge.svg?branch=master\" alt=\"Build Status\"\u003e\r\n  \u003c/a\u003e\r\n  \u003ca href=\"https://coveralls.io/github/jdepoix/youtube-transcript-api?branch=master\"\u003e\r\n    \u003cimg src=\"https://coveralls.io/repos/github/jdepoix/youtube-transcript-api/badge.svg?branch=master\" alt=\"Coverage Status\"\u003e\r\n  \u003c/a\u003e\r\n  \u003ca href=\"http://opensource.org/licenses/MIT\"\u003e\r\n    \u003cimg src=\"http://img.shields.io/badge/license-MIT-brightgreen.svg?style=flat\" alt=\"MIT license\"\u003e\r\n  \u003c/a\u003e\r\n  \u003ca href=\"https://pypi.org/project/youtube-transcript-api/\"\u003e\r\n    \u003cimg src=\"https://img.shields.io/pypi/v/youtube-transcript-api.svg\" alt=\"Current Version\"\u003e\r\n  \u003c/a\u003e\r\n  \u003ca href=\"https://pypi.org/project/youtube-transcript-api/\"\u003e\r\n    \u003cimg src=\"https://img.shields.io/pypi/pyversions/youtube-transcript-api.svg\" alt=\"Supported Python Versions\"\u003e\r\n  \u003c/a\u003e\r\n\u003c/p\u003e\r\n\r\n\u003cp align=\"center\"\u003e\r\n  \u003cb\u003eThis is a python API which allows you to retrieve the transcript/subtitles for a given YouTube video. 
It also works for automatically generated subtitles, supports translating subtitles and it does not require a headless browser, like other selenium based solutions do!\u003c/b\u003e\r\n\u003c/p\u003e\r\n\u003cp align=\"center\"\u003e\r\n Maintenance of this project is made possible by all the \u003ca href=\"https://github.com/jdepoix/youtube-transcript-api/graphs/contributors\"\u003econtributors\u003c/a\u003e and \u003ca href=\"https://github.com/sponsors/jdepoix\"\u003esponsors\u003c/a\u003e. If you'd like to sponsor this project and have your avatar or company logo appear below \u003ca href=\"https://github.com/sponsors/jdepoix\"\u003eclick here\u003c/a\u003e. 💖\r\n\u003c/p\u003e\r\n\r\n\u003cp align=\"center\"\u003e\r\n  \u003ca href=\"https://www.searchapi.io\"\u003e\r\n    \u003cpicture\u003e\r\n      \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://www.searchapi.io/press/v1/svg/searchapi_logo_white_h.svg\"\u003e\r\n      \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"https://www.searchapi.io/press/v1/svg/searchapi_logo_black_h.svg\"\u003e\r\n      \u003cimg alt=\"SearchAPI\" src=\"https://www.searchapi.io/press/v1/svg/searchapi_logo_black_h.svg\" height=\"40px\" style=\"vertical-align: middle;\"\u003e\r\n    \u003c/picture\u003e\r\n  \u003c/a\u003e\r\n  \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\r\n  \u003ca href=\"https://supadata.ai\"\u003e\r\n    \u003cpicture\u003e\r\n      \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://supadata.ai/logo-dark.svg\"\u003e\r\n      \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"https://supadata.ai/logo-light.svg\"\u003e\r\n      \u003cimg alt=\"supadata\" src=\"https://supadata.ai/logo-light.svg\" height=\"40px\"\u003e\r\n    \u003c/picture\u003e\r\n  \u003c/a\u003e\r\n  \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\r\n  \u003ca href=\"https://www.dumplingai.com\"\u003e\r\n    \u003cpicture\u003e\r\n      
\u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://www.dumplingai.com/logos/logo-dark.svg\"\u003e\r\n      \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"https://www.dumplingai.com/logos/logo-light.svg\"\u003e\r\n      \u003cimg alt=\"Dumpling AI\" src=\"https://www.dumplingai.com/logos/logo-light.svg\" height=\"40px\" style=\"vertical-align: middle;\"\u003e\r\n    \u003c/picture\u003e\r\n  \u003c/a\u003e\r\n\u003c/p\u003e\r\n\r\n## Install\r\n\r\nIt is recommended to [install this module by using pip](https://pypi.org/project/youtube-transcript-api/):\r\n\r\n```\r\npip install youtube-transcript-api\r\n```\r\n\r\nYou can either integrate this module [into an existing application](#api) or just use it via a [CLI](#cli).\r\n\r\n## API\r\n\r\nThe easiest way to get a transcript for a given video is to execute:\r\n\r\n```python\r\nfrom youtube_transcript_api import YouTubeTranscriptApi\r\n\r\nytt_api = YouTubeTranscriptApi()\r\nytt_api.fetch(video_id)\r\n```\r\n\r\n\u003e **Note:** By default, this will try to access the English transcript of the video. If your video has a different \r\n\u003e language, or you are interested in fetching a transcript in a different language, please read the section below.\r\n\r\n\u003e **Note:** Pass in the video ID, NOT the video URL. 
For a video with the URL `https://www.youtube.com/watch?v=12345` \r\n\u003e the ID is `12345`.\r\n\r\nThis will return a `FetchedTranscript` object looking somewhat like this:\r\n\r\n```python\r\nFetchedTranscript(\r\n    snippets=[\r\n        FetchedTranscriptSnippet(\r\n            text=\"Hey there\",\r\n            start=0.0,\r\n            duration=1.54,\r\n        ),\r\n        FetchedTranscriptSnippet(\r\n            text=\"how are you\",\r\n            start=1.54,\r\n            duration=4.16,\r\n        ),\r\n        # ...\r\n    ],\r\n    video_id=\"12345\",\r\n    language=\"English\",\r\n    language_code=\"en\",\r\n    is_generated=False,\r\n)\r\n```\r\n\r\nThis object implements most interfaces of a `List`:\r\n\r\n```python\r\nytt_api = YouTubeTranscriptApi()\r\nfetched_transcript = ytt_api.fetch(video_id)\r\n\r\n# is iterable\r\nfor snippet in fetched_transcript:\r\n    print(snippet.text)\r\n\r\n# indexable\r\nlast_snippet = fetched_transcript[-1]\r\n\r\n# provides a length\r\nsnippet_count = len(fetched_transcript)\r\n```\r\n\r\nIf you prefer to handle the raw transcript data you can call `fetched_transcript.to_raw_data()`, which will return \r\na list of dictionaries:\r\n\r\n```python\r\n[\r\n    {\r\n        'text': 'Hey there',\r\n        'start': 0.0,\r\n        'duration': 1.54\r\n    },\r\n    {\r\n        'text': 'how are you',\r\n        'start': 1.54,\r\n        'duration': 4.16\r\n    },\r\n    # ...\r\n]\r\n```\r\n\r\n### Retrieve different languages\r\n\r\nYou can add the `languages` parameter if you want to make sure the transcripts are retrieved in your desired language \r\n(it defaults to English).\r\n\r\n```python\r\nYouTubeTranscriptApi().fetch(video_id, languages=['de', 'en'])\r\n```\r\n\r\nIt's a list of language codes in descending priority. In this example it will first try to fetch the German \r\ntranscript (`'de'`) and fall back to the English transcript (`'en'`) if it fails to do so. 
If you want to find out \r\nwhich languages are available first, [have a look at `list()`](#list-available-transcripts).\r\n\r\nIf you only want one language, you still need to pass the `languages` argument as a list:\r\n\r\n```python\r\nYouTubeTranscriptApi().fetch(video_id, languages=['de'])\r\n```\r\n\r\n### Preserve formatting\r\n\r\nYou can also add `preserve_formatting=True` if you'd like to keep HTML formatting elements such as `\u003ci\u003e` (italics) \r\nand `\u003cb\u003e` (bold).\r\n\r\n```python\r\nYouTubeTranscriptApi().fetch(video_id, languages=['de', 'en'], preserve_formatting=True)\r\n```\r\n\r\n### List available transcripts\r\n\r\nIf you want to list all transcripts which are available for a given video you can call:\r\n\r\n```python\r\nytt_api = YouTubeTranscriptApi()\r\ntranscript_list = ytt_api.list(video_id)\r\n```\r\n\r\nThis will return a `TranscriptList` object which is iterable and provides methods to filter the list of transcripts for \r\nspecific languages and types, like:\r\n\r\n```python\r\ntranscript = transcript_list.find_transcript(['de', 'en'])\r\n```\r\n\r\nBy default, this module prefers manually created transcripts over automatically generated ones when both are \r\navailable in the requested language. The `TranscriptList` allows you to bypass this \r\ndefault behaviour by searching for specific transcript types:\r\n\r\n```python\r\n# filter for manually created transcripts\r\ntranscript = transcript_list.find_manually_created_transcript(['de', 'en'])\r\n\r\n# or automatically generated ones\r\ntranscript = transcript_list.find_generated_transcript(['de', 'en'])\r\n```\r\n\r\nThe methods `find_generated_transcript`, `find_manually_created_transcript` and `find_transcript` return `Transcript` \r\nobjects. 
They contain metadata regarding the transcript:\r\n\r\n```python\r\nprint(\r\n    transcript.video_id,\r\n    transcript.language,\r\n    transcript.language_code,\r\n    # whether it has been manually created or generated by YouTube\r\n    transcript.is_generated,\r\n    # whether this transcript can be translated or not\r\n    transcript.is_translatable,\r\n    # a list of languages the transcript can be translated to\r\n    transcript.translation_languages,\r\n)\r\n```\r\n\r\nThey also provide a `fetch()` method, which allows you to fetch the actual transcript data:\r\n\r\n```python\r\ntranscript.fetch()\r\n```\r\n\r\nThis returns a `FetchedTranscript` object, just like `YouTubeTranscriptApi().fetch()` does.\r\n\r\n### Translate transcript\r\n\r\nYouTube has a feature which allows you to automatically translate subtitles. This module also makes it possible to \r\naccess this feature. To do so, `Transcript` objects provide a `translate()` method, which returns a new translated \r\n`Transcript` object:\r\n\r\n```python\r\ntranscript = transcript_list.find_transcript(['en'])\r\ntranslated_transcript = transcript.translate('de')\r\nprint(translated_transcript.fetch())\r\n```\r\n\r\n### By example\r\n\r\n```python\r\nfrom youtube_transcript_api import YouTubeTranscriptApi\r\n\r\nytt_api = YouTubeTranscriptApi()\r\n\r\n# retrieve the available transcripts\r\ntranscript_list = ytt_api.list('video_id')\r\n\r\n# iterate over all available transcripts\r\nfor transcript in transcript_list:\r\n\r\n    # the Transcript object provides metadata properties\r\n    print(\r\n        transcript.video_id,\r\n        transcript.language,\r\n        transcript.language_code,\r\n        # whether it has been manually created or generated by YouTube\r\n        transcript.is_generated,\r\n        # whether this transcript can be translated or not\r\n        transcript.is_translatable,\r\n        # a list of languages the transcript can be translated to\r\n        transcript.translation_languages,\r\n    
)\r\n\r\n    # fetch the actual transcript data\r\n    print(transcript.fetch())\r\n\r\n    # translating the transcript will return another transcript object\r\n    print(transcript.translate('en').fetch())\r\n\r\n# you can also directly filter for the language you are looking for, using the transcript list\r\ntranscript = transcript_list.find_transcript(['de', 'en'])\r\n\r\n# or just filter for manually created transcripts\r\ntranscript = transcript_list.find_manually_created_transcript(['de', 'en'])\r\n\r\n# or automatically generated ones\r\ntranscript = transcript_list.find_generated_transcript(['de', 'en'])\r\n```\r\n\r\n## Working around IP bans (`RequestBlocked` or `IpBlocked` exception)\r\n\r\nUnfortunately, YouTube has started blocking most IPs that are known to belong to cloud providers (like AWS, Google Cloud \r\nPlatform, Azure, etc.), which means you will most likely run into `RequestBlocked` or `IpBlocked` exceptions when \r\ndeploying your code to any cloud solution. The same can happen to the IP of your self-hosted solution if you send \r\ntoo many requests. You can work around these IP bans using proxies. 
However, since YouTube will ban static proxies \r\nafter extended use, rotating residential proxies are the most reliable option.\r\n\r\nThere are different providers that offer rotating residential proxies, but after testing different \r\nofferings I have found [Webshare](https://www.webshare.io/?referral_code=w0xno53eb50g) to be the most reliable and have \r\ntherefore integrated it into this module, to make setting it up as easy as possible.\r\n\r\n### Using [Webshare](https://www.webshare.io/?referral_code=w0xno53eb50g)\r\n\r\nOnce you have created a [Webshare account](https://www.webshare.io/?referral_code=w0xno53eb50g) and purchased a \r\n\"Residential\" proxy package that suits your workload (make sure NOT to purchase \"Proxy Server\" or \r\n\"Static Residential\"!), open the \r\n[Webshare Proxy Settings](https://dashboard.webshare.io/proxy/settings?referral_code=w0xno53eb50g) to retrieve \r\nyour \"Proxy Username\" and \"Proxy Password\". Using this information you can initialize the `YouTubeTranscriptApi` as \r\nfollows:\r\n\r\n```python\r\nfrom youtube_transcript_api import YouTubeTranscriptApi\r\nfrom youtube_transcript_api.proxies import WebshareProxyConfig\r\n\r\nytt_api = YouTubeTranscriptApi(\r\n    proxy_config=WebshareProxyConfig(\r\n        proxy_username=\"\u003cproxy-username\u003e\",\r\n        proxy_password=\"\u003cproxy-password\u003e\",\r\n    )\r\n)\r\n\r\n# all requests done by ytt_api will now be proxied through Webshare\r\nytt_api.fetch(video_id)\r\n```\r\n\r\nUsing the `WebshareProxyConfig` will default to using rotating residential proxies and requires no further \r\nconfiguration.\r\n\r\nYou can also limit the pool of IPs that you will be rotating through to those located in specific countries. By \r\nchoosing locations that are close to the machine that is running your code, you can reduce latency. Also, this \r\ncan be used to work around location-based restrictions. 
\r\n\r\n```python\r\nytt_api = YouTubeTranscriptApi(\r\n    proxy_config=WebshareProxyConfig(\r\n        proxy_username=\"\u003cproxy-username\u003e\",\r\n        proxy_password=\"\u003cproxy-password\u003e\",\r\n        filter_ip_locations=[\"de\", \"us\"],\r\n    )\r\n)\r\n\r\n# Webshare will now only rotate through IPs located in Germany or the United States!\r\nytt_api.fetch(video_id)\r\n```\r\n\r\nYou can find the \r\nfull list of available locations (and how many IPs are available in each location) \r\n[here](https://www.webshare.io/features/proxy-locations?referral_code=w0xno53eb50g).\r\n\r\nNote that [referral links are used here](https://www.webshare.io/?referral_code=w0xno53eb50g) and any purchases \r\nmade through these links will support this Open Source project (at no additional cost of course!), which is very much \r\nappreciated! 💖😊🙏💖\r\n\r\nHowever, you are of course free to integrate your own proxy solution using the `GenericProxyConfig` class, if you \r\nprefer using another provider or want to implement your own solution, as covered by the following section.\r\n\r\n### Using other Proxy solutions\r\n\r\nAs an alternative to [Webshare](#using-webshare), you can set up any generic HTTP/HTTPS/SOCKS proxy using the \r\n`GenericProxyConfig` class:\r\n\r\n```python\r\nfrom youtube_transcript_api import YouTubeTranscriptApi\r\nfrom youtube_transcript_api.proxies import GenericProxyConfig\r\n\r\nytt_api = YouTubeTranscriptApi(\r\n    proxy_config=GenericProxyConfig(\r\n        http_url=\"http://user:pass@my-custom-proxy.org:port\",\r\n        https_url=\"https://user:pass@my-custom-proxy.org:port\",\r\n    )\r\n)\r\n\r\n# all requests done by ytt_api will now be proxied using the defined proxy URLs\r\nytt_api.fetch(video_id)\r\n```\r\n\r\nBe aware that using a proxy doesn't guarantee that you won't be blocked, as YouTube can always block the IP of your \r\nproxy! 
Therefore, you should always choose a solution that rotates through a pool of proxy addresses, if you want to\r\nmaximize reliability.\r\n\r\n## Overwriting request defaults\r\n\r\nWhen initializing a `YouTubeTranscriptApi` object, it will create a `requests.Session` which will be used for all\r\nHTTP(S) requests. This allows cookies to be cached across multiple requests. However, you can optionally pass a\r\n`requests.Session` object into its constructor, if you manually want to share cookies between different instances of\r\n`YouTubeTranscriptApi`, overwrite defaults, set custom headers, specify SSL certificates, etc.\r\n\r\n```python\r\nfrom requests import Session\r\n\r\nfrom youtube_transcript_api import YouTubeTranscriptApi\r\n\r\nhttp_client = Session()\r\n\r\n# set custom header\r\nhttp_client.headers.update({\"Accept-Encoding\": \"gzip, deflate\"})\r\n\r\n# set path to CA_BUNDLE file\r\nhttp_client.verify = \"/path/to/certfile\"\r\n\r\nytt_api = YouTubeTranscriptApi(http_client=http_client)\r\nytt_api.fetch(video_id)\r\n\r\n# share same Session between two instances of YouTubeTranscriptApi\r\nytt_api_2 = YouTubeTranscriptApi(http_client=http_client)\r\n# now shares cookies with ytt_api\r\nytt_api_2.fetch(video_id)\r\n```\r\n\r\n## Cookie Authentication\r\n\r\nSome videos are age restricted, so this module won't be able to access those videos without some sort of\r\nauthentication. Unfortunately, some recent changes to the YouTube API have broken the current implementation of \r\ncookie-based authentication, so this feature is currently not available.\r\n\r\n## Using Formatters\r\n\r\nFormatters are meant to be an additional layer of processing for the transcript you pass to them. The goal is to convert a\r\n`FetchedTranscript` object into a consistent string of a given \"format\". 
Examples include basic text (`.txt`) or formats \r\nthat have a defined specification, such as JSON (`.json`), WebVTT (`.vtt`), SRT (`.srt`), comma-separated values \r\n(`.csv`), etc.\r\n\r\nThe `formatters` submodule provides a few basic formatters, which can be used as is, or extended to your needs:\r\n\r\n- JSONFormatter\r\n- PrettyPrintFormatter\r\n- TextFormatter\r\n- WebVTTFormatter\r\n- SRTFormatter\r\n\r\nHere is how to import from the `formatters` module:\r\n\r\n```python\r\n# the base class to inherit from when creating your own formatter.\r\nfrom youtube_transcript_api.formatters import Formatter\r\n\r\n# some provided subclasses, each outputs a different string format.\r\nfrom youtube_transcript_api.formatters import JSONFormatter\r\nfrom youtube_transcript_api.formatters import TextFormatter\r\nfrom youtube_transcript_api.formatters import WebVTTFormatter\r\nfrom youtube_transcript_api.formatters import SRTFormatter\r\n```\r\n\r\n### Formatter Example\r\n\r\nLet's say we wanted to retrieve a transcript and store it to a JSON file. 
That would look something like this:\r\n\r\n```python\r\n# your_custom_script.py\r\n\r\nfrom youtube_transcript_api import YouTubeTranscriptApi\r\nfrom youtube_transcript_api.formatters import JSONFormatter\r\n\r\nytt_api = YouTubeTranscriptApi()\r\ntranscript = ytt_api.fetch(video_id)\r\n\r\nformatter = JSONFormatter()\r\n\r\n# .format_transcript(transcript) turns the transcript into a JSON string.\r\njson_formatted = formatter.format_transcript(transcript)\r\n\r\n# Now we can write it out to a file.\r\nwith open('your_filename.json', 'w', encoding='utf-8') as json_file:\r\n    json_file.write(json_formatted)\r\n\r\n# You should now have a new JSON file that you can easily read back into Python.\r\n```\r\n\r\n**Passing extra keyword arguments**\r\n\r\nSince JSONFormatter leverages `json.dumps()`, you can also forward keyword arguments into \r\n`.format_transcript(transcript)`, for example `indent=2` to make the file output prettier.\r\n\r\n```python\r\njson_formatted = JSONFormatter().format_transcript(transcript, indent=2)\r\n```\r\n\r\n### Custom Formatter Example\r\nYou can implement your own formatter class. 
Just inherit from the `Formatter` base class and ensure you implement the \r\n`format_transcript(self, transcript: FetchedTranscript, **kwargs) -\u003e str` and \r\n`format_transcripts(self, transcripts: List[FetchedTranscript], **kwargs) -\u003e str` methods, which should ultimately \r\nreturn a string when called on your formatter instance.\r\n\r\n```python\r\nfrom typing import List\r\n\r\nfrom youtube_transcript_api import FetchedTranscript\r\nfrom youtube_transcript_api.formatters import Formatter\r\n\r\n\r\nclass MyCustomFormatter(Formatter):\r\n    def format_transcript(self, transcript: FetchedTranscript, **kwargs) -\u003e str:\r\n        # Do your custom work in here, but return a string.\r\n        return 'your processed output data as a string.'\r\n\r\n    def format_transcripts(self, transcripts: List[FetchedTranscript], **kwargs) -\u003e str:\r\n        # Do your custom work in here to format a list of transcripts, but return a string.\r\n        return 'your processed output data as a string.'\r\n```\r\n\r\n## CLI\r\n\r\nExecute the CLI script using the video IDs as parameters and the results will be printed out to the command line:  \r\n\r\n```  \r\nyoutube_transcript_api \u003cfirst_video_id\u003e \u003csecond_video_id\u003e ...  \r\n```  \r\n\r\nThe CLI also gives you the option to provide a list of preferred languages:  \r\n\r\n```  \r\nyoutube_transcript_api \u003cfirst_video_id\u003e \u003csecond_video_id\u003e ... --languages de en  \r\n```\r\n\r\nYou can also specify if you want to exclude automatically generated or manually created subtitles:\r\n\r\n```  \r\nyoutube_transcript_api \u003cfirst_video_id\u003e \u003csecond_video_id\u003e ... --languages de en --exclude-generated\r\nyoutube_transcript_api \u003cfirst_video_id\u003e \u003csecond_video_id\u003e ... --languages de en --exclude-manually-created\r\n```\r\n\r\nIf you would prefer to write it into a file or pipe it into another application, you can also output the results as \r\nJSON using the following line:  \r\n\r\n```  \r\nyoutube_transcript_api \u003cfirst_video_id\u003e \u003csecond_video_id\u003e ... 
--languages de en --format json \u003e transcripts.json\r\n```  \r\n\r\nTranslating transcripts using the CLI is also possible:\r\n\r\n```  \r\nyoutube_transcript_api \u003cfirst_video_id\u003e \u003csecond_video_id\u003e ... --languages en --translate de\r\n```  \r\n\r\nIf you are not sure which languages are available for a given video, you can list all available transcripts with:\r\n\r\n```  \r\nyoutube_transcript_api --list-transcripts \u003cfirst_video_id\u003e\r\n```\r\n\r\nIf a video's ID starts with a hyphen, you'll have to mask the hyphen using `\\` to prevent the CLI from mistaking it for \r\nan argument name. For example, to get the transcript for the video with the ID `-abc123`, run:\r\n\r\n```\r\nyoutube_transcript_api \"\\-abc123\"\r\n```\r\n\r\n### Working around IP bans using the CLI\r\n\r\nIf you are running into `RequestBlocked` or `IpBlocked` errors because YouTube blocks your IP, you can work around this \r\nusing residential proxies as explained in \r\n[Working around IP bans](#working-around-ip-bans-requestblocked-or-ipblocked-exception). To use\r\n[Webshare \"Residential\" proxies](https://www.webshare.io/?referral_code=w0xno53eb50g) through the CLI, you will have to \r\ncreate a [Webshare account](https://www.webshare.io/?referral_code=w0xno53eb50g) and purchase a \"Residential\" proxy \r\npackage that suits your workload (make sure NOT to purchase \"Proxy Server\" or \"Static Residential\"!). 
Then you can use \r\nthe \"Proxy Username\" and \"Proxy Password\", which you can find in your \r\n[Webshare Proxy Settings](https://dashboard.webshare.io/proxy/settings?referral_code=w0xno53eb50g), to run the following command:\r\n\r\n```\r\nyoutube_transcript_api \u003cfirst_video_id\u003e \u003csecond_video_id\u003e --webshare-proxy-username \"username\" --webshare-proxy-password \"password\"\r\n```\r\n\r\nIf you prefer to use another proxy solution, you can set up a generic HTTP/HTTPS proxy using the following command:\r\n\r\n```\r\nyoutube_transcript_api \u003cfirst_video_id\u003e \u003csecond_video_id\u003e --http-proxy http://user:pass@domain:port --https-proxy https://user:pass@domain:port\r\n```\r\n\r\n### Cookie Authentication using the CLI\r\n\r\nTo authenticate using cookies through the CLI as explained in [Cookie Authentication](#cookie-authentication), run:\r\n\r\n```\r\nyoutube_transcript_api \u003cfirst_video_id\u003e \u003csecond_video_id\u003e --cookies /path/to/your/cookies.txt\r\n```\r\n\r\n## Warning  \r\n\r\nThis code uses an undocumented part of the YouTube API, which is called by the YouTube web-client. So there is no \r\nguarantee that it won't stop working tomorrow, if YouTube changes how things work. I will however do my best to get things \r\nworking again as soon as possible if that happens. So if it stops working, let me know!  
\r\n\r\n## Contributing\r\n\r\nTo set up the project locally, run the following (requires [poetry](https://python-poetry.org/docs/) to be installed):\r\n```shell\r\npoetry install --with test,dev\r\n```\r\n\r\nThere are [poe](https://github.com/nat-n/poethepoet?tab=readme-ov-file#quick-start) tasks to run tests, coverage, the \r\nlinter and formatter (you'll need to pass all of those for the build to pass):\r\n```shell\r\npoe test\r\npoe coverage\r\npoe format\r\npoe lint\r\n```\r\n\r\nIf you just want to make sure that your code passes all the necessary checks to get a green build, you can simply run:\r\n```shell\r\npoe precommit\r\n```\r\n\r\n## Donations\r\n\r\nIf this project makes you happy by reducing your development time, you can make me happy by treating me to a cup of \r\ncoffee, or by becoming a [Sponsor of this project](https://github.com/sponsors/jdepoix) :)  \r\n\r\n[![Donate](https://www.paypalobjects.com/en_US/i/btn/btn_donateCC_LG.gif)](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick\u0026hosted_button_id=BAENLEW8VUJ6G\u0026source=url)\r\n\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjdepoix%2Fyoutube-transcript-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjdepoix%2Fyoutube-transcript-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjdepoix%2Fyoutube-transcript-api/lists"}