{"id":13523464,"url":"https://github.com/ArthurFDLR/whisper-youtube","last_synced_at":"2025-04-01T00:31:47.035Z","repository":{"id":61616853,"uuid":"544273829","full_name":"ArthurFDLR/whisper-youtube","owner":"ArthurFDLR","description":"🔉 Youtube Videos Transcription with OpenAI's Whisper","archived":false,"fork":false,"pushed_at":"2024-04-23T19:24:04.000Z","size":127,"stargazers_count":357,"open_issues_count":2,"forks_count":105,"subscribers_count":7,"default_branch":"main","last_synced_at":"2024-10-10T19:20:25.477Z","etag":null,"topics":["automatic-speech-recognition","colab-notebook","speech-recognition","speech-to-text","transformer","whisper","youtube"],"latest_commit_sha":null,"homepage":"https://colab.research.google.com/github/ArthurFDLR/whisper-youtube/blob/main/whisper_youtube.ipynb","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ArthurFDLR.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2022-10-02T04:14:24.000Z","updated_at":"2024-10-10T15:19:14.000Z","dependencies_parsed_at":"2024-04-12T04:44:35.350Z","dependency_job_id":"0116f2c0-ffb0-456f-b9ed-bcc525953e52","html_url":"https://github.com/ArthurFDLR/whisper-youtube","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArthurFDLR%2Fwhisper-youtube","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArthurFDLR%2Fwhisper-youtube/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArthurFDLR%2Fwhisper-youtub
e/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArthurFDLR%2Fwhisper-youtube/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ArthurFDLR","download_url":"https://codeload.github.com/ArthurFDLR/whisper-youtube/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":222688173,"owners_count":17023297,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automatic-speech-recognition","colab-notebook","speech-recognition","speech-to-text","transformer","whisper","youtube"],"created_at":"2024-08-01T06:01:00.375Z","updated_at":"2024-11-02T07:31:42.746Z","avatar_url":"https://github.com/ArthurFDLR.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook","Applications","Repositories","Playgrounds","Youtube","By Use Case"],"sub_categories":["Self-hosted","Subtitles \u0026 Captioning"],"readme":"# **Youtube Videos Transcription with OpenAI's Whisper**\n\n[![blog post shield](https://img.shields.io/static/v1?label=\u0026message=Blog%20post\u0026color=blue\u0026style=for-the-badge\u0026logo=openai\u0026link=https://openai.com/blog/whisper)](https://openai.com/blog/whisper)\n[![notebook shield](https://img.shields.io/static/v1?label=\u0026message=Notebook\u0026color=blue\u0026style=for-the-badge\u0026logo=googlecolab\u0026link=https://colab.research.google.com/github/ArthurFDLR/whisper-youtube/blob/main/whisper_youtube.ipynb)](https://colab.research.google.com/github/ArthurFDLR/whisper-youtube/blob/main/whisper_youtube.ipynb)\n[![repository 
shield](https://img.shields.io/static/v1?label=\u0026message=Repository\u0026color=blue\u0026style=for-the-badge\u0026logo=github\u0026link=https://github.com/openai/whisper)](https://github.com/openai/whisper)\n[![paper shield](https://img.shields.io/static/v1?label=\u0026message=Paper\u0026color=blue\u0026style=for-the-badge\u0026link=https://cdn.openai.com/papers/whisper.pdf)](https://cdn.openai.com/papers/whisper.pdf)\n[![model card shield](https://img.shields.io/static/v1?label=\u0026message=Model%20card\u0026color=blue\u0026style=for-the-badge\u0026link=https://github.com/openai/whisper/blob/main/model-card.md)](https://github.com/openai/whisper/blob/main/model-card.md)\n\nWhisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.\n\nThis notebook will guide you through the transcription of a Youtube video using Whisper. 
You'll be able to explore most inference parameters or use the Notebook as-is to store the transcript and the audio of the video in your Google Drive.\n\n\n# **Check GPU type** 🕵️\n\nThe type of GPU you get assigned in your Colab session determines the speed at which the video will be transcribed.\nThe higher the number of floating point operations per second (FLOPS), the faster the transcription.\nBut even the least powerful GPU available in Colab is able to run any Whisper model.\nMake sure you've selected `GPU` as the hardware accelerator for the Notebook (Runtime \u0026rarr; Change runtime type \u0026rarr; Hardware accelerator).\n\n|  GPU   |  GPU RAM   | FP32 teraFLOPS |     Availability   |\n|:------:|:----------:|:--------------:|:------------------:|\n|  T4    |    16 GB   |       8.1      |         Free       |\n| P100   |    16 GB   |      10.6      |      Colab Pro     |\n| V100   |    16 GB   |      15.7      |  Colab Pro (Rare)  |\n\n---\n**Factory reset your Notebook's runtime if you want to get assigned a new GPU.**\n\n\n```\n    GPU 0: Tesla T4 (UUID: GPU-9ba4ce04-e020-44f9-8fc3-337ba5bb5496)\n    Sun Oct  2 16:49:51 2022       \n    +-----------------------------------------------------------------------------+\n    | NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |\n    |-------------------------------+----------------------+----------------------+\n    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n    |                               |                      |               MIG M. 
|\n    |===============================+======================+======================|\n    |   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |\n    | N/A   36C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |\n    |                               |                      |                  N/A |\n    +-------------------------------+----------------------+----------------------+\n                                                                                   \n    +-----------------------------------------------------------------------------+\n    | Processes:                                                                  |\n    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |\n    |        ID   ID                                                   Usage      |\n    |=============================================================================|\n    |  No running processes found                                                 |\n    +-----------------------------------------------------------------------------+\n```\n\n\n# **Install libraries** 🏗️\nThis cell will take a little while to download several libraries, including Whisper.\n\n---\n\n```\n    Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n    Collecting git+https://github.com/openai/whisper.git\n      Cloning https://github.com/openai/whisper.git to /tmp/pip-req-build-c3voj3wy\n      Running command git clone -q https://github.com/openai/whisper.git /tmp/pip-req-build-c3voj3wy\n    Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from whisper==1.0) (1.21.6)\n    Requirement already satisfied: torch in /usr/local/lib/python3.7/dist-packages (from whisper==1.0) (1.12.1+cu113)\n    Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from whisper==1.0) (4.64.1)\n    Requirement already satisfied: more-itertools in 
/usr/local/lib/python3.7/dist-packages (from whisper==1.0) (8.14.0)\n    Requirement already satisfied: transformers\u003e=4.19.0 in /usr/local/lib/python3.7/dist-packages (from whisper==1.0) (4.22.2)\n    Requirement already satisfied: ffmpeg-python==0.2.0 in /usr/local/lib/python3.7/dist-packages (from whisper==1.0) (0.2.0)\n    Requirement already satisfied: future in /usr/local/lib/python3.7/dist-packages (from ffmpeg-python==0.2.0-\u003ewhisper==1.0) (0.16.0)\n    Requirement already satisfied: huggingface-hub\u003c1.0,\u003e=0.9.0 in /usr/local/lib/python3.7/dist-packages (from transformers\u003e=4.19.0-\u003ewhisper==1.0) (0.10.0)\n    Requirement already satisfied: pyyaml\u003e=5.1 in /usr/local/lib/python3.7/dist-packages (from transformers\u003e=4.19.0-\u003ewhisper==1.0) (6.0)\n    Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/dist-packages (from transformers\u003e=4.19.0-\u003ewhisper==1.0) (4.12.0)\n    Requirement already satisfied: tokenizers!=0.11.3,\u003c0.13,\u003e=0.11.1 in /usr/local/lib/python3.7/dist-packages (from transformers\u003e=4.19.0-\u003ewhisper==1.0) (0.12.1)\n    Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from transformers\u003e=4.19.0-\u003ewhisper==1.0) (2.23.0)\n    Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.7/dist-packages (from transformers\u003e=4.19.0-\u003ewhisper==1.0) (2022.6.2)\n    Requirement already satisfied: packaging\u003e=20.0 in /usr/local/lib/python3.7/dist-packages (from transformers\u003e=4.19.0-\u003ewhisper==1.0) (21.3)\n    Requirement already satisfied: filelock in /usr/local/lib/python3.7/dist-packages (from transformers\u003e=4.19.0-\u003ewhisper==1.0) (3.8.0)\n    Requirement already satisfied: typing-extensions\u003e=3.7.4.3 in /usr/local/lib/python3.7/dist-packages (from huggingface-hub\u003c1.0,\u003e=0.9.0-\u003etransformers\u003e=4.19.0-\u003ewhisper==1.0) (4.1.1)\n    Requirement already 
satisfied: pyparsing!=3.0.5,\u003e=2.0.2 in /usr/local/lib/python3.7/dist-packages (from packaging\u003e=20.0-\u003etransformers\u003e=4.19.0-\u003ewhisper==1.0) (3.0.9)\n    Requirement already satisfied: zipp\u003e=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata-\u003etransformers\u003e=4.19.0-\u003ewhisper==1.0) (3.8.1)\n    Requirement already satisfied: certifi\u003e=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests-\u003etransformers\u003e=4.19.0-\u003ewhisper==1.0) (2022.6.15)\n    Requirement already satisfied: idna\u003c3,\u003e=2.5 in /usr/local/lib/python3.7/dist-packages (from requests-\u003etransformers\u003e=4.19.0-\u003ewhisper==1.0) (2.10)\n    Requirement already satisfied: chardet\u003c4,\u003e=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests-\u003etransformers\u003e=4.19.0-\u003ewhisper==1.0) (3.0.4)\n    Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,\u003c1.26,\u003e=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests-\u003etransformers\u003e=4.19.0-\u003ewhisper==1.0) (1.24.3)\n    Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n    Requirement already satisfied: pytube in /usr/local/lib/python3.7/dist-packages (12.1.0)\n\n\n    Using device: cuda:0\n```\n\n# **Optional:** Save results in Google Drive 💾\nEnter a Google Drive path and run this cell if you want to store the results inside Google Drive.\n\n---\n\n```drive_path = \"Colab Notebooks/Whisper Youtube\"```\n\n---\n**Run this cell again if you change your Google Drive path.**\n\n\n\n# **Model selection** 🧠\n\nAs of the first public release, there are 5 pre-trained model sizes to choose from, 4 of which also have an English-only variant:\n\n|  Size  | Parameters | English-only model | Multilingual model | Required VRAM | Relative speed |\n|:------:|:----------:|:------------------:|:------------------:|:-------------:|:--------------:|\n|  tiny  |    39 M    |     `tiny.en`      |       `tiny`       |     ~1 GB     
|      ~32x      |\n|  base  |    74 M    |     `base.en`      |       `base`       |     ~1 GB     |      ~16x      |\n| small  |   244 M    |     `small.en`     |      `small`       |     ~2 GB     |      ~6x       |\n| medium |   769 M    |    `medium.en`     |      `medium`      |     ~5 GB     |      ~2x       |\n| large  |   1550 M   |        N/A         |      `large`       |    ~10 GB     |       1x       |\n\n---\n\n```Model = 'large'```\n\n---\n**Run this cell again if you change the model.**\n\n```\n    100%|█████████████████████████████████████| 2.87G/2.87G [01:14\u003c00:00, 41.5MiB/s]\n```\n\n**large model is selected.**\n\n\n\n# **Video selection** 📺\n\nEnter the URL of the Youtube video you want to transcribe, choose whether to save the audio file in your Google Drive, and run the cell.\n\n---\n\n```URL = \"https://youtu.be/dQw4w9WgXcQ\"```\n\n```store_audio = True```\n\n---\n**Run this cell again if you change the video.**\n\n# **Run the model** 🚀\n\nRun this cell to execute the transcription of the video. 
This can take a while: the runtime depends mainly on the length of the video and the number of parameters of the model selected above.\n\n---\n\n```Language = \"English\"```\n\n```Output_type = '.vtt'```\n\n---\n\n```\n    [00:00.000 --\u003e 00:22.000]  We're no strangers to love.\n    [00:22.000 --\u003e 00:27.000]  You know the rules, and so do I.\n    [00:27.000 --\u003e 00:31.000]  Our full commitments while I'm thinking of.\n    [00:31.000 --\u003e 00:35.000]  You wouldn't get this from any other guy.\n    [00:35.000 --\u003e 00:40.000]  I just wanna tell you how I'm feeling.\n    [00:40.000 --\u003e 00:43.000]  Gotta make you understand.\n    [00:43.000 --\u003e 00:45.000]  Never gonna give you up.\n    [00:45.000 --\u003e 00:47.000]  Never gonna let you down.\n    [00:47.000 --\u003e 00:51.000]  Never gonna run around and desert you.\n    [00:51.000 --\u003e 00:53.000]  Never gonna make you cry.\n    [00:53.000 --\u003e 00:55.000]  Never gonna say goodbye.\n    [00:55.000 --\u003e 01:00.000]  Never gonna tell a lie and hurt you.\n    [01:00.000 --\u003e 01:04.000]  We've known each other for so long.\n    [01:04.000 --\u003e 01:09.000]  Your heart's been aching, but you're too shy to say it.\n    [01:09.000 --\u003e 01:13.000]  Inside we both know what's been going on.\n    [01:13.000 --\u003e 01:17.000]  We know the game and we're gonna play it.\n    [01:17.000 --\u003e 01:22.000]  And if you ask me how I'm feeling.\n    [01:22.000 --\u003e 01:25.000]  Don't tell me you're too blind to see.\n    [01:25.000 --\u003e 01:27.000]  Never gonna give you up.\n    [01:27.000 --\u003e 01:29.000]  Never gonna let you down.\n    [01:29.000 --\u003e 01:33.000]  Never gonna run around and desert you.\n    [01:33.000 --\u003e 01:35.000]  Never gonna make you cry.\n    [01:35.000 --\u003e 01:38.000]  Never gonna say goodbye.\n    [01:38.000 --\u003e 01:41.000]  Never gonna tell a lie and hurt you.\n    [01:41.000 --\u003e 01:43.000]  Never gonna give you up.\n    [01:43.000 
--\u003e 01:46.000]  Never gonna let you down.\n    [01:46.000 --\u003e 01:50.000]  Never gonna run around and desert you.\n    [01:50.000 --\u003e 01:59.000]  Never gonna make you cry, never gonna say goodbye, never gonna tell a lie and hurt you\n    [01:59.000 --\u003e 02:07.000]  Give you love, give you love\n    [02:07.000 --\u003e 02:16.000]  Never gonna give, never gonna give, give you love\n    [02:16.000 --\u003e 02:25.000]  We've known each other for so long, your heart's been aching but you're too shy to say it\n    [02:25.000 --\u003e 02:33.000]  Inside we both know what's been going on, we know the game and we're gonna play it\n    [02:33.000 --\u003e 02:41.000]  I just wanna tell you how I'm feeling, gotta make you understand\n    [02:41.000 --\u003e 02:49.000]  Never gonna give you up, never gonna let you down, never gonna run around and desert you\n    [02:49.000 --\u003e 02:57.000]  Never gonna make you cry, never gonna say goodbye, never gonna tell a lie and hurt you\n    [02:57.000 --\u003e 03:06.000]  Never gonna give you up, never gonna let you down, never gonna run around and desert you\n    [03:06.000 --\u003e 03:14.500]  Never gonna make you cry, never gonna say goodbye, never gonna tell a lie, and hurt you.\n    [03:14.500 --\u003e 03:23.000]  Never gonna give you up, never gonna let you down, never gonna run around and desert you.\n    [03:23.000 --\u003e 03:27.500]  We're gonna make you cry, we're gonna say goodbye,\n    [03:27.500 --\u003e 03:53.400]  we're gonna say goodbye.\n```\n\n**Transcript file created: /content/drive/My Drive/Colab Notebooks/Whisper Youtube/dQw4w9WgXcQ.vtt**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FArthurFDLR%2Fwhisper-youtube","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FArthurFDLR%2Fwhisper-youtube","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FArthurFDLR%2Fwhisper-youtube/lists"}