{"id":19329761,"url":"https://github.com/outerbounds/whisper-metaflow-k8s","last_synced_at":"2025-04-12T18:24:55.050Z","repository":{"id":154452978,"uuid":"578891032","full_name":"outerbounds/whisper-metaflow-k8s","owner":"outerbounds","description":null,"archived":false,"fork":false,"pushed_at":"2023-10-30T15:51:12.000Z","size":454,"stargazers_count":4,"open_issues_count":0,"forks_count":2,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-26T12:46:38.417Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/outerbounds.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-12-16T06:01:45.000Z","updated_at":"2024-10-27T04:20:26.000Z","dependencies_parsed_at":null,"dependency_job_id":"5efe11bd-4ba9-4997-812f-6aa9fad2bef5","html_url":"https://github.com/outerbounds/whisper-metaflow-k8s","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/outerbounds%2Fwhisper-metaflow-k8s","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/outerbounds%2Fwhisper-metaflow-k8s/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/outerbounds%2Fwhisper-metaflow-k8s/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/outerbounds%2Fwhisper-metaflow-k8s/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/outerbounds","download_url":"https:
//codeload.github.com/outerbounds/whisper-metaflow-k8s/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248611454,"owners_count":21133112,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T02:29:48.590Z","updated_at":"2025-04-12T18:24:55.028Z","avatar_url":"https://github.com/outerbounds.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# OpenAI Whisper on Metaflow 👋\n\nThis repository will help you optimize OpenAI's [Whisper](https://github.com/openai/whisper) in workflows run on the [Outerbounds Platform](https://outerbounds.com/blog/announcing-outerbounds-platform/). It builds on our earlier repository to help you [Get started with Whisper and Metaflow](https://github.com/outerbounds/whisper-metaflow). This implementation focuses on using Kubernetes resources to scale out transcription and increase processing throughput. 
\n\n\u003cimg src=\"./static/Whisper-Cover.png\" width=600\u003e \u003c/img\u003e\n\n## Repository Overview\n\n| File           | Description |\n|------------------|-------------|\n| [Dockerfile](./Dockerfile)   | Dockerfile to create a Docker image for running OpenAI Whisper |\n| [Makefile](./Makefile)   | Makefile for building the Docker image |\n| [youtube\\_video\\_transcriber.py](./youtube_video_transcriber.py)   | CLI tool for creating a transcript of a given YouTube URL with a given model |\n| [whisper_flow.py](./whisper_flow.py) | Metaflow flow for creating transcripts using Whisper's tiny and large models |\n\n# Run the flow ▶️\n\n\u003cimg src=\"./static/MLSys-02.png\" width=600\u003e \u003c/img\u003e\n\n## Running with Kubernetes resources\nTo run the flow in the cloud with [Metaflow's Kubernetes decorator](https://docs.metaflow.org/scaling/remote-tasks/kubernetes), run this command from your terminal. \n\nThis uses a prebuilt Docker image that is ready to run this flow.\n\n```\n$ python3 whisper_flow.py run --with kubernetes:image=public.ecr.aws/outerbounds/whisper-metaflow:latest\n```\n\n# Customizing flow dependencies ⚙️\n\nThis section assumes you have Docker set up and running locally. If you don't have Docker installed, please follow the instructions [here](https://docs.docker.com/get-docker/). To install other packages or change existing ones, update the Dockerfile.\n\n## Create the Docker image\nWith Docker running, build the image specified in the `./Dockerfile`. 
\n\n```\n$ make build\n...\n =\u003e =\u003e writing image sha256:23be1b523a3404d8bee8e4c8ac29f7160ac7ad7090d48c567010a34cb9f2666e                                                           0.0s\n =\u003e =\u003e naming to docker.io/library/whisper-metaflow                                                                                                    0.0s\n```\n\n## Tag and push the Docker image to a repository\nThen tag the resulting image and push it to an image registry. In this example, we are using GitLab's container registry.\n```\n$ docker tag sha256:23be1b523a3404d8bee8e4c8ac29f7160ac7ad7090d48c567010a34cb9f2666e whisper-metaflow:latest\n...\n\n$ docker push whisper-metaflow\n```\n\n## Run the flow with the customized image\n\n```\n$ python3 whisper_flow.py run --with kubernetes:image=whisper-metaflow:latest\n```\n\n## Run the flow with the customized image and changed CPU/memory resources\n\n```\n$ python3 whisper_flow.py run --with kubernetes:image=whisper-metaflow:latest,cpu=4,memory=8192\n```\n\n## Alternate approach\nInstead of passing the CLI options above, you can edit `whisper_flow.py` to add the `@kubernetes` decorator to the appropriate steps, and then run the flow as:\n\n```\n$ python3 whisper_flow.py run\n```\n  \n# Get Help 🤗\nPlease join us on [Slack](http://slack.outerbounds.co/) if you have questions about getting set up. The Metaflow community is responsive and happy to help!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fouterbounds%2Fwhisper-metaflow-k8s","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fouterbounds%2Fwhisper-metaflow-k8s","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fouterbounds%2Fwhisper-metaflow-k8s/lists"}