{"id":13585186,"url":"https://github.com/maxbbraun/whisper-edge","last_synced_at":"2025-04-07T06:32:51.011Z","repository":{"id":221768272,"uuid":"614351714","full_name":"maxbbraun/whisper-edge","owner":"maxbbraun","description":"OpenAI Whisper for edge devices","archived":false,"fork":false,"pushed_at":"2023-03-21T08:27:11.000Z","size":3703,"stargazers_count":115,"open_issues_count":7,"forks_count":19,"subscribers_count":9,"default_branch":"main","last_synced_at":"2024-11-06T02:43:55.684Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/maxbbraun.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-03-15T12:16:47.000Z","updated_at":"2024-11-05T15:18:18.000Z","dependencies_parsed_at":"2024-02-10T00:04:51.563Z","dependency_job_id":null,"html_url":"https://github.com/maxbbraun/whisper-edge","commit_stats":null,"previous_names":["maxbbraun/whisper-edge"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxbbraun%2Fwhisper-edge","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxbbraun%2Fwhisper-edge/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxbbraun%2Fwhisper-edge/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxbbraun%2Fwhisper-edge/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/maxbbraun","download_url":"https://codeload.github.com/maxbbraun/whisper-edge/tar.gz/refs/head
s/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247607566,"owners_count":20965944,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T15:04:47.109Z","updated_at":"2025-04-07T06:32:51.004Z","avatar_url":"https://github.com/maxbbraun.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Whisper Edge\n\nPorting [OpenAI Whisper](https://github.com/openai/whisper) speech recognition to edge devices with hardware ML accelerators, enabling always-on live voice transcription. Current work includes [Jetson Nano](#jetson-nano) and [Coral Edge TPU](#coral-edge-tpu).\n\n## Jetson Nano\n\n![Jetson Nano](media/jetson-nano.jpg)\n\n### Shopping cart\n\n| Part | Price (2023) |\n| :- | -: |\n| [NVIDIA Jetson Nano Developer Kit (4G)](https://developer.nvidia.com/embedded/jetson-nano-developer-kit) | [$149.00](https://www.amazon.com/NVIDIA-Jetson-Nano-Developer-945-13450-0000-100/dp/B084DSDDLT/) |\n| [ChanGeek CGS-M1 USB Microphone](https://www.amazon.com/gp/product/B08M37224H/ref=ppx_yo_dt_b_asin_title_o03_s00) | [$16.99](https://www.amazon.com/gp/product/B08M37224H/ref=ppx_yo_dt_b_asin_title_o03_s00) |\n| [Noctua NF-A4x10 5V Fan](https://noctua.at/en/products/fan/nf-a4x10-5v) (or similar, recommended) | [$13.95](https://www.amazon.com/Noctua-Cooling-Bearing-NF-A4X10-FLX-5V/dp/B00NEMGCIA/) |\n| [D-Link DWA-181 Wi-Fi Adapter](https://www.dlink.com/en/products/dwa-181-ac1300-mu-mimo-wi-fi-nano-usb-adapter) (or similar, optional) | 
[$21.94](https://www.amazon.com/D-Link-Wireless-Internet-Supported-DWA-181-US/dp/B07YYL3RYJ/) |\n\n### Model\n\nThe [`base.en` version](https://github.com/openai/whisper#available-models-and-languages) of Whisper seems to work best for the Jetson Nano:\n - `base` is the largest model size that fits into the 4GB of memory without modification.\n - Inference performance with `base` is ~10x real-time in isolation and ~1x real-time while recording concurrently.\n - Using the English-only `.en` version further lowers the word error rate (WER) ([\u003c5% on LibriSpeech test-clean](https://cdn.openai.com/papers/whisper.pdf)).\n\n### Hack\n\nDilemma:\n - Whisper and some of its dependencies require Python 3.8.\n - The latest supported version of [JetPack](https://developer.nvidia.com/embedded/jetpack) for Jetson Nano is [4.6.3](https://developer.nvidia.com/jetpack-sdk-463), which ships with Python 3.6.\n - There is [no easy way](https://github.com/maxbbraun/whisper-edge/issues/2) to update Python to 3.8 without losing CUDA support for PyTorch.\n\nWorkaround:\n - Fork [whisper](https://github.com/maxbbraun/whisper) and [tiktoken](https://github.com/maxbbraun/tiktoken), downgrading them to Python 3.6.\n\n### Setup\n\nFirst, follow the [developer kit setup instructions](https://developer.nvidia.com/embedded/learn/get-started-jetson-nano-devkit), connect the Wi-Fi adapter and the microphone to USB, and ideally [install a fan](https://noctua.at/en/nf-a4x10-flx/service). (Plugging in an Ethernet cable also helps speed up the downloads.) Then, get a shell on the Jetson Nano:\n\n```bash\nssh user@jetson-nano.local\n```\n\nWe will use [NVIDIA Docker containers](https://hub.docker.com/r/dustynv/jetson-inference/tags) to run inference. 
Get the source code and build the custom container:\n\n```bash\ngit clone https://github.com/maxbbraun/whisper-edge.git\nbash whisper-edge/build.sh\n```\n\n### Run\n\nLaunch inference:\n\n```bash\nbash whisper-edge/run.sh\n```\n\nYou should see console output similar to this:\n\n```bash\nI0317 00:42:23.979984 547488051216 stream.py:75] Loading model \"base.en\"...\n100%|#######################################| 139M/139M [00:30\u003c00:00, 4.71MiB/s]\nI0317 00:43:14.232425 547488051216 stream.py:79] Warming model up...\nI0317 00:43:55.164070 547488051216 stream.py:86] Starting stream...\nI0317 00:44:19.775566 547488051216 stream.py:51]\nI0317 00:44:22.046195 547488051216 stream.py:51] Open AI's mission is to ensure that artificial general intelligence\nI0317 00:44:31.353919 547488051216 stream.py:51] benefits all of humanity.\nI0317 00:44:49.219501 547488051216 stream.py:51]\n```\n\nThe [`stream.py` script](stream.py) run in the container accepts flags for different configurations:\n\n```bash\nbash whisper-edge/run.sh --help\n\n       USAGE: stream.py [flags]\nflags:\n\nstream.py:\n  --channel_index: The index of the channel to use for transcription.\n    (default: '0')\n    (an integer)\n  --chunk_seconds: The length in seconds of each recorded chunk of audio.\n    (default: '10')\n    (an integer)\n  --input_device: The input device used to record audio.\n    (default: 'plughw:2,0')\n  --language: The language to use or empty to auto-detect.\n    (default: 'en')\n  --latency: The latency of the recording stream.\n    (default: 'low')\n  --model_name: The version of the OpenAI Whisper model to use.\n    (default: 'base.en')\n  --num_channels: The number of channels of the recorded audio.\n    (default: '1')\n    (an integer)\n  --sample_rate: The sample rate of the recorded audio.\n    (default: '16000')\n    (an integer)\n\nTry --helpfull to get a list of all flags.\n```\n\n### Troubleshooting\n\nTo see if the microphone is working properly, use 
[`alsa-utils`](https://github.com/alsa-project/alsa-utils):\n\n```bash\nsudo apt-get -y install alsa-utils\n\n# Is the USB device connected?\nlsusb\n\n# Is the correct recording device selected?\narecord -l\n\n# Is the gain set properly?\nalsamixer\n\n# Does a test recording work?\narecord --format=S16_LE --duration=5 --rate=16000 --channels=1 --device=plughw:2,0 test.wav\n```\n\n## Coral Edge TPU\n\n![Coral](media/coral.jpg)\n\nSee the corresponding [issue](https://github.com/maxbbraun/whisper-edge/issues/1) about what supporting the [Google Coral Edge TPU](https://coral.ai/products/) may look like.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxbbraun%2Fwhisper-edge","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmaxbbraun%2Fwhisper-edge","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxbbraun%2Fwhisper-edge/lists"}