{"id":23363847,"url":"https://github.com/absadiki/pywhispercpp","last_synced_at":"2025-12-14T13:35:10.624Z","repository":{"id":137982945,"uuid":"612023577","full_name":"absadiki/pywhispercpp","owner":"absadiki","description":"Python bindings for whisper.cpp","archived":false,"fork":false,"pushed_at":"2024-12-11T22:15:56.000Z","size":1227,"stargazers_count":191,"open_issues_count":26,"forks_count":26,"subscribers_count":6,"default_branch":"main","last_synced_at":"2024-12-17T16:25:58.731Z","etag":null,"topics":["openai-whisper","whisper-cpp"],"latest_commit_sha":null,"homepage":"https://absadiki.github.io/pywhispercpp/","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/absadiki.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-10T03:04:01.000Z","updated_at":"2024-12-16T19:53:26.000Z","dependencies_parsed_at":"2024-06-01T08:06:24.054Z","dependency_job_id":"388333ba-88ea-4400-9d4c-a0db780cf4d9","html_url":"https://github.com/absadiki/pywhispercpp","commit_stats":{"total_commits":114,"total_committers":9,"mean_commits":"12.666666666666666","dds":0.2543859649122807,"last_synced_commit":"266b2368aa2822ba7f11cfc70d3f3b6085f27279"},"previous_names":["absadiki/pywhispercpp","abdeladim-s/pywhispercpp"],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/absadiki%2Fpywhispercpp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/absadiki%2Fpywhispercpp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/absadiki%2Fpywhispercpp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/absadiki%2Fpywhispercpp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/absadiki","download_url":"https://codeload.github.com/absadiki/pywhispercpp/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247217184,"owners_count":20903009,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["openai-whisper","whisper-cpp"],"created_at":"2024-12-21T13:04:55.599Z","updated_at":"2025-12-14T13:35:10.535Z","avatar_url":"https://github.com/absadiki.png","language":"C++","readme":"# pywhispercpp\nPython bindings for [whisper.cpp](https://github.com/ggerganov/whisper.cpp) with a simple Pythonic API on top of it.\n\n[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/licenses/MIT)\n[![Wheels](https://github.com/absadiki/pywhispercpp/actions/workflows/wheels.yml/badge.svg?branch=main\u0026event=push)](https://github.com/absadiki/pywhispercpp/actions/workflows/wheels.yml)\n[![PyPi version](https://badgen.net/pypi/v/pywhispercpp)](https://pypi.org/project/pywhispercpp/)\n[![Downloads](https://static.pepy.tech/badge/pywhispercpp)](https://pepy.tech/project/pywhispercpp)\n\n# Table of contents\n\u003c!-- TOC --\u003e\n* [Installation](#installation)\n    * [From source](#from-source)\n    * [Pre-built wheels](#pre-built-wheels)\n    * [NVIDIA GPU support](#nvidia-gpu-support)\n    * [CoreML support](#coreml-support)\n    * [Vulkan 
support](#vulkan-support)\n* [OpenBLAS support](#openblas-support)\n* [OpenVINO support](#openvino-support)\n* [Quick start](#quick-start)\n* [Examples](#examples)\n  * [CLI](#cli)\n  * [Assistant](#assistant)\n* [Advanced usage](#advanced-usage)\n* [Discussions and contributions](#discussions-and-contributions)\n* [License](#license)\n\u003c!-- TOC --\u003e\n\n# Installation\n\n### From source\n* For the best performance, you need to install the package from source:\n```shell\npip install git+https://github.com/absadiki/pywhispercpp\n```\n### Pre-built wheels\n* Otherwise, basic pre-built CPU wheels are available on PyPI:\n\n```shell\npip install pywhispercpp # or pywhispercpp[examples] to install the extra dependencies needed for the examples\n```\n\n[Optional] To transcribe files other than WAV, you need to install ffmpeg:\n```shell\n# on Ubuntu or Debian\nsudo apt update \u0026\u0026 sudo apt install ffmpeg\n\n# on Arch Linux\nsudo pacman -S ffmpeg\n\n# on macOS using Homebrew (https://brew.sh/)\nbrew install ffmpeg\n\n# on Windows using Chocolatey (https://chocolatey.org/)\nchoco install ffmpeg\n\n# on Windows using Scoop (https://scoop.sh/)\nscoop install ffmpeg\n```\n\n### NVIDIA GPU support\nTo install the package with CUDA support, make sure you have [CUDA](https://developer.nvidia.com/cuda-downloads) installed and use `GGML_CUDA=1`:\n\n```shell\nGGML_CUDA=1 pip install git+https://github.com/absadiki/pywhispercpp\n```\n### CoreML support\n\nInstall the package with `WHISPER_COREML=1`:\n\n```shell\nWHISPER_COREML=1 pip install git+https://github.com/absadiki/pywhispercpp\n```\n\n### Vulkan support\n\nInstall the package with `GGML_VULKAN=1`:\n\n```shell\nGGML_VULKAN=1 pip install git+https://github.com/absadiki/pywhispercpp\n```\n\n### OpenBLAS support\n\nIf OpenBLAS is installed, build with `GGML_BLAS=1`; the additional pip flags below force a clean rebuild and print verbose output for sanity checking.\n```shell\nGGML_BLAS=1 pip install git+https://github.com/absadiki/pywhispercpp --no-cache --force-reinstall -v\n```\n\n### OpenVINO support\n\nFollow the steps at https://github.com/ggerganov/whisper.cpp?tab=readme-ov-file#openvino-support to download the correct OpenVINO package.\n\nThen initialize the OpenVINO environment and build:\n```shell\nsource ~/l_openvino_toolkit_ubuntu22_2023.0.0.10926.b4452d56304_x86_64/setupvars.sh\nWHISPER_OPENVINO=1 pip install git+https://github.com/absadiki/pywhispercpp --no-cache --force-reinstall\n```\n\nNote that the toolkit for Ubuntu 22 also works on Ubuntu 24.\n\n__Feel free to update this list and submit a PR if you have tested the package on other backends.__\n\n# Quick start\n\n```python\nfrom pywhispercpp.model import Model\n\nmodel = Model('base.en')\nsegments = model.transcribe('file.wav')\nfor segment in segments:\n    print(segment.text)\n```\n\nYou can also assign a custom `new_segment_callback`:\n\n```python\nfrom pywhispercpp.model import Model\n\nmodel = Model('base.en', print_realtime=False, print_progress=False)\nsegments = model.transcribe('file.mp3', new_segment_callback=print)\n```\n\n* The model will be downloaded automatically, or you can use the path to a local model.\n* You can pass any `whisper.cpp` [parameter](https://absadiki.github.io/pywhispercpp/#pywhispercpp.constants.PARAMS_SCHEMA) as a keyword argument to the `Model` class or to the `transcribe` function.\n* Check the [Model](https://absadiki.github.io/pywhispercpp/#pywhispercpp.model.Model) class documentation for more details.\n\n# Examples\n\n## CLI\nA straightforward example command-line interface.\nYou can use it as follows:\n\n```shell\npwcpp file.wav -m base --output-srt --print_realtime true\n```\nRun `pwcpp --help` to get the help message:\n\n```shell\nusage: pwcpp [-h] [-m MODEL] [--version] [--processors PROCESSORS] [-otxt] [-ovtt] [-osrt] [-ocsv] [--strategy STRATEGY]\n             [--n_threads N_THREADS] [--n_max_text_ctx N_MAX_TEXT_CTX] [--offset_ms OFFSET_MS] [--duration_ms DURATION_MS]\n             [--translate TRANSLATE] [--no_context NO_CONTEXT] [--single_segment SINGLE_SEGMENT] [--print_special PRINT_SPECIAL]\n             [--print_progress PRINT_PROGRESS] [--print_realtime PRINT_REALTIME] [--print_timestamps PRINT_TIMESTAMPS]\n             [--token_timestamps TOKEN_TIMESTAMPS] [--thold_pt THOLD_PT] [--thold_ptsum THOLD_PTSUM] [--max_len MAX_LEN]\n             [--split_on_word SPLIT_ON_WORD] [--max_tokens MAX_TOKENS] [--audio_ctx AUDIO_CTX]\n             [--prompt_tokens PROMPT_TOKENS] [--prompt_n_tokens PROMPT_N_TOKENS] [--language LANGUAGE] [--suppress_blank SUPPRESS_BLANK]\n             [--suppress_non_speech_tokens SUPPRESS_NON_SPEECH_TOKENS] [--temperature TEMPERATURE] [--max_initial_ts MAX_INITIAL_TS]\n             [--length_penalty LENGTH_PENALTY] [--temperature_inc TEMPERATURE_INC] [--entropy_thold ENTROPY_THOLD]\n             [--logprob_thold LOGPROB_THOLD] [--no_speech_thold NO_SPEECH_THOLD] [--greedy GREEDY] [--beam_search BEAM_SEARCH]\n             media_file [media_file ...]\n\npositional arguments:\n  media_file            The path of the media file or a list of files separated by space\n\noptions:\n  -h, --help            show this help message and exit\n  -m MODEL, --model MODEL\n                        Path to the `ggml` model, or just the model name\n  --version             show program's version number and exit\n  --processors PROCESSORS\n                        number of processors to use during computation\n  -otxt, --output-txt   output result in a text file\n  -ovtt, --output-vtt   output result in a vtt file\n  -osrt, --output-srt   output result in an srt file\n  -ocsv, --output-csv   output result in a CSV file\n  --strategy STRATEGY   Available sampling strategies: GreedyDecoder -\u003e 0, BeamSearchDecoder -\u003e 1\n  --n_threads N_THREADS\n                        Number of threads to allocate for the inference; defaults to min(4, available hardware_concurrency)\n  --n_max_text_ctx N_MAX_TEXT_CTX\n                        max tokens to use from past text as prompt for the decoder\n  --offset_ms OFFSET_MS\n                        start offset in ms\n  --duration_ms DURATION_MS\n                        audio duration to process in ms\n  --translate TRANSLATE\n                        whether to translate the audio to English\n  --no_context NO_CONTEXT\n                        do not use past transcription (if any) as initial prompt for the decoder\n  --single_segment SINGLE_SEGMENT\n                        force single segment output (useful for streaming)\n  --print_special PRINT_SPECIAL\n                        print special tokens (e.g. \u003cSOT\u003e, \u003cEOT\u003e, \u003cBEG\u003e, etc.)\n  --print_progress PRINT_PROGRESS\n                        print progress information\n  --print_realtime PRINT_REALTIME\n                        print results from within whisper.cpp (avoid it, use callback instead)\n  --print_timestamps PRINT_TIMESTAMPS\n                        print timestamps for each text segment when printing realtime\n  --token_timestamps TOKEN_TIMESTAMPS\n                        enable token-level timestamps\n  --thold_pt THOLD_PT   timestamp token probability threshold (~0.01)\n  --thold_ptsum THOLD_PTSUM\n                        timestamp token sum probability threshold (~0.01)\n  --max_len MAX_LEN     max segment length in characters\n  --split_on_word SPLIT_ON_WORD\n                        split on word rather than on token (when used with max_len)\n  --max_tokens MAX_TOKENS\n                        max tokens per segment (0 = no limit)\n  --audio_ctx AUDIO_CTX\n                        overwrite the audio context size (0 = use default)\n  --prompt_tokens PROMPT_TOKENS\n                        tokens to provide to the whisper decoder as initial prompt\n  --prompt_n_tokens PROMPT_N_TOKENS\n                        number of tokens in the initial prompt\n  --language LANGUAGE   for auto-detection, set to None, \"\" or \"auto\"\n  --suppress_blank SUPPRESS_BLANK\n                        common decoding parameters\n  --suppress_non_speech_tokens SUPPRESS_NON_SPEECH_TOKENS\n                        common decoding parameters\n  --temperature TEMPERATURE\n                        initial decoding temperature\n  --max_initial_ts MAX_INITIAL_TS\n                        max_initial_ts\n  --length_penalty LENGTH_PENALTY\n                        length_penalty\n  --temperature_inc TEMPERATURE_INC\n                        temperature_inc\n  --entropy_thold ENTROPY_THOLD\n                        similar to OpenAI's \"compression_ratio_threshold\"\n  --logprob_thold LOGPROB_THOLD\n                        logprob_thold\n  --no_speech_thold NO_SPEECH_THOLD\n                        no_speech_thold\n  --greedy GREEDY       greedy\n  --beam_search BEAM_SEARCH\n                        beam_search\n\n```\n\n## Assistant\n\nThis is a simple example showcasing how to use `pywhispercpp` to create an assistant-like application.\nThe idea is to use a Voice Activity Detector (VAD) to detect speech (in this example, we used webrtcvad), and to run the transcription once speech is detected.\nIt is inspired by the [whisper.cpp/examples/command](https://github.com/ggerganov/whisper.cpp/tree/master/examples/command) example.\n\nYou can check the source code [here](https://github.com/absadiki/pywhispercpp/blob/main/pywhispercpp/examples/assistant.py)\nor you can use the class directly to create your own assistant:\n\n```python\nfrom pywhispercpp.examples.assistant import Assistant\n\nmy_assistant = Assistant(commands_callback=print, n_threads=8)\nmy_assistant.start()\n```\nHere, we set the `commands_callback` to a simple print function, so the commands will just get printed on the screen.\n\nYou can also run this example from the command line:\n```shell\n$ pwcpp-assistant --help\n\nusage: pwcpp-assistant [-h] [-m MODEL] [-ind INPUT_DEVICE] [-st SILENCE_THRESHOLD] [-bd BLOCK_DURATION]\n\noptions:\n  -h, --help            show this help message and exit\n  -m MODEL, --model MODEL\n                        Whisper.cpp model, defaults to tiny.en\n  -ind INPUT_DEVICE, --input_device INPUT_DEVICE\n                        Id of the input device (aka microphone)\n  -st SILENCE_THRESHOLD, --silence_threshold SILENCE_THRESHOLD\n                        The duration of silence after which the inference will be run, defaults to 16\n  -bd BLOCK_DURATION, --block_duration BLOCK_DURATION\n                        minimum time between audio updates in ms, defaults to 30\n```\n-------------\n\n* Check the [examples folder](https://github.com/absadiki/pywhispercpp/tree/main/pywhispercpp/examples) for more examples.\n\n# Advanced usage\n* First check the [API documentation](https://absadiki.github.io/pywhispercpp/) for more advanced usage.\n* If you are a more experienced user, you can access the exposed C-APIs directly from the binding module `_pywhispercpp`.\n\n```python\nimport _pywhispercpp as pwcpp\n\nctx = pwcpp.whisper_init_from_file('path/to/ggml/model')\n```\n\n# Discussions and contributions\nIf you find a bug, please open an [issue](https://github.com/absadiki/pywhispercpp/issues).\n\nIf you have any feedback, or you want to share how you are using this project, feel free to use the [Discussions](https://github.com/absadiki/pywhispercpp/discussions) and open a new topic.\n\n# License\n\nThis project is licensed under the same license as [whisper.cpp](https://github.com/ggerganov/whisper.cpp/blob/master/LICENSE) (MIT [License](./LICENSE)).\n\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabsadiki%2Fpywhispercpp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fabsadiki%2Fpywhispercpp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabsadiki%2Fpywhispercpp/lists"}
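The README's CLI offers `--output-srt`, which writes transcription segments as SubRip subtitles. As a rough, self-contained sketch of what that format looks like (not the package's actual implementation; it assumes segment times are given in milliseconds, whereas the real library's segment timestamp units should be checked against its documentation), the following converts `(start_ms, end_ms, text)` triples into an SRT document:

```python
def srt_timestamp(ms: int) -> str:
    """Format a millisecond offset as an SRT timestamp: HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Render (start_ms, end_ms, text) triples as an SRT document."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each SRT cue: a 1-based index, a time range, then the text.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)


# Toy segments standing in for a real transcription result:
print(to_srt([(0, 2500, "Hello world."), (2500, 61_020, "Second segment.")]))
```

The helper is deliberately independent of `pywhispercpp`, so it can be unit-tested without a model; to use it with real output, map each returned segment to a `(start, end, text)` triple first.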