{"id":14968093,"url":"https://github.com/aarnphm/whispercpp","last_synced_at":"2025-05-15T05:02:22.218Z","repository":{"id":90679417,"uuid":"605938548","full_name":"aarnphm/whispercpp","owner":"aarnphm","description":"Pybind11 bindings for Whisper.cpp","archived":false,"fork":false,"pushed_at":"2024-12-08T00:20:10.000Z","size":2202,"stargazers_count":331,"open_issues_count":34,"forks_count":66,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-05-07T16:07:52.800Z","etag":null,"topics":["audio-transcription","bazel","bentoml","mlops-workflow","nix","pybind11","python3","whisper","whisper-cpp"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aarnphm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-24T08:24:12.000Z","updated_at":"2025-05-07T02:40:06.000Z","dependencies_parsed_at":"2024-02-29T01:26:59.416Z","dependency_job_id":"9df85547-8588-442c-ad56-21ac316b3ea3","html_url":"https://github.com/aarnphm/whispercpp","commit_stats":{"total_commits":384,"total_committers":12,"mean_commits":32.0,"dds":0.3046875,"last_synced_commit":"c27991d371bd17805ec72bca609a6fe466556ba9"},"previous_names":[],"tags_count":18,"template":false,"template_full_name":"aarnphm/bazix","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aarnphm%2Fwhispercpp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aarnphm%2Fwhispercpp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/aarnphm%2Fwhispercpp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aarnphm%2Fwhispercpp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aarnphm","download_url":"https://codeload.github.com/aarnphm/whispercpp/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254276444,"owners_count":22043866,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio-transcription","bazel","bentoml","mlops-workflow","nix","pybind11","python3","whisper","whisper-cpp"],"created_at":"2024-09-24T13:39:16.931Z","updated_at":"2025-05-15T05:02:22.192Z","avatar_url":"https://github.com/aarnphm.png","language":"C++","funding_links":[],"categories":["C++"],"sub_categories":[],"readme":"# whispercpp [![CI](https://github.com/aarnphm/whispercpp/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/aarnphm/whispercpp/actions/workflows/ci.yml)\n\n_Pybind11 bindings for\n[whisper.cpp](https://github.com/ggerganov/whisper.cpp.git)_\n\n## Quickstart\n\nInstall with pip:\n\n```bash\npip install whispercpp\n```\n\n\u003e NOTE: We will set up a hermetic toolchain for all platforms that don't have\n\u003e prebuilt wheels (which means you don't have to set up anything to install the\n\u003e Python package), which will take a bit longer to install. 
Pass `-vv` to `pip`\n\u003e to see the progress.\n\nTo use the latest version, install from source:\n\n```bash\npip install git+https://github.com/aarnphm/whispercpp.git -vv\n```\n\nFor local setup, initialize all submodules:\n\n```bash\ngit submodule update --init --recursive\n```\n\nBuild the wheel:\n\n```bash\n# Option 1: using pypa/build\npython3 -m build -w\n\n# Option 2: using bazel\n./tools/bazel build //:whispercpp_wheel\n```\n\nInstall the wheel:\n\n```bash\n# Option 1: via pypa/build\npip install dist/*.whl\n\n# Option 2: using bazel\npip install $(./tools/bazel info bazel-bin)/*.whl\n```\n\nThe binding provides a `Whisper` class:\n\n```python\nfrom whispercpp import Whisper\n\nw = Whisper.from_pretrained(\"tiny.en\")\n```\n\nCurrently, the inference API is provided via `transcribe`:\n\n```python\nimport numpy as np\n\nw.transcribe(np.ones((1, 16000)))\n```\n\nYou can use any of your favorite audio libraries\n([ffmpeg](https://github.com/kkroening/ffmpeg-python) or\n[librosa](https://librosa.org/doc/main/index.html), or\n`whispercpp.api.load_wav_file`) to load audio files into a NumPy array, then\npass it to `transcribe`:\n\n```python\nimport ffmpeg\nimport numpy as np\n\n# whisper.cpp expects 16 kHz mono audio\nsample_rate = 16000\n\ntry:\n    # Decode the file to raw 16-bit little-endian mono PCM on stdout\n    y, _ = (\n        ffmpeg.input(\"/path/to/audio.wav\", threads=0)\n        .output(\"-\", format=\"s16le\", acodec=\"pcm_s16le\", ac=1, ar=sample_rate)\n        .run(\n            cmd=[\"ffmpeg\", \"-nostdin\"], capture_stdout=True, capture_stderr=True\n        )\n    )\nexcept ffmpeg.Error as e:\n    raise RuntimeError(f\"Failed to load audio: {e.stderr.decode()}\") from e\n\n# Convert int16 samples to float32 in the range [-1.0, 1.0)\narr = np.frombuffer(y, np.int16).flatten().astype(np.float32) / 32768.0\n\nw.transcribe(arr)\n```\n\nYou can also use the model's `transcribe_from_file` method for convenience:\n\n```python\nw.transcribe_from_file(\"/path/to/audio.wav\")\n```\n\nThe Pybind11 bindings support all of the features of whisper.cpp and take\ninspiration from [whisper-rs](https://github.com/tazz4843/whisper-rs).\n\nThe binding can also be used via 
`api`:\n\n```python\nfrom whispercpp import api\n\n# Bindings directly from whisper.cpp\n```\n\n## Development\n\nSee [DEVELOPMENT.md](./DEVELOPMENT.md)\n\n## APIs\n\n### `Whisper`\n\n1. `Whisper.from_pretrained(model_name: str) -\u003e Whisper`\n\n   Load a pre-trained model from the local cache, or download and cache it if\n   needed. Supports loading a custom ggml model from a local path passed as `model_name`.\n\n   ```python\n   w = Whisper.from_pretrained(\"tiny.en\")\n   w = Whisper.from_pretrained(\"/path/to/model.bin\")\n   ```\n\n   The model will be saved to `$XDG_DATA_HOME/whispercpp` or\n   `~/.local/share/whispercpp` if the environment variable is not set.\n\n2. `Whisper.transcribe(arr: NDArray[np.float32], num_proc: int = 1)`\n\n   Runs transcription on a given NumPy array. This calls `full` from\n   `whisper.cpp`. If `num_proc` is greater than 1, it will use `full_parallel`\n   instead.\n\n   ```python\n   w.transcribe(np.ones((1, 16000)))\n   ```\n\n   To transcribe from a WAV file, use `transcribe_from_file`:\n\n   ```python\n   w.transcribe_from_file(\"/path/to/audio.wav\")\n   ```\n\n3. `Whisper.stream_transcribe(*, length_ms: int=..., device_id: int=..., num_proc: int=...) -\u003e Iterator[str]`\n\n   [EXPERIMENTAL] Streaming transcription. This calls `stream_` from\n   `whisper.cpp`. The transcription will be yielded as soon as it's available.\n   See [stream.py](./examples/stream/stream.py) for an example.\n\n   \u003e Note: The `device_id` is the index of the audio device. You can use\n   \u003e `whispercpp.api.available_audio_devices` to get the list of available audio\n   \u003e devices.\n\n### `api`\n\n`api` is a direct binding to `whisper.cpp` with an API similar to that of\n`whisper-rs`.\n\n1. 
`api.Context`\n\n   This class is a wrapper around `whisper_context`.\n\n   ```python\n   from whispercpp import api\n\n   ctx = api.Context.from_file(\"/path/to/saved_weight.bin\")\n   ```\n\n   \u003e Note: The context can also be accessed from the `Whisper` class via\n   \u003e `w.context`\n\n2. `api.Params`\n\n   This class is a wrapper around `whisper_params`.\n\n   ```python\n   from whispercpp import api\n\n   params = api.Params()\n   ```\n\n   \u003e Note: The params can also be accessed from the `Whisper` class via\n   \u003e `w.params`\n\n## Why not?\n\n- [whispercpp.py](https://github.com/stlukey/whispercpp.py). There are a few key\n  differences here:\n\n  - It provides Cython bindings. From the UX standpoint, this achieves the\n    same goal as `whispercpp`. The difference is that `whispercpp` uses Pybind11\n    instead. Feel free to use it if you prefer Cython over Pybind11. Note that\n    `whispercpp.py` and `whispercpp` are mutually exclusive, as they both use\n    the `whispercpp` namespace.\n  - `whispercpp` provides APIs similar to those of\n    [`whisper-rs`](https://github.com/tazz4843/whisper-rs), which provides a\n    nicer UX to work with. 
There are literally two APIs (`from_pretrained` and\n    `transcribe`) to quickly use whisper.cpp in Python.\n  - `whispercpp` doesn't pollute your `$HOME` directory; rather, it follows the\n    [XDG Base Directory Specification](https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html)\n    for saved weights.\n\n- Using `cdll` and `ctypes` and be done with it?\n\n  - This is also valid, but it requires a lot of hacking and is pretty slow\n    compared to Cython and Pybind11.\n\n## Examples\n\nSee [examples](./examples) for more information.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faarnphm%2Fwhispercpp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faarnphm%2Fwhispercpp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faarnphm%2Fwhispercpp/lists"}