{"id":17963741,"url":"https://github.com/pablolion/whisper-note","last_synced_at":"2025-10-29T07:19:06.976Z","repository":{"id":199163535,"uuid":"702263260","full_name":"PabloLION/whisper-note","owner":"PabloLION","description":null,"archived":false,"fork":false,"pushed_at":"2025-02-03T00:59:20.000Z","size":215,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-17T06:02:52.558Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PabloLION.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-09T00:47:18.000Z","updated_at":"2025-02-03T00:59:23.000Z","dependencies_parsed_at":null,"dependency_job_id":"601e10f3-c3c6-438b-a09e-576cb3e38178","html_url":"https://github.com/PabloLION/whisper-note","commit_stats":null,"previous_names":["pablolion/whisper-note"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/PabloLION/whisper-note","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PabloLION%2Fwhisper-note","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PabloLION%2Fwhisper-note/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PabloLION%2Fwhisper-note/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PabloLION%2Fwhisper-note/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PabloLION","download_url":"https
://codeload.github.com/PabloLION/whisper-note/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PabloLION%2Fwhisper-note/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260301984,"owners_count":22988717,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-29T11:44:56.147Z","updated_at":"2025-10-29T07:19:01.931Z","avatar_url":"https://github.com/PabloLION.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Key Features\n\n- **Live Speech Recognition and Transcription:** Automatically transcribe spoken words in real time (cannot be turned off).\n- **Detailed Live Transcript:** View a live transcript of your conversation, including translations, and see the queue length.\n- **Select Input Language:** Choose your preferred input language.\n- **Optional Real-Time Translation:** Get instant translations of spoken content if needed.\n- **Choose Model Size:** Select the model size that suits your needs.\n- **Language-Specific Models:** Utilize models tailored for specific languages.\n- **Export Trimmed Recordings:** Easily save trimmed `.wav` files of your recordings without silent gaps.\n- **Export Transcript History:** Save your entire transcript history as an `.html` file.\n- **High-Quality Transcription:** Optionally receive high-quality transcription after the recording is completed.\n- **Upcoming Feature:** Stay tuned for an optional summary of the transcript generated with the help of ChatGPT.\n\n## Install\n\n### Mac with 
Apple Silicon\n\n```zsh\nbrew install portaudio # src/pyaudio/device_api.c:9:10: fatal error: 'portaudio.h' file not found\nbrew install ffmpeg\nbrew install mbedtls # /opt/homebrew/Cellar/mbedtls/3.4.1/lib/libmbedcrypto.13.dylib\n```\n\nAfter this, my mbedtls@3.4.1 provided only `libmbedcrypto.14.dylib`, while `libmbedcrypto.13.dylib` was expected, so I copied it to the expected name manually:\n\n```zsh\n# for mbedtls@3.4.1\ncp /opt/homebrew/Cellar/mbedtls/3.4.1/lib/libmbedcrypto.14.dylib /opt/homebrew/Cellar/mbedtls/3.4.1/lib/libmbedcrypto.13.dylib\n# for mbedtls@3.5.0\ncp /opt/homebrew/Cellar/mbedtls/3.5.0/lib/libmbedcrypto.15.dylib /opt/homebrew/Cellar/mbedtls/3.5.0/lib/libmbedcrypto.13.dylib\n```\n\nThen set up the virtual environment and install the requirements with poetry:\n\n```zsh\npoetry install\n```\n\n### Mac with Intel\n\n(mbedtls has been updated to 3.5.0, so the library version number is different.)\n\n```zsh\ncp /usr/local/opt/mbedtls/lib/libmbedcrypto.15.dylib /usr/local/opt/mbedtls/lib/libmbedcrypto.13.dylib\n```\n\nI got the error `dyld[49347]: Library not loaded: '@loader_path/../../../../Python.framework/Versions/3.11/Python'` during poetry installation; installing poetry with `pip install poetry` solved it. The brew version won't work.\n\n### Not supported\n\nI don't know how to install these.\n\n- Windows\n- Linux\n- Mac with Intel\n\n## Use\n\n- The large model makes the script run slowly: recognition is slower than a person speaking continuously, even on an M1 Ultra with 128GB RAM.\n- The small model is recommended: it's faster and the recognition is not bad.\n- See the comment in `config.yml` for more details.\n\n## Dev\n\n### Env setup\n\n- Suppose you have [`poetry`](https://python-poetry.org/) installed on your machine with `python@3.11`.\n- Assume `pwd` is the root of this repo.\n\n```zsh\npoetry install --with dev\npoetry run pre-commit install\ntouch .env\n```\n\n- Get an API key from DeepL and put it in `.env` in the root of this repo, like this:\n\n  ```plaintext\n  DEEPL_API_KEY=1234567890\n  ```\n\n- Maybe set up your own config file. 
#TODO: default config file\n- #TODO: use `make` to make this easier\n\n### Memo\n\nMost of this should be converted to GitHub Issues when published.\n\n- Trying to use [result](https://pypi.org/project/result/) to handle errors; not sure how it feels yet.\n- The idea is to build a substitute for Otter for taking notes.\n  - Check and try speech recognition package\n- Features:\n  - summary of the text with ChatGPT\n  - generate .SRT subtitles\n  - Add a DEV_MODE env variable; in that mode the logger should be more verbose\n- UI:\n  - start/end control\n  - Not needed: Add a \"Still Recording...\" indicator every 5 seconds while the input is idle.\n- For translation, it seems that [DeepL](https://www.deepl.com/translator) is the best option, but it's not free. Since I don't need it, I'm just doing the most basic thing: translating the text with some API.\n- I tried to use [textual](https://github.com/Textualize/textual), but the CSS is not applied to dynamically rendered list items, and it's not easy to use. Maybe use electron / eel instead if we want a web UI.\n\n## Special thanks\n\n- AI Model [OpenAI/whisper](https://github.com/openai/whisper)\n- Real-time transcription script [davabase/whisper_real_time](https://github.com/davabase/whisper_real_time)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpablolion%2Fwhisper-note","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpablolion%2Fwhisper-note","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpablolion%2Fwhisper-note/lists"}