{"id":23092020,"url":"https://github.com/thiswillbeyourgithub/quick-whisper-typer","last_synced_at":"2025-10-06T10:32:49.417Z","repository":{"id":203286100,"uuid":"709249169","full_name":"thiswillbeyourgithub/Quick-Whisper-Typer","owner":"thiswillbeyourgithub","description":"Simple shell script to enter text using whisper using yad, openai, possibly chatgpt","archived":false,"fork":false,"pushed_at":"2025-03-27T18:41:26.000Z","size":333,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-04T09:51:11.843Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thiswillbeyourgithub.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-24T10:34:46.000Z","updated_at":"2025-03-27T18:41:29.000Z","dependencies_parsed_at":null,"dependency_job_id":"6ead8412-b5c2-4b78-9e79-bf3dd803db02","html_url":"https://github.com/thiswillbeyourgithub/Quick-Whisper-Typer","commit_stats":{"total_commits":237,"total_committers":1,"mean_commits":237.0,"dds":0.0,"last_synced_commit":"07bc86863f51edc83aae015eddd09b0c05bfd2cc"},"previous_names":["thiswillbeyourgithub/quick-whisper-typer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/thiswillbeyourgithub/Quick-Whisper-Typer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thiswillbeyourgithub%2FQuick-Whisper-Typer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/this
willbeyourgithub%2FQuick-Whisper-Typer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thiswillbeyourgithub%2FQuick-Whisper-Typer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thiswillbeyourgithub%2FQuick-Whisper-Typer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/thiswillbeyourgithub","download_url":"https://codeload.github.com/thiswillbeyourgithub/Quick-Whisper-Typer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thiswillbeyourgithub%2FQuick-Whisper-Typer/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259410149,"owners_count":22852970,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-16T21:26:48.125Z","updated_at":"2025-10-06T10:32:44.385Z","avatar_url":"https://github.com/thiswillbeyourgithub.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Quick Whisper Typer\nSuper simple python script to start recording sound, send it to whisper then have it type for you anywhere.\n* Can also modify text according to voice commands.\n* Latency is as low as I could make it (instant if deepgram is used, \u003c1s for openai's whisper).\n* It can be seen as a minimalist alternative to [AquaVoice](https://withaqua.com/) and can be extended easily to replace [Deepgram's Shortcut feature](https://deepgram.com/learn/introducing-shortcut-by-poised-voice-ai-tool).\n\n## The way each task works\n### write\n1. starts recording\n2. 
when you're done press shift (escape or spacebar to cancel)\n3. whisper will transcribe your speech\n4.a if `--auto_paste` is True: your current clipboard will be saved, replaced by the transcription, \"ctrl+v\" will automatically be pressed, then your old clipboard will be restored as if nothing happened.\n4.b if `--auto_paste` is False: your clipboard will be replaced by the transcription\n### transform_clipboard\n1. starts recording\n2. when you're done press shift (escape or spacebar to cancel)\n3. whisper will transcribe your speech\n4. the transcription will be interpreted as an instruction for `--llm_model` on how to transform the text found in your clipboard\n5. the result will either be pasted or stored in the clipboard like for `--task=write`\n### new_voice_chat\n1. starts recording\n2. when you're done press shift (escape or spacebar to cancel)\n3. whisper will transcribe your speech\n4. the transcription will be interpreted as the first user message in a conversation with `--llm_model`\n5. the result will either be pasted or stored in the clipboard like for `--task=write`, and optionally read aloud if `--voice_engine` is set\n6. 
To continue the conversation, use the task `--task=continue_voice_chat`\n\n# Examples\n* I want to write text: `python quick_whisper_typer.py --task=write --auto_paste`\n* I want to translate text: copy the text into the clipboard then `python quick_whisper_typer.py --task=transform_clipboard --auto_paste`\n* I want to start a vocal conversation: `python quick_whisper_typer.py --task=\"new_voice_chat\" --voice_engine='openai'`\n* I want to continue the conversation: `python quick_whisper_typer.py --task=\"continue_voice_chat\" --voice_engine='openai'`\n* I want to call it from anywhere without setting up keybindings: use `--loop`, then press the `shift` key several times from anywhere and you'll see a notification appear to trigger the tasks.\n\n\n## Features\n* Supports any spoken language supported by whisper\n* Supports both openai's whisper and [deepgram's whisper](deepgram.com)\n* Supports local transcription by supplying a custom URL.\n    * For example start [whispercpp](https://github.com/ggerganov/whisper.cpp) with `./server -m models/small_acft_q8_0.bin --threads 8 --audio-ctx 1500 -l fr --no-gpu --debug-mode --convert -p 1` ([models from FUTO](https://github.com/futo-org/whisper-acft/)) and use `--custom_transcription_url=\"http://127.0.0.1:8080/inference\"`\n    * You can set these environment variables for custom transcription:\n        * `CUSTOM_WHISPER_API_KEY`: API key for the custom transcription server\n        * `CUSTOM_WHISPER_MODEL`: Model name to use with the custom transcription server\n* Minimalist code\n* Low latency: it starts as fast as possible to be ready to listen to you\n* Four supported `voice_engine` values: openai, [piper](https://github.com/rhasspy/piper), [deepgram](deepgram.com), espeak (fallback if any of the others fails)\n* Optional audio cleanup and long silence removal via sox\n* `--loop` to trigger the script from anywhere just by pressing shift multiple times. 
You can define any kind of argument to customize your loop shortcuts by passing a dict to `--loop_tasks`\n* Supports virtually any type of LLM (ChatGPT, Claude, Huggingface, Llama, etc) thanks to [litellm](https://docs.litellm.ai/).\n* Supposedly multiplatform, but I can't test it on anything other than Linux so please open an issue to tell me how it went!\n\n## How to\n* Make sure your environment contains the appropriate api keys (e.g. as OPENAI_API_KEY, MISTRAL_API_KEY, DEEPGRAM_API_KEY etc)\n* *optional: add a keyboard shortcut to call this script. See my i3 bindings below.*\n* If using deepgram: make sure you are on python 3.10+\n* create a venv: `uv venv --python 3.10` and activate it `source .venv/bin/activate`\n* `pip install -r requirements.txt`\n    * if you have issues installing the python package `playsound`, try installing `playsound3` instead.\n\n### Run in the background using systemd units\n\nTo always have quick_whisper_typer running in the background, you can do this after modifying the `quick_whisper_typer_launcher.sh` file and making it executable with `chmod +x`:\n```shell\nmkdir -p ~/.config/systemd/user/\necho \"[Unit]\nDescription=quick_whisper_typer\nAfter=graphical-session.target\n\n[Service]\nType=simple\nExecStart=[YOUR_APPROPRIATE_PATH]/Quick_Whisper_Typer/quick_whisper_typer_launcher.sh\nRestart=on-failure\nRestartSec=10\n\n[Install]\n; WantedBy=graphical-session.target\nWantedBy=default.target\n\" \u003e ~/.config/systemd/user/quick_whisper_typer.service\n\nsystemctl --user enable quick_whisper_typer.service\nsystemctl --user start quick_whisper_typer.service\n```\n\n### i3 bindings\n```\nmode \"$mode_launch_microphone\" {\n    # enter text\n    bindsym f exec /PATH/TO/quick_whisper_typer.py --task write, mode \"default\"\n    # edit clipboard\n    bindsym e exec /PATH/TO/quick_whisper_typer.py --task=transform_clipboard, mode \"default\"\n    bindsym v exec /PATH/TO/quick_whisper_typer.py --task=continue_voice_chat, mode \"default\"\n    bindsym shift+V exec 
/PATH/TO/quick_whisper_typer.py --task=new_voice_chat, mode \"default\"\n\n    bindsym Return mode \"default\"\n    bindsym Escape mode \"default\"\n    }\n```\n\n# Credits\n* `.ogg` files were in my `/usr/share/sounds/ubuntu/notifications` folder.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthiswillbeyourgithub%2Fquick-whisper-typer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthiswillbeyourgithub%2Fquick-whisper-typer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthiswillbeyourgithub%2Fquick-whisper-typer/lists"}