{"id":50726392,"url":"https://github.com/sharadcodes/whisper-typer","last_synced_at":"2026-06-10T04:31:59.597Z","repository":{"id":348504099,"uuid":"1173260689","full_name":"sharadcodes/whisper-typer","owner":"sharadcodes","description":"Push-to-talk voice transcription using Faster-Whisper. Supports Windows, macOS, and Linux.","archived":false,"fork":false,"pushed_at":"2026-04-05T15:06:57.000Z","size":303,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-05T06:40:09.504Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sharadcodes.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-05T07:08:27.000Z","updated_at":"2026-04-02T11:29:46.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/sharadcodes/whisper-typer","commit_stats":null,"previous_names":["sharadcodes/whisper-typer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sharadcodes/whisper-typer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sharadcodes%2Fwhisper-typer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sharadcodes%2Fwhisper-typer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sharadcodes%2Fwhisper-typer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sharadcodes%2Fwhisper-typer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sharadcodes","download_url":"https://codeload.github.com/sharadcodes/whisper-typer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sharadcodes%2Fwhisper-typer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34137570,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-10T02:00:07.152Z","response_time":89,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-10T04:31:58.945Z","updated_at":"2026-06-10T04:31:59.588Z","avatar_url":"https://github.com/sharadcodes.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Whisper Typer\n## Testing Phase - Download builds from [actions](https://github.com/sharadcodes/whisper-typer/actions)\n\nPush-to-talk voice transcription using Faster-Whisper.\nSupports Windows (works with package install or direct run), macOS (broken), and Linux (not tested).\n\n## Quick Start\n\n1. **Start the app**:\n   ```powershell\n   uv run run.py\n   ```\n\n2. **In the app**:\n   - The server auto-starts on launch.\n   - Choose a **Model** and **Input Mode** (Live or Full Capture).\n   - Use the Global Hotkey: **Ctrl+Win** (Windows) or **Ctrl+Cmd** (macOS).\n   - **Hold** keys to record, release to stop and transcribe.\n   - **Quick Double-Tap** to enter \"Hands-free\" mode (press again to stop).\n   - Text types into your active window automatically.\n\n---\n\n## Installation\n\nIf you want to install it as a global tool:\n```powershell\nuv pip install -e .\nwhisper-typer\n```\n\n---\n\n## Flow logic\n\n```mermaid\n%%{init: {\"flowchart\": {\"htmlLabels\": false}} }%%\nflowchart TD\n    A[\"User Hotkey\"] --\u003e B[\"Audio Input Stream\"]\n    C{\"Input Mode\"}\n    C --\u003e|Live typing| D[\"Silence-based Chunking\"]\n    C --\u003e|Full Capture| E[\"Full Recording Capture\"]\n    D --\u003e F[\"Transcription Queue (FIFO)\"]\n\n    E --\u003e F\n    F --\u003e G[\"Server API (Transcribe)\"]\n    G --\u003e H[\"Transcription Service\"]\n    H --\u003e I[\"Text Output\"]\n    I --\u003e J[\"Keyboard Typing to Active Window\"]\n```\n\n- User triggers hotkey (**Ctrl+Win** or **Ctrl+Cmd**).\n- Audio is captured from input stream.\n- App checks selected mode:\n  - **Live typing** → chunks split by silence windows and enqueued.\n  - **Full Capture** → all chunks captured until stop, then enqueued.\n- Queue processes each chunk in order (FIFO).\n- For each chunk:\n  - Send audio to server via API.\n  - Server returns transcribed text.\n  - Text is typed into the active window via keyboard simulation.\n\n---\n\n## Hotkeys \u0026 Auto-typing\n\nThe client runs a global low-level hotkey listener:\n\n- **Ctrl+Win** (Windows) or **Ctrl+Cmd** (macOS).\n- **Hold to Record**: Recording stays active as long as keys are held. Releasing either key stops and triggers transcription.\n- **Hands-free (Toggle)**: Double-tap the combo quickly to stay in recording mode after release. Tap again to stop.\n- When recording is stopped, the client waits for the transcription and then **simulates keyboard typing** to insert the text into the currently focused window.\n\n\u003e **macOS Users:** \n\u003e 1. You must grant **Accessibility** permissions to your terminal (e.g., iTerm or Terminal.app) for the auto-typing to work.\n\u003e 2. Grant **Microphone** permissions when prompted.\n\n### System tray icon colors\n\n| State | Color | Meaning |\n|-------|-------|---------|\n| Idle (server online) | 🟢 Green | Server is running, ready to transcribe |\n| Server offline | ⚫ Black | Server is not reachable |\n| Recording | 🔴 Red | Audio is being captured |\n| Processing | 🟣 Purple | Transcribing audio |\n\n---\n\n## Requirements\n\n- **OS:** Windows, macOS, or Linux\n- **Python:** 3.10+\n- **Package manager:** [uv](https://github.com/astral-sh/uv) (recommended)\n- **Docker:** Optional, for isolated container deployment\n\n---\n\n## Configuration\n\nThe application stores data in `~/.whisper-typer/` by default. You can customize settings using a `.env` file in the project root:\n\n- `WHISPER_MODEL`: Default model (e.g., `tiny`, `small`, `medium`).\n- `WHISPER_MODELS_DIR`: Custom path for model storage. Use an **absolute path** (for example `D:/AI/whisper-models` on Windows or `/absolute/path/to/models` on Linux/macOS) so the client and server always use the same directory.\n- `HF_TOKEN`: Hugging Face token for private models.\n\n---\n\n## Contributing\n\nContributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for details.\n\n---\n\n## License\n\nThis project is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for details.\n\n---\n\n## About the Author\n\n**Sharad Raj Singh Maurya**  \nAI Engineer and Open Source enthusiast.  \n\n- **GitHub:** [@sharadcodes](https://github.com/sharadcodes)  \n- **Project:** [Whisper Typer](https://github.com/sharadcodes/whisper-typer)  \n\nFeel free to reach out for collaborations or to report any issues!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsharadcodes%2Fwhisper-typer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsharadcodes%2Fwhisper-typer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsharadcodes%2Fwhisper-typer/lists"}