{"id":37599727,"url":"https://github.com/sofiashendi/hush","last_synced_at":"2026-01-22T02:02:11.000Z","repository":{"id":331517068,"uuid":"1126904034","full_name":"sofiashendi/hush","owner":"sofiashendi","description":"Speech-to-text MacOS app that provides privacy. Built with React.js, TypeScript and Electron.","archived":false,"fork":false,"pushed_at":"2026-01-10T18:14:30.000Z","size":2167,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-10T20:10:51.850Z","etag":null,"topics":["cloudflare-workers","privacy","speech-recognition","speech-to-text","whisper-ai"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sofiashendi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null},"funding":{"github":null,"patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"lfx_crowdfunding":null,"polar":null,"buy_me_a_coffee":"sofiashendi","thanks_dev":null,"custom":null}},"created_at":"2026-01-02T19:40:12.000Z","updated_at":"2026-01-06T16:18:12.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/sofiashendi/hush","commit_stats":null,"previous_names":["sofiashendi/hush"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/sofiashendi/hush","repository_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/sofiashendi%2Fhush","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sofiashendi%2Fhush/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sofiashendi%2Fhush/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sofiashendi%2Fhush/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sofiashendi","download_url":"https://codeload.github.com/sofiashendi/hush/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sofiashendi%2Fhush/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28478049,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T06:30:42.265Z","status":"ssl_error","status_checked_at":"2026-01-16T06:30:16.248Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cloudflare-workers","privacy","speech-recognition","speech-to-text","whisper-ai"],"created_at":"2026-01-16T10:00:52.598Z","updated_at":"2026-01-16T10:01:09.968Z","avatar_url":"https://github.com/sofiashendi.png","language":"TypeScript","funding_links":["https://buymeacoffee.com/sofiashendi"],"categories":[],"sub_categories":[],"readme":"# Hush\n\nA sleek, native-feeling macOS application that captures speech, converts it to text using **Local Intelligence (Whisper.cpp)**, 
and copies it to your clipboard for instant pasting.\n\nCreated by Sofia Shendi\nhttps://sofiashendi.com\n\n\u003cimg width=\"1394\" height=\"881\" alt=\"Idle State\" src=\"https://github.com/user-attachments/assets/d5104a6c-92b7-4b79-b956-d50b6d22fa57\" /\u003e\n\n## Features\n- **Global Shortcut**: Press `Cmd + '` (Single Quote) to toggle the recorder from anywhere.\n- **Native UI**: Draggable, transparent, and non-intrusive design.\n- **Private \u0026 Offline**: Uses local `whisper.cpp` inference. No audio ever leaves your device. Works without internet.\n- **Instant Speed**: Optimized for Apple Silicon (Metal) for sub-second transcription.\n- **Continuous Flow**: Speak, pause, and watch it paste. The mic stays open so you can keep dictating.\n- **Auto-Paste**: Automatically types your text into the active window (requires Accessibility permission).\n\n## Models\nHush will automatically download the **Base** model on first launch, which is fast and accurate for general dictation.\nYou can optionally download **Small** or **Large Turbo** models in the settings for higher accuracy.\n\n## Prerequisites\n- **macOS**: 12+ (Apple Silicon recommended for best performance).\n- **Node.js**: v18+ installed (for development).\n- **Permissions**: The app needs **Microphone** access to hear you, and **Accessibility** access to Auto-Paste text.\n\n## Build \u0026 Install\n\nSince this app is not notarized by Apple, you must build it yourself for local usage:\n\n1.  **Clone the repo**\n    ```bash\n    git clone https://github.com/sofiashendi/hush.git\n    cd hush\n    ```\n\n2.  **Install Dependencies**\n    ```bash\n    npm install\n    ```\n\n3.  **Build the App**\n    ```bash\n    npm run dist\n    ```\n    The `.dmg` installer will be in the `dist/` folder.\n    *Note: On first launch, the app will download the base model (~60MB), which may take a moment.*\n\n4.  
**Install \u0026 Open**\n    -   Open the `.dmg` and drag the app to Applications.\n    -   **First launch:** Right-click → Open → Click \"Open\" in the dialog.\n    -   Grant **Microphone** and **Accessibility** permissions when prompted.\n\n## Development\n\nTo run in development mode with hot reload:\n\n```bash\nnpm run dev\n```\n\n*Note: If \"Auto-Paste\" fails, go to **System Settings \u003e Privacy \u0026 Security \u003e Accessibility** and ensure Hush is allowed.*\n\n## Usage (Continuous Mode)\n1.  **Focus**: Click on the text field where you want to type (e.g., Notion, Word, VS Code).\n2.  **Toggle**: Press `Cmd + '`.\n3.  **Speak**: Say your sentence clearly.\n4.  **Pause**: Stop speaking for ~1.5 seconds.\n5.  **Watch**: The app will automatically transcribe and paste your text.\n6.  **Repeat**: Keep speaking the next sentence.\n7.  **Stop**: Press `Cmd + '` again when done.\n\n## Technical Decisions\n\n### Why Local whisper.cpp Over Cloud APIs\n\n1. **Privacy First**: Audio never leaves the device. No data collection, no cloud processing, no API keys to manage.\n2. **Offline Capability**: Works without internet after initial model download. Perfect for air-gapped environments or travel.\n3. **Zero Latency Variability**: No network round-trips. Transcription time is consistent and predictable.\n4. **Cost**: No per-minute API charges. One-time model download, unlimited usage.\n\nTrade-off: Requires ~60-550MB model download and uses local GPU/CPU resources.\n\n### Voice Activity Detection (VAD) Architecture\n\nThe app uses a custom VAD implementation with three volume thresholds:\n\n| Threshold | Value | Purpose |\n|-----------|-------|---------|\n| `LOW_VOLUME_THRESHOLD` | 3.0 | Minimum RMS to process audio (filters silence) |\n| `SILENCE_THRESHOLD` | 5.0 | Below this = silence detected |\n| `SPEAKING_THRESHOLD` | 8.0 | Above this = active speech |\n\n**Flow:**\n1. Audio analyzed via `requestAnimationFrame` loop (typically 60fps)\n2. 
RMS calculated from `AnalyserNode` frequency data\n3. When RMS drops below silence threshold for \u003e1 second after speech, segment is flushed\n4. Transcription runs while recording continues (no audio gaps)\n\nThis approach enables \"continuous dictation\" - speak, pause, watch it paste, speak again - without restarting.\n\n### Transcription Queue Pattern\n\n```typescript\ntranscriptionQueueRef.current = transcriptionQueueRef.current.then(async () =\u003e {\n  await processAudio(blob, isSegment, sessionMaxVolume);\n});\n```\n\n**Problem Solved**: Multiple VAD-triggered segments could race, causing out-of-order transcription.\n\n**Solution**: Promise chain ensures segments are processed sequentially while recording continues in parallel.\n\n### Why Metal GPU Acceleration\n\nwhisper.cpp supports Apple Metal for M1/M2/M3 GPU inference. Benefits:\n- 5-10x faster than CPU-only\n- Lower power consumption\n- Keeps CPU free for other tasks\n\nThe app dynamically resolves Metal shader paths for both development and packaged builds.\n\n### Hallucination Filtering\n\nWhisper occasionally generates phantom text on silence. Common patterns filtered:\n- CJK characters (when speaking English)\n- Repetitive words (\"the the the\")\n- Bracketed annotations (\"[MUSIC]\", \"[APPLAUSE]\")\n- Known hallucination phrases\n\nFiltering happens client-side before clipboard paste.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsofiashendi%2Fhush","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsofiashendi%2Fhush","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsofiashendi%2Fhush/lists"}