{"id":44130011,"url":"https://github.com/app-vox/vox","last_synced_at":"2026-03-16T17:07:02.532Z","repository":{"id":337287381,"uuid":"1152960251","full_name":"app-vox/vox","owner":"app-vox","description":"Your privacy-first voice-to-text tool. Local Whisper transcription with optional LLM correction so your audio never leaves your computer.","archived":false,"fork":false,"pushed_at":"2026-02-15T20:31:29.000Z","size":13714,"stargazers_count":12,"open_issues_count":47,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-02-15T20:47:19.071Z","etag":null,"topics":["dictation","electron","llm","macos","menu-bar-app","privacy","productivity","react","speech-recognition","typescript","voice-to-text","whisper"],"latest_commit_sha":null,"homepage":"https://app-vox.github.io/vox","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/app-vox.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-08T17:36:21.000Z","updated_at":"2026-02-15T12:51:18.000Z","dependencies_parsed_at":"2026-02-15T16:02:12.367Z","dependency_job_id":null,"html_url":"https://github.com/app-vox/vox","commit_stats":null,"previous_names":["app-vox/vox"],"tags_count":13,"template":false,"template_full_name":null,"purl":"pkg:github/app-vox/vox","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/app-vox%2Fvox","tags_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/app-vox%2Fvox/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/app-vox%2Fvox/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/app-vox%2Fvox/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/app-vox","download_url":"https://codeload.github.com/app-vox/vox/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/app-vox%2Fvox/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29639808,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-19T22:32:43.237Z","status":"online","status_checked_at":"2026-02-20T02:00:07.535Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dictation","electron","llm","macos","menu-bar-app","privacy","productivity","react","speech-recognition","typescript","voice-to-text","whisper"],"created_at":"2026-02-08T22:06:17.974Z","updated_at":"2026-03-16T17:07:02.525Z","avatar_url":"https://github.com/app-vox.png","language":"TypeScript","funding_links":["https://buymeacoffee.com/rodrigoluizs"],"categories":[],"sub_categories":[],"readme":"# Vox\n\nOpen-source voice-to-text app with local Whisper transcription and AI-powered 
correction.\n\n[![Website](https://img.shields.io/badge/Website-usevox.app-6366f1?style=flat\u0026logo=safari\u0026logoColor=white)](https://usevox.app/)\n[![CI](https://github.com/app-vox/vox/actions/workflows/ci.yml/badge.svg)](https://github.com/app-vox/vox/actions/workflows/ci.yml)\n[![Release](https://github.com/app-vox/vox/actions/workflows/release.yml/badge.svg)](https://github.com/app-vox/vox/actions/workflows/release.yml)\n[![codecov](https://codecov.io/gh/app-vox/vox/graph/badge.svg)](https://codecov.io/gh/app-vox/vox)\n[![License: FSL-1.1-ALv2](https://img.shields.io/badge/License-FSL--1.1--ALv2-blue.svg)](LICENSE)\n[![macOS](https://img.shields.io/badge/macOS-12.0+-000000?logo=apple\u0026logoColor=white)](https://www.apple.com/macos/)\n[![Buy Me a Coffee](https://img.shields.io/badge/Buy%20Me%20a%20Coffee-ffdd00?logo=buy-me-a-coffee\u0026logoColor=black)](https://buymeacoffee.com/rodrigoluizs)\n\nHold a keyboard shortcut, speak, and Vox transcribes your voice locally using [whisper.cpp](https://github.com/ggerganov/whisper.cpp), optionally corrects it with AI, and pastes the text into your active app.\n\n## Demo\n\n\u003cdiv align=\"center\"\u003e\n\n![Vox Demo](docs/images/demo.gif)\n\n\u003c/div\u003e\n\n\u003e ⚠️ **Platform Support**\n\u003e Vox currently runs on **macOS** (Apple Silicon and Intel). Cross-platform support for Windows and Linux is planned for future releases.\n\n## Table of Contents\n\n- [Quick Start](#quick-start)\n- [First Launch](#first-launch)\n- [Features](#features)\n- [Use Cases](#use-cases)\n- [How Vox Compares](#how-vox-compares)\n- [Requirements](#requirements)\n- [Configuration](#configuration)\n- [Usage](#usage)\n- [Development](#development)\n- [Contributing](#contributing)\n- [FAQ](#faq)\n- [License](#license)\n\n## Quick Start\n\nDownload the latest version from the [releases page](https://github.com/app-vox/vox/releases/latest) and drag `Vox.app` to your Applications folder.\n\n## First Launch\n\nWhen you first launch Vox, you'll need to:\n\n1. 
**Download a Whisper Model** — Go to Settings \u003e Local Model and download at least one speech recognition model. The \"small\" model (Recommended) is a good starting point.\n\n2. **Grant Permissions** — Vox needs:\n   - **Microphone**: Required for voice recording\n   - **Accessibility**: Required for keyboard shortcuts and auto-paste\n\n3. **Configure Shortcuts** (optional) — Customize keyboard shortcuts in Settings \u003e Shortcuts\n\n4. **Enable AI Improvements** (optional) — Configure LLM provider in Settings \u003e AI Improvements\n\nVox will guide you through this setup process with visual indicators showing what's incomplete.\n\nOnce configured, hold `Alt+Space` to start recording.\n\n## Features\n\n- **🔒 100% Local transcription** — Powered by whisper.cpp, audio stays on your device\n- **🤖 AI correction** — Removes filler words and fixes grammar (optional)\n- **⚙️ Custom prompts** — Tailor corrections for medical, technical, creative, or any workflow\n- **⌨️ Hold or toggle modes** — Press-and-hold or toggle recording on/off\n- **📋 Auto-paste** — Text appears in your focused app via Cmd+V\n- **🎯 Multiple models** — Choose speed vs accuracy (tiny to large)\n- **☁️ Multiple LLM providers** — OpenAI-compatible or AWS Bedrock\n- **🎨 Menu bar app** — Runs quietly in the background with dark/light mode support\n\n## Use Cases\n\n### 👨‍⚕️ Medical Professionals\nPreserve medical terminology and standard abbreviations. Vox understands context and won't autocorrect \"OA\" to \"okay\" or \"PT\" to \"patient.\"\n\n**Example custom prompt:**\n\u003e \"Preserve medical terminology, standard abbreviations (e.g., OA, PT, BP), and format as clinical notes.\"\n\n### 👨‍💻 Developers \u0026 Engineers\nFormat technical dictation as concise documentation. Remove filler words while keeping technical terms intact.\n\n**Example custom prompt:**\n\u003e \"Format as technical documentation. 
Be concise, remove filler words, preserve code terms and abbreviations.\"\n\n### ✍️ Writers \u0026 Content Creators\nEnhance prose while maintaining your unique voice. Turn spoken ideas into polished text ready for editing.\n\n**Example custom prompt:**\n\u003e \"Enhance prose for readability while maintaining the author's voice. Fix grammar but keep the casual tone.\"\n\n### 🌍 Language Learners\nPractice speaking by translating and correcting your speech in real-time.\n\n**Example custom prompt:**\n\u003e \"Translate to German and correct grammar. Output only the German translation.\"\n\n### 📝 Note-Taking \u0026 Productivity\nCapture thoughts quickly without typing. Perfect for meetings, brainstorming, or journaling.\n\n## How Vox Compares\n\n| Feature | Vox | Dragon NaturallySpeaking | macOS Dictation | Whisper Desktop Apps |\n|---------|-----|--------------------------|-----------------|---------------------|\n| **Price** | Free \u0026 Open Source | $300+ | Free (limited) | Varies ($0-50) |\n| **Privacy** | 100% Local | Cloud-based | Cloud-based | Mostly local |\n| **Custom Prompts** | ✅ Full control | ❌ Limited | ❌ None | ⚠️ Some apps |\n| **AI Enhancement** | ✅ Your own API | ❌ None | ⚠️ Basic | ⚠️ Varies |\n| **Offline Mode** | ✅ Full | ⚠️ Limited | ❌ Requires internet | ✅ Most |\n| **macOS Native** | ✅ Menu bar app | ⚠️ Full app | ✅ Built-in | ✅ Varies |\n| **Custom Shortcuts** | ✅ Configurable | ✅ Yes | ⚠️ Limited | ✅ Most |\n| **Open Source** | ✅ FSL-1.1-ALv2 | ❌ Proprietary | ❌ Proprietary | ⚠️ Some |\n\n**Why Vox?**\n- **Privacy-first**: Your audio never leaves your device\n- **Flexibility**: Use any OpenAI-compatible LLM or AWS Bedrock\n- **Customization**: Tailor AI corrections to your exact needs\n- **Free \u0026 Open**: No subscription, no cloud lock-in\n\n## Requirements\n\n- **macOS** (Apple Silicon or Intel)\n- **LLM provider** (optional) — for text correction:\n  - OpenAI-compatible endpoint with API key\n  - Or AWS Bedrock credentials with model 
access\n\n## Configuration\n\n### Whisper Models\n\nDownload at least one model from the Whisper tab:\n\n| Model  | Size    | Speed  | Accuracy |\n|--------|---------|--------|----------|\n| tiny   | ~75 MB  | Fastest| Lower    |\n| base   | ~140 MB | Fast   | Decent   |\n| small  | ~460 MB | Good   | Good     |\n| medium | ~1.5 GB | Slow   | Better   |\n| large  | ~3 GB   | Slowest| Best     |\n\n### LLM Provider\n\n**Foundry (OpenAI-compatible)**\n- Endpoint URL\n- API key\n- Model name (e.g., `gpt-4o`)\n\n**AWS Bedrock**\n- AWS region\n- Credentials (access key, profile, or default chain)\n- Model ID (e.g., `anthropic.claude-3-5-sonnet-20241022-v2:0`)\n\n### Shortcuts\n\nCustomize keyboard shortcuts in the Shortcuts tab:\n- **Hold mode** (default: `Alt+Space`)\n- **Toggle mode** (default: `Alt+Shift+Space`)\n\n## Usage\n\nOnce configured, Vox runs as a menu bar icon.\n\nPress your shortcut to record. The floating indicator shows:\n- **Red** — Recording\n- **Yellow** — Transcribing\n- **Blue** — Correcting (if LLM enabled)\n\nRelease (hold mode) or press again (toggle mode) to stop. Text is pasted automatically.\n\nIf correction fails, raw transcription is used. If transcription is empty (silence/noise), nothing is pasted.\n\n## Development\n\n### Setup\n\n\u003e Requires [cmake](https://cmake.org/download).\n\n```bash\ngit clone https://github.com/app-vox/vox.git\ncd vox\nmake install   # installs npm deps + builds whisper.cpp\n```\n\n### Run\n\n```bash\nmake dev        # development with hot reload\nnpm test        # run tests\nnpm run dist    # build production app\n```\n\nBuilt with Electron, React, TypeScript, and whisper.cpp.\n\n## Contributing\n\nContributions welcome! To contribute:\n\n1. Fork and create a feature branch\n2. Make your changes\n3. Run `npm run typecheck \u0026\u0026 npm run lint \u0026\u0026 npm test`\n4. Commit with [Conventional Commits](https://www.conventionalcommits.org/) (e.g., `feat(audio): add noise gate`)\n5. 
Open a pull request\n\n⚠️ See more details in [CONTRIBUTING.md](CONTRIBUTING.md).\n\n## FAQ\n\n### Is Vox really free?\nYes, Vox is 100% free and open-source. Transcription runs locally using whisper.cpp. If you use optional AI enhancement, you'll need your own API keys (OpenAI-compatible or AWS Bedrock), but there are no fees from Vox.\n\n### Does my audio leave my device?\nNo. Transcription happens entirely on your Mac. Only if you enable AI enhancement does the *text* (not audio) get sent to your configured LLM provider for correction. Your audio recordings never leave your device.\n\n### What's the difference between local transcription and AI enhancement?\n- **Local transcription**: whisper.cpp converts your speech to text on your device. Fast, accurate, 100% private.\n- **AI enhancement** (optional): Sends the transcribed *text* to an LLM to remove filler words (\"um\", \"uh\"), fix grammar, or apply custom corrections based on your prompt.\n\n### Which Whisper model should I use?\n- **Small** (~460MB): Best balance of speed and accuracy. Recommended for most users.\n- **Tiny/Base**: Faster but less accurate. Good for quick notes.\n- **Medium/Large**: Slower but more accurate. Good for technical/medical content or noisy environments.\n\nYou can switch models anytime in Settings.\n\n### Can I use Vox with Claude/ChatGPT/other LLMs?\nYes! Vox works with:\n- **OpenAI-compatible APIs**: OpenAI, Anthropic (via Bedrock), OpenRouter, local LLMs with OpenAI-compatible endpoints\n- **AWS Bedrock**: Claude, Llama, Mistral, and other Bedrock models\n\n### Does Vox work offline?\nYes. Local transcription works 100% offline. AI enhancement requires internet (since it calls your LLM provider API), but you can disable it and use raw transcription offline.\n\n### Why does Vox need Accessibility permissions?\nVox needs Accessibility access to:\n1. Listen for your custom keyboard shortcuts globally\n2. 
Simulate `Cmd+V` to paste transcribed text into your active app\n\nWithout this, Vox can't detect shortcuts or auto-paste text.\n\n### Can I contribute to Vox?\nAbsolutely! Vox is open-source. See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines. We welcome bug reports, feature requests, and pull requests.\n\n### What about Windows and Linux?\nmacOS-only for now, but cross-platform support is planned. Follow the repo for updates!\n\n## License\n\nThis project is licensed under the [Functional Source License, Version 1.1, ALv2 Future License](LICENSE).\n\nYou can use, modify, and redistribute the code for any purpose **except** building a competing commercial product or service. After two years, each release automatically converts to the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).\n\nSee [LICENSE](LICENSE) for full details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapp-vox%2Fvox","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapp-vox%2Fvox","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapp-vox%2Fvox/lists"}