https://github.com/jstarfilms/koe
Lightning-fast, privacy-focused voice dictation for desktop and mobile. Electron + Expo, powered by Groq.
- Host: GitHub
- URL: https://github.com/jstarfilms/koe
- Owner: JStaRFilms
- Created: 2026-02-28T11:23:13.000Z (3 months ago)
- Default Branch: master
- Last Pushed: 2026-03-19T20:03:37.000Z (2 months ago)
- Last Synced: 2026-03-20T11:15:15.128Z (2 months ago)
- Topics: android, desktop-app, electron, expo, expo-router, groq, ios, mobile-app, offline-vad, onnxruntime, react, react-native, speech-to-text, vibecoding, voice-dictation, whisper
- Language: TypeScript
- Homepage: https://koe-pink.vercel.app
- Size: 7.71 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Audit: audit_report.js
README
# Koe (声)
Lightning-Fast, Privacy-First Voice Dictation for Windows, iOS, and Android
[Releases](https://github.com/JStaRFilms/Koe/releases) · [License](LICENSE) · [Electron](https://electronjs.org/) · [Groq](https://groq.com/)
---
## What is Koe?
**Koe** (声, Japanese for "voice") is a free, open-source alternative to subscription-based voice dictation tools. Press a hotkey (desktop) or tap a button (mobile), speak naturally, and get AI-polished text typed at your cursor or copied to your clipboard.
Instead of charging a monthly fee, Koe runs on your own [Groq API key](https://console.groq.com/keys) (bring your own key, BYOK), which stays free for up to 8 hours of transcription a day on Groq's free tier.
### Why Koe?
| Feature | WhisperFlow ($8+/mo) | Built-in OS Dictation | **Koe (Free)** |
|---------|---------------------|----------------------|----------------|
| Cost | Subscription | Free | **Free (BYOK)** |
| Accuracy | High | Poor | **High (Whisper)** |
| AI Enhancement | Yes | No | **Yes** |
| Privacy | Cloud audio | Local | **Local VAD + BYOK** |
| Global Hotkey | Yes | Limited | **Yes** |
| Auto-Paste | Yes | No | **Yes** |
---
## Features
- **Cross-Platform** — Native performance on Windows (Desktop) and iOS/Android (Mobile)
- **Global Hotkey (Desktop)** — Press `Ctrl + Shift + Space` anywhere to start or stop dictation
- **Clipboard-First (Mobile)** — High-fidelity audio capture with instant polished results copied to your clipboard
- **Pause Naturally** — Koe keeps listening through short pauses instead of treating every breath like the end of a recording
- **Rolling Segments** — Long recordings are processed in the background as ordered chunks, so performance stays fast even on longer sessions
- **Instant Transcription** — Groq Whisper handles speech-to-text at high speed
- **AI Text Enhancement** — Each segment is refined before it is committed, so only polished text is returned
- **Auto-Type (Desktop)** — Refined text is typed progressively into the focused text field while you are still talking
- **Minimalist UI** — A premium, high-contrast interface designed for focus and speed
- **Transcription History** — One-click copy and retry for saved transcripts
- **Usage Dashboard** — Track daily audio seconds, request pressure, and queue activity
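
To make "Pause Naturally" and "Rolling Segments" concrete, here is a minimal sketch of pause-tolerant segmentation. It is an illustration only, not Koe's code: the function name, the frame length, and the pause threshold are all assumptions, and the input is a plain array of per-frame speech flags rather than real VAD output.

```typescript
// Hypothetical values; Koe's real thresholds are not documented here.
const FRAME_MS = 30;       // assumed length of one analysis frame
const MAX_PAUSE_MS = 900;  // silence shorter than this does not end a segment

interface Segment {
  index: number;   // capture order, used later to keep output ordered
  startMs: number;
  endMs: number;
}

// Splits a stream of per-frame speech flags into ordered segments.
// A segment is closed only after MAX_PAUSE_MS of continuous silence.
function segmentFrames(isSpeech: boolean[]): Segment[] {
  const segments: Segment[] = [];
  let startMs: number | null = null;
  let lastSpeechEndMs = 0;

  for (let i = 0; i < isSpeech.length; i++) {
    const frameEndMs = (i + 1) * FRAME_MS;
    if (isSpeech[i]) {
      if (startMs === null) startMs = i * FRAME_MS;
      lastSpeechEndMs = frameEndMs;
    } else if (startMs !== null && frameEndMs - lastSpeechEndMs >= MAX_PAUSE_MS) {
      segments.push({ index: segments.length, startMs, endMs: lastSpeechEndMs });
      startMs = null;
    }
  }
  // Flush the segment that was still open when recording stopped.
  if (startMs !== null) {
    segments.push({ index: segments.length, startMs, endMs: lastSpeechEndMs });
  }
  return segments;
}
```

With these assumed values, a 300 ms breath between two phrases stays inside one segment, while a silence of 900 ms or more closes the segment so it can be sent off for transcription while recording continues.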
---
## Installation
### Desktop (Windows)
1. Download the latest `.exe` from [Releases](https://github.com/JStaRFilms/Koe/releases).
2. Install and launch. Koe will live in your system tray.
### Mobile (iOS & Android)
1. Clone the repo and navigate to `apps/mobile`.
2. Install [Expo Go](https://expo.dev/go) on your device.
3. Run `pnpm dev:mobile` and scan the QR code.
*Note: Native builds (`.ipa`/`.apk`) can be generated with Expo Application Services (EAS).*
### Build Everything from Source
```bash
# Clone the repository
git clone https://github.com/JStaRFilms/Koe.git
cd Koe
# Install all dependencies (Monorepo)
pnpm install
# Run Desktop
pnpm dev
# Build for production
pnpm build
# Run Mobile
pnpm dev:mobile
```
If `pnpm dev` fails with `Electron failed to install correctly`, pnpm likely skipped Electron's install script during dependency setup. This repo now allowlists the required build/install scripts for pnpm 10+, and an existing checkout can be repaired with:
```bash
pnpm rebuild electron esbuild protobufjs electron-winstaller
```
### Release Builds
- Real release artifacts should be built on GitHub Actions, not locally
- Push a matching version tag such as `v1.1.3` after updating `package.json`
- The release workflow builds for Windows and macOS and attaches the artifacts to that GitHub Release
- See [docs/release-process.md](docs/release-process.md)
### Vercel Deployment
- The marketing website is the Next.js app in `koe-website/`
- In Vercel Project Settings -> Build and Deployment -> Root Directory, set the root to `koe-website`
- Leave the framework as Next.js for that project
- A root-level `vercel.json` is also included as a fallback so root builds target `koe-website`
### Requirements
- Windows 10/11 (for Desktop) or iOS/Android (for Mobile)
- [Groq API Key](https://console.groq.com/keys) (free tier available)
- Microphone access
---
## Quick Start
1. **Launch** Koe — it minimizes to your system tray
2. **Configure** — Right-click the tray icon → **Settings** → Enter your Groq API key
3. **Dictate** — Click any text field and press `Ctrl + Shift + Space`
4. **Speak** — The pill UI appears. Talk naturally, including pauses
5. **Done** — Press the hotkey again when you're finished. Koe finalizes the session, copies the full refined transcript, and keeps it in history
---
## Usage Guide
### Global Hotkeys
| Action | Shortcut |
|--------|----------|
| Start / Stop Recording | `Ctrl + Shift + Space` |
| Retry Last Failed / Latest Transcript | `Ctrl + Shift + ,` |
| Open Settings | Tray menu |
### How Recording Works
- Koe records one continuous session until you stop it
- Internally, it breaks longer recordings into ordered segments
- Segments are transcribed and refined in the background
- Refined text is typed in order as it becomes ready
- When the session ends, Koe keeps one full final transcript in clipboard and history
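
The ordering guarantee in the list above can be sketched as a small reorder buffer. This is a minimal illustration under stated assumptions, not Koe's implementation: `OrderedCommitter` and its methods are hypothetical names, and each segment is assumed to carry a sequential index assigned at capture time.

```typescript
// Hypothetical sketch: segments may finish refinement out of order,
// but their text is released strictly in capture order.
class OrderedCommitter {
  private nextIndex = 0;                        // next segment allowed to commit
  private pending = new Map<number, string>();  // finished segments waiting their turn
  private committed: string[] = [];             // text already released, in order
  private onCommit: (text: string) => void;

  constructor(onCommit: (text: string) => void) {
    this.onCommit = onCommit;
  }

  // Called whenever a segment's refined text becomes ready.
  complete(index: number, refinedText: string): void {
    this.pending.set(index, refinedText);
    // Release every consecutive segment starting at nextIndex.
    while (this.pending.has(this.nextIndex)) {
      const text = this.pending.get(this.nextIndex)!;
      this.pending.delete(this.nextIndex);
      this.committed.push(text);
      this.onCommit(text);
      this.nextIndex += 1;
    }
  }

  // Full transcript for clipboard and history once the session ends.
  finalTranscript(): string {
    return this.committed.join(" ");
  }
}

// Usage: segment 1 finishes first but is held until segment 0 arrives.
const typedSoFar: string[] = [];
const committer = new OrderedCommitter((text) => typedSoFar.push(text));
committer.complete(1, "world.");
committer.complete(0, "Hello");
committer.complete(2, "Done.");
```

Holding early results in a map keeps background work parallel while the text that reaches the focused field always appears in spoken order.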
### The Pill UI
The floating pill is designed to stay out of the way while still telling you what matters:
- **Idle** — Waiting for the next dictation
- **Listening** — Live voice levels and active recording state
- **Warning** — Mic fallback or chunk failure without immediately killing the session
- **Processing** — Finalizing remaining work after you stop
- **Complete** — Brief success state before hiding
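
One way to picture these states is as a small state machine. The sketch below is hypothetical: the state names mirror the list above, but the event names and the allowed transitions are assumptions made for illustration, not taken from Koe's source.

```typescript
// State names follow the list above; events and transitions are assumed.
type PillState = "idle" | "listening" | "warning" | "processing" | "complete";
type PillEvent = "start" | "problem" | "recovered" | "stop" | "finished" | "hidden";

// Allowed transitions. Any event not listed for a state is ignored.
const transitions: Record<PillState, Partial<Record<PillEvent, PillState>>> = {
  idle: { start: "listening" },
  listening: { problem: "warning", stop: "processing" },
  // A warning does not end the session: recording can recover or be stopped normally.
  warning: { recovered: "listening", stop: "processing" },
  processing: { finished: "complete" },
  complete: { hidden: "idle" },
};

function nextPillState(state: PillState, event: PillEvent): PillState {
  return transitions[state][event] ?? state;
}
```

Keeping the transitions in one table makes it easy to see that a warning never jumps straight back to idle, which matches the "without immediately killing the session" behavior described above.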
### Settings
Configure via the settings window (right-click the tray icon):
| Setting | Description | Default |
|---------|-------------|---------|
| Groq API Key | Your API key from [console.groq.com](https://console.groq.com/keys) | — |
| Language | Transcription language (`auto` for detection) | `auto` |
| Prompt Style | How Koe refines the transcript | `Clean` |
| Auto-Paste | Automatically type into the focused window | `enabled` |
| Theme | Dark / Light mode | `dark` |
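
For readers extending the settings, the table above maps naturally onto a typed object with defaults. The shape below is a sketch: the field names, `resolveSettings`, and `isReadyToDictate` are hypothetical and may not match what Koe actually stores.

```typescript
// Hypothetical settings shape; defaults are copied from the table above.
interface KoeSettings {
  groqApiKey: string;
  language: string;          // "auto" asks the transcription service to detect the language
  promptStyle: string;
  autoPaste: boolean;
  theme: "dark" | "light";
}

const defaultSettings: KoeSettings = {
  groqApiKey: "",            // no default: the user must supply a key
  language: "auto",
  promptStyle: "Clean",
  autoPaste: true,
  theme: "dark",
};

// Merge whatever was saved on disk over the defaults.
function resolveSettings(saved: Partial<KoeSettings>): KoeSettings {
  return { ...defaultSettings, ...saved };
}

// Dictation cannot start until a non-empty API key is present.
function isReadyToDictate(settings: KoeSettings): boolean {
  return settings.groqApiKey.trim().length > 0;
}
```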
---
## Architecture
Koe uses a shared core architecture to ensure consistency across Desktop and Mobile. Business logic lives in `@koe/core`, while platform-specific drivers handle audio and output.
See the [Detailed Architecture Guide](docs/Architecture.md) for more info.
### Platform Specifics
| Feature | Desktop (Windows) | Mobile (iOS/Android) |
|---------|-------------------|----------------------|
| **Trigger** | Global Hotkey | Capture Button |
| **Output** | Auto-Paste / Type | Clipboard-First |
| **Storage** | `electron-store` | `SecureStore` |
| **Capture Logic** | Local VAD + ordered segments | Metering-driven chunk rotation + ordered segments |
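
The split between shared logic and platform drivers can be sketched with a single output interface. Everything here is hypothetical: `OutputDriver`, the two in-memory stand-ins, and `commitSegments` are illustrative names, not the actual `@koe/core` API, and the stand-ins replace real typing and clipboard calls so the sketch runs without Electron or Expo.

```typescript
// Hypothetical contract that shared logic could depend on.
interface OutputDriver {
  readonly platform: "desktop" | "mobile";
  deliver(text: string): void;
}

// In-memory stand-in for desktop auto-type: appends text as it arrives.
class FakeAutoTypeOutput implements OutputDriver {
  readonly platform = "desktop" as const;
  typed = "";
  deliver(text: string): void {
    this.typed += (this.typed ? " " : "") + text;
  }
}

// In-memory stand-in for mobile clipboard-first output:
// the clipboard always holds the full transcript so far.
class FakeClipboardOutput implements OutputDriver {
  readonly platform = "mobile" as const;
  clipboard = "";
  private parts: string[] = [];
  deliver(text: string): void {
    this.parts.push(text);
    this.clipboard = this.parts.join(" ");
  }
}

// Shared logic: knows nothing about Electron or Expo, only the interface.
function commitSegments(refined: string[], output: OutputDriver): void {
  for (const text of refined) {
    const trimmed = text.trim();
    if (trimmed.length > 0) output.deliver(trimmed);
  }
}
```

Because `commitSegments` only sees the interface, the same session logic can drive either platform, which is the consistency benefit the shared core is meant to provide.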
### Privacy-First Design
1. **Desktop speech detection** runs locally using ONNX WebAssembly
2. **Mobile recording control** stays on-device until a chunk is ready to transcribe
3. **Retry audio** is stored only for failed or unresolved segments
4. **Your API key** is stored locally on each platform and only used for transcription/refinement requests
---
## Tech Stack
| Layer | Technology |
|-------|------------|
| **Framework** | Electron + Vite |
| **Frontend** | Vanilla JavaScript, Custom CSS |
| **Audio Capture** | Web Audio API |
| **Voice Detection** | `@ricky0123/vad-web` (Silero VAD) |
| **Transcription** | Groq Whisper API (`whisper-large-v3-turbo`) |
| **Text Enhancement** | Groq chat refinement pipeline |
| **Storage** | `electron-store` + temp retry files |
| **Packaging** | `electron-builder` |
---
## Groq API Limits
Koe is designed to stay inside Groq's free-tier limits:
| Metric | Limit | Approximate Usage |
|--------|-------|-------------------|
| Requests per minute | 20 | ~6 transcribed segments/minute with paced refinement |
| Requests per day | 2,000 | ~8 hours of normal dictation |
| Audio per day | 28,800 sec | 8 hours |
The built-in scheduler tracks request pressure and keeps the app responsive while staying inside the cap.
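
As a rough illustration of how a scheduler can stay inside these caps, the sketch below tracks a sliding one-minute request window plus daily counters. The limits are copied from the table above (28,800 seconds is 8 hours × 3,600 seconds); the class and method names are hypothetical, the daily reset is omitted, and treating refinement calls as zero-audio requests against the same caps is an assumption.

```typescript
// Limits copied from the table above; everything else is assumed.
const REQUESTS_PER_MINUTE = 20;
const REQUESTS_PER_DAY = 2000;
const AUDIO_SECONDS_PER_DAY = 28800; // 8 hours * 3600 seconds

class FreeTierBudget {
  private recentRequests: number[] = []; // timestamps (ms) within the last minute
  private requestsToday = 0;
  private audioSecondsToday = 0;

  // Returns true and records the usage if the request fits every limit.
  // Pass audioSeconds = 0 for requests that carry no audio.
  tryAcquire(nowMs: number, audioSeconds: number): boolean {
    // Drop timestamps that have left the 60-second window.
    this.recentRequests = this.recentRequests.filter((t) => nowMs - t < 60000);
    if (this.recentRequests.length >= REQUESTS_PER_MINUTE) return false;
    if (this.requestsToday >= REQUESTS_PER_DAY) return false;
    if (this.audioSecondsToday + audioSeconds > AUDIO_SECONDS_PER_DAY) return false;

    this.recentRequests.push(nowMs);
    this.requestsToday += 1;
    this.audioSecondsToday += audioSeconds;
    return true;
  }

  remainingAudioHours(): number {
    return (AUDIO_SECONDS_PER_DAY - this.audioSecondsToday) / 3600;
  }
}
```

A caller that gets `false` would queue the segment and retry once the per-minute window clears, which is the kind of request pressure the usage dashboard reports.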
---
## Roadmap
### Completed
- [x] Global hotkey toggle (Desktop)
- [x] Local VAD speech detection
- [x] Groq Whisper transcription
- [x] AI transcript refinement
- [x] Auto-paste to focused window (Desktop)
- [x] Transcription history & Usage dashboard
- [x] Mobile App (iOS/Android V1)
- [x] Shared Core Extraction
### Planned
- [x] Custom AI prompts
- [x] Keyboard shortcut customization
- [ ] Export history as `.txt` / `.md`
- [x] Native macOS support (Electron)
- [ ] Android IME (Custom Keyboard) implementation
### Future
- [ ] Snippet library with voice shortcuts
- [ ] App-specific tone profiles
- [ ] Cloud sync across devices
- [ ] Team collaboration features
See [Feature Requests](docs/issues/) for the full backlog.
---
## Contributing
Contributions are welcome. Please see the repo docs and existing code patterns before opening a PR.
### Development Setup
```bash
# Fork and clone
git clone https://github.com/your-username/Koe.git
cd Koe
# Install dependencies
pnpm install
# Start development
pnpm dev
```
### Monorepo Structure
Koe is transitioning to a monorepo to support multiple platforms:
- **Root**: Legacy Electron Desktop app and shared workspace configuration
- **`apps/mobile`**: Expo-based mobile client (iOS/Android)
- **`packages/koe-core`**: Shared business logic, types, and API services
### Development Commands
| Target | Command | Description |
|--------|---------|-------------|
| **Desktop** | `pnpm dev` | Start the Electron app in dev mode |
| **Mobile** | `pnpm dev:mobile` | Start the Expo development server |
| **Core** | `pnpm build:core` | Build the shared logic package |
| **All** | `pnpm type-check` | Run type-checking across all packages |
### Project Structure
```text
Koe/
├── apps/ # Application projects
│ └── mobile/ # Expo mobile app
├── packages/ # Shared logic
│ └── koe-core/ # Core services (Whisper, Sessions)
├── src/ # Legacy Desktop source
│ ├── main/ # Electron main process
│ └── renderer/ # UI code
├── docs/ # Documentation & Tasks
├── pnpm-workspace.yaml # Workspace config
└── package.json # Root manifest & scripts
```
---
## Troubleshooting
### "No audio detected"
- Ensure microphone permissions are granted in Windows Settings
- Check that your default recording device is selected
- If another app is holding the mic, Koe will try another available input and warn you in the pill UI
### "API rate limit exceeded"
- Wait for the per-minute window to clear
- Check the usage dashboard for queue pressure
- Very long continuous dictation can still pile up requests on the free tier
### "Auto-paste not working"
- Some applications block simulated keystrokes
- Disable auto-paste in Settings and use `Ctrl + V` manually
- Run Koe as administrator if the issue persists
### App won't launch
- Ensure you're on Windows 10/11 64-bit
- Check that [Visual C++ Redistributables](https://aka.ms/vs/17/release/vc_redist.x64.exe) are installed
- Check Windows Event Viewer for crash details
---
## Acknowledgments
- **Groq** — For the fast Whisper and chat APIs
- **Silero** — For the VAD model
- **@ricky0123** — For the `vad-web` library
- **WhisperFlow** — For helping prove the category exists
---
## License
Koe is licensed under the ISC License. See the [LICENSE](LICENSE) file for details.
---