https://github.com/saurabhav88/enviouswispr

Free, sub-second on-device AI dictation for macOS. Dual engines (Whisper + Parakeet) on Apple Silicon, polished by Apple Intelligence. No cloud, no account, no subscription.
accessibility apple-silicon coreml dictation macos on-device-ai open-source parakeet privacy productivity speech-to-text swift voice-typing whisper


Host: GitHub
URL: https://github.com/saurabhav88/enviouswispr
Owner: saurabhav88
License: other
Created: 2026-02-18T23:28:05.000Z (4 months ago)
Default Branch: main
Last Pushed: 2026-06-06T21:43:31.000Z (20 days ago)
Last Synced: 2026-06-06T22:18:33.927Z (20 days ago)
Topics: accessibility, apple-silicon, coreml, dictation, macos, on-device-ai, open-source, parakeet, privacy, productivity, speech-to-text, swift, voice-typing, whisper
Language: Swift
Homepage: https://enviouswispr.com
Size: 277 MB
Stars: 8
Watchers: 0
Forks: 0
Open Issues: 78
Metadata Files:
- Readme: README.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS

EnviousWispr

Talk naturally. Paste perfectly.

Free, on-device AI dictation and speech-to-text for macOS.

Powered by Apple Silicon. No cloud, no account, and your voice never leaves your Mac.

---

## Demo

https://github.com/user-attachments/assets/e636e1a0-a0d1-4f7c-be0a-b7c907c6d5ab

## What is this?

EnviousWispr is a free AI dictation app for macOS that runs entirely on-device. It uses Whisper and Parakeet speech-to-text models on Apple Silicon to transcribe your voice locally, polishes the output with an optional LLM, and pastes clean text into whatever app you're working in. Transcription is sub-second; with optional AI polish, the full hotkey-to-paste flow typically lands in around a second and a half.

No cloud. No account required. No subscription. No audio ever leaves your Mac. Works fully offline.

## Why EnviousWispr?

| | EnviousWispr | Cloud dictation services |
|---|---|---|
| **Privacy** | 100% on-device transcription | Audio uploaded to servers |
| **Speed** | Sub-second transcription, paste-on-stop | Network round-trip latency |
| **Models** | Parakeet v3 (NVIDIA NeMo) + WhisperKit (OpenAI Whisper) | Single vendor model |
| **Polish** | Optional. Fully on-device (Apple Intelligence, Ollama) or bring-your-own-key cloud (GPT, Gemini) | Cloud polish, included in subscription |
| **Cost** | Free. No account, no subscription | Monthly subscription |
| **Works offline** | Yes, fully functional without internet | No |

## How it works

```
Press hotkey --> Record --> Transcribe --> Polish (optional) --> Paste
~0ms             live       ~400-800ms     ~200-500ms            instant
```
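
As a rough worked example, the per-stage figures in the diagram can be summed into an end-to-end range for the time after you stop speaking. This is only a sketch: the `StageBudget` type and all names are illustrative and not taken from the EnviousWispr codebase, and the numbers are copied from the diagram rather than measured.

```swift
// Stage latency ranges in milliseconds, copied from the diagram above.
// `StageBudget` is a hypothetical type used only for this illustration.
struct StageBudget {
    let name: String
    let range: ClosedRange<Int>
}

let postRecordingStages = [
    StageBudget(name: "Transcribe", range: 400...800),
    StageBudget(name: "Polish (optional)", range: 200...500),
]

/// Sums the lower and upper bounds of each stage to get an end-to-end range.
func totalBudget(_ stages: [StageBudget]) -> ClosedRange<Int> {
    let low = stages.reduce(0) { $0 + $1.range.lowerBound }
    let high = stages.reduce(0) { $0 + $1.range.upperBound }
    return low...high
}

let withPolish = totalBudget(postRecordingStages)
let withoutPolish = totalBudget(Array(postRecordingStages.prefix(1)))
print("With polish: \(withPolish.lowerBound)-\(withPolish.upperBound) ms")
print("Without polish: \(withoutPolish.lowerBound)-\(withoutPolish.upperBound) ms")
```

With polish enabled the two stages add up to roughly 600-1300 ms; with polish off, only the transcription range applies.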

1. **Press your hotkey** from any app. Push-to-talk, toggle, or hands-free (double-press to lock for long-form), your choice.
2. **Speak naturally.** Silero VAD detects when you stop talking and ends recording automatically.
3. **On-device transcription.** Choose Parakeet v3 (fastest, 25 European languages) or WhisperKit (99 languages, with automatic language detection).
4. **AI polish** (optional). Clean up grammar, punctuation, and formatting. Runs fully on-device with Apple Intelligence (macOS 26+) or Ollama, or in the cloud via OpenAI or Gemini with your own API key.
5. **Text lands in your clipboard** and optionally auto-pastes into the active app.

> See the full interactive pipeline demo at [enviouswispr.com/how-it-works](https://enviouswispr.com/how-it-works)

## Supported Models

| Model | Best for | Languages | Disk space | Hardware |
|---|---|---|---|---|
| **Parakeet TDT v3** | Fastest dictation (default) | 25 European languages | ~460 MB | Apple Silicon |
| **WhisperKit** (Whisper Large v3 Turbo) | Broadest language coverage and automatic language detection | 99 languages | ~1.6 GB | Apple Silicon |

Both models run entirely on-device using CoreML. First launch downloads and compiles the model; subsequent launches are instant.

## Features

- 🎙️ **Dual ASR engines** with [Parakeet v3](https://github.com/FluidInference/FluidAudio) (NVIDIA NeMo) and [WhisperKit](https://github.com/argmaxinc/argmax-oss-swift) (OpenAI Whisper)
- ✨ **AI polish that respects your words** — strips filler words and false starts, fixes grammar and punctuation, formats numbers, dates, and URLs, and honors your custom vocabulary, all in your spoken language (never translated or rewritten)
- 🔒 **Polish that can stay private** — run it fully on-device with Apple Intelligence (macOS 26+) or Ollama, or in the cloud via OpenAI GPT / Google Gemini with your own API key
- 🌍 **Multilingual with automatic language detection** — speak in any supported language and EnviousWispr detects it, then offers to lock it in for faster, more accurate transcription
- 😀 **Speak an emoji** — say the emoji's name followed by "emoji" (like "thumbs up emoji") and the glyph drops right in
- ✋ **Voice Activity Detection** via Silero VAD — stops recording automatically when you stop talking
- 📚 **Custom vocabulary** for names, brands, and technical terms the ASR might miss
- ⌨️ **Global hotkey** with push-to-talk, toggle, and hands-free modes (double-press to lock for long-form dictation)
- 📋 **Auto-paste** directly into the active app, or just copy to clipboard
- 🕘 **Transcript history** for browsing, searching, and reviewing past dictations, with one-click re-polish to re-run AI cleanup on any past transcript
- 🧭 **Menu bar native** with minimal footprint
- 🔄 **Auto-updates** via Sparkle
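
To make the "speak an emoji" feature concrete, here is a minimal sketch of how a spoken phrase such as "thumbs up emoji" could be swapped for its glyph. The lookup table and the `replaceSpokenEmoji` function are assumptions made for illustration; the README does not show how the app implements this.

```swift
import Foundation

// Hypothetical name-to-glyph table; the app's real table is not shown in this README.
let emojiTable: [String: String] = [
    "thumbs up": "👍",
    "red heart": "❤️",
    "party popper": "🎉",
]

/// Replaces every "<name> emoji" phrase with its glyph, ignoring case.
/// Longer names are tried first so a short name never pre-empts a longer one.
func replaceSpokenEmoji(in transcript: String) -> String {
    var result = transcript
    for name in emojiTable.keys.sorted(by: { $0.count > $1.count }) {
        guard let glyph = emojiTable[name] else { continue }
        result = result.replacingOccurrences(
            of: "\(name) emoji",
            with: glyph,
            options: [.caseInsensitive]
        )
    }
    return result
}

print(replaceSpokenEmoji(in: "Sounds good thumbs up emoji"))
```

This version does plain substring matching, so it ignores word boundaries; a production implementation would likely need to be stricter.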

## Quick Start

1. Download [EnviousWispr.dmg](https://github.com/saurabhav88/EnviousWispr/releases/latest/download/EnviousWispr.dmg) from the latest release
2. Drag to Applications, launch
3. Grant **Microphone** and **Accessibility** permissions when prompted, plus **Automation** the first time the paste fallback is used
4. Set your preferred hotkey in Settings > Shortcuts
5. Start talking

**Optional:** Turn on AI polish in Settings > AI Polish. Keep it fully on-device with Apple Intelligence (macOS 26+) or Ollama, or add an OpenAI or Gemini API key.

## Requirements

- macOS 14 (Sonoma) or later
- Apple Silicon (M1 or newer)

## Building from Source

```bash
git clone https://github.com/saurabhav88/EnviousWispr.git
cd EnviousWispr
swift build # compiles the Swift packages (dependencies resolve via SPM)
```

`swift build` compiles the packages but does not produce a runnable `.app`; the app bundle is assembled by the Xcode build engine via Tuist. Use `./scripts/build-dev-app.sh` for a local dev build, or the release path below. The first build takes several minutes while the ML models compile.

For a distributable `.app` bundle and DMG:

```bash
./scripts/build-release-dmg.sh
```

The release build runs on the Xcode engine via Tuist, so it requires full Xcode (26+) plus mise and Tuist; set `CODESIGN_IDENTITY` to sign. Running the app itself requires macOS 14+.

## Architecture

The app follows a pipeline state machine: **idle --> recording --> transcribing --> polishing --> complete**.
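
The state machine above could be modeled along these lines. The state names come from the text; the transition table, the `Pipeline` type, and the `advance(to:)` method are assumptions for illustration, and the real app may permit other transitions (for example, error or cancel paths).

```swift
// Pipeline states as named in the text above.
enum PipelineState: Equatable {
    case idle, recording, transcribing, polishing, complete
}

/// Returns the states reachable from `state` in this sketch.
/// `transcribing` can jump straight to `complete` because polish is optional.
func allowedNext(from state: PipelineState) -> [PipelineState] {
    switch state {
    case .idle: return [.recording]
    case .recording: return [.transcribing]
    case .transcribing: return [.polishing, .complete]
    case .polishing: return [.complete]
    case .complete: return [.idle]
    }
}

// Hypothetical wrapper that only accepts transitions listed above.
struct Pipeline {
    private(set) var state: PipelineState = .idle

    /// Moves to `next` if the transition is allowed; returns whether it happened.
    @discardableResult
    mutating func advance(to next: PipelineState) -> Bool {
        guard allowedNext(from: state).contains(next) else { return false }
        state = next
        return true
    }
}

var pipeline = Pipeline()
pipeline.advance(to: .recording)
pipeline.advance(to: .transcribing)
let skipped = pipeline.advance(to: .idle) // rejected: not a listed transition
print(pipeline.state, skipped)
```

Rejecting unlisted transitions keeps the pipeline in a known state even if a caller requests a step out of order.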

Key design choices:
- **Swift 6 strict concurrency** with full actor isolation
- **Dual pipeline architecture** with deliberately separate Parakeet and WhisperKit backends (isolation is a feature, not tech debt)
- **Heart & Limbs pattern** where the critical path (audio, ASR, paste) never fails, and features (polish, custom words, filler removal) degrade gracefully
- **Local-first** with LLM polish as an opt-in enhancement using your own keys
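
One way to read the Heart & Limbs pattern is that an optional step may fail, but the critical path still returns something to paste. The sketch below shows that idea under stated assumptions: `PolishError`, `polish`, and `textToPaste` are hypothetical stand-ins, not EnviousWispr APIs, and the "polish" here is a toy cleanup rather than an LLM call.

```swift
// Illustrative only: none of these names come from the EnviousWispr codebase.
enum PolishError: Error {
    case providerUnavailable
}

/// A stand-in polish step that fails when no provider is reachable.
func polish(_ text: String, providerAvailable: Bool) throws -> String {
    guard providerAvailable else { throw PolishError.providerUnavailable }
    // Toy cleanup: capitalize the first letter and add a closing period.
    return text.prefix(1).uppercased() + String(text.dropFirst()) + "."
}

/// Critical path: always returns text to paste. If the optional polish
/// step fails, the raw transcript is used unchanged.
func textToPaste(rawTranscript: String, providerAvailable: Bool) -> String {
    do {
        return try polish(rawTranscript, providerAvailable: providerAvailable)
    } catch {
        return rawTranscript
    }
}

print(textToPaste(rawTranscript: "send the report today", providerAvailable: true))
print(textToPaste(rawTranscript: "send the report today", providerAvailable: false))
```

The caller never sees the polish error; it only ever receives either the polished text or the raw transcript.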

## Contributing

Contributions are welcome. Please open an issue to discuss significant changes before submitting a PR.

This project uses conventional commits: `feat(scope):`, `fix(scope):`, `refactor(scope):`.
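
If you want to check a commit subject before pushing, a small validator like the following can help. It is a sketch, not a project tool: the three types come from the line above, while the allowed scope characters and the requirement of a non-empty summary are assumptions.

```swift
import Foundation

// Types come from the README; the scope character set and the required
// "type(scope): summary" shape are assumptions for this sketch.
let commitPattern = #"^(feat|fix|refactor)\([a-z0-9-]+\): .+$"#

/// Returns true when `subject` matches the assumed conventional-commit shape.
func isConventionalCommit(_ subject: String) -> Bool {
    subject.range(of: commitPattern, options: .regularExpression) != nil
}

print(isConventionalCommit("feat(asr): add language lock"))
print(isConventionalCommit("update readme"))
```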

## Privacy

EnviousWispr is built on a simple principle: **your voice is yours.**

- Audio is captured, transcribed, and discarded locally. Nothing is uploaded, stored, or shared.
- LLM polish (if enabled) can run entirely on your Mac with Apple Intelligence or a local Ollama model, so the polish step makes no network call. If you pick a cloud provider (OpenAI or Gemini), only text is sent (your transcript plus the polish instructions) using your own API key. Audio is never sent.
- Anonymous product analytics (PostHog) can be disabled in Settings.
- Crash reporting (Sentry) contains no transcript content, audio, or personal data.

## Connect

- **Website:** [enviouswispr.com](https://enviouswispr.com)
- **X:** [@EnviousLabs](https://x.com/EnviousLabs)
- **Email:** hello@enviouswispr.com

Built by [Envious Labs](https://enviouslabs.co)

## License

EnviousWispr is source-available under the [Business Source License 1.1](LICENSE). You may view, fork, and modify the code for personal, non-commercial use. Commercial use requires a license from Envious Labs. The code converts to Apache 2.0 on March 10, 2030.

For commercial licensing inquiries: hello@enviouswispr.com
