https://github.com/v2matosevic/talkty-mac
Local, private speech-to-text for macOS — Apple Silicon + Metal, menu-bar app. Press a hotkey, speak, get text typed at your cursor.
apple-silicon dictation macos menu-bar metal on-device privacy speech-to-text swift whisper
- Host: GitHub
- URL: https://github.com/v2matosevic/talkty-mac
- Owner: v2matosevic
- License: mit
- Created: 2026-06-03T12:26:11.000Z (29 days ago)
- Default Branch: main
- Last Pushed: 2026-06-11T22:30:48.000Z (21 days ago)
- Last Synced: 2026-06-12T00:10:26.842Z (20 days ago)
- Topics: apple-silicon, dictation, macos, menu-bar, metal, on-device, privacy, speech-to-text, swift, whisper
- Language: Swift
- Size: 1.84 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Talkty for macOS
**Local speech-to-text for macOS, powered by Whisper.** Native Apple Silicon
rebuild of the original Windows app — Metal-accelerated, menu-bar resident, fully
on-device. No internet, no telemetry, audio never leaves your Mac.
Press a global hotkey, speak, get text on your clipboard (and optionally
auto-pasted at the cursor).
## Features
- **100% local** — whisper.cpp with Metal on Apple Silicon; nothing leaves the device
- **Fast** — real-time factor (RTF, processing time ÷ audio duration) of ~0.01–0.1 on an M-series GPU (an 11 s clip transcribes in ~0.1 s)
- **Menu-bar app** — press the hotkey from anywhere; a floating pill shows the live waveform
- **Type at the cursor** — optionally inserts text right where you're typing
(clipboard-safe keystroke injection) — ideal for editors and terminals
- **Recording volume ducking** — fades background audio down while you speak so the
mic captures you cleanly, then fades it back
- **Quiet-mic friendly** — boost-only auto-gain levels each take before transcription,
and a mic input volume slider lives right in Settings
- **Smart post-processing** — re-joins false sentence breaks, strips Whisper
hallucinations (“Thanks for watching”, `[MUSIC]`…), applies a custom coding vocabulary
- **In-app model manager** — download Whisper models (Fast / Balanced / Accurate) with resume
- **Configurable** — global hotkey, model, microphone, language, vocabulary, ducking, launch-at-login
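As a rough illustration of the hallucination-stripping step listed above, the following shell sketch removes bracketed non-speech tags and one known filler phrase from a transcript. The function name `strip_hallucinations` and the patterns are assumptions made for this example; Talkty's real rules live in its Swift sources and are likely broader.

```bash
#!/usr/bin/env bash
# Illustrative only: the patterns and the function name are hypothetical,
# not Talkty's actual post-processing rules.

# Reads a transcript on stdin and writes the cleaned text to stdout.
# Steps: drop [TAGS], drop one filler phrase, collapse whitespace, trim ends.
strip_hallucinations() {
  sed -E \
    -e 's/\[[A-Z_ ]+\]//g' \
    -e 's/Thanks for watching[.!]?//g' \
    -e 's/[[:space:]]+/ /g' \
    -e 's/^ //' \
    -e 's/ $//'
}

# prints: open the config file.
printf '%s\n' '[MUSIC] open the config file. Thanks for watching!' | strip_hallucinations
```

The sketch covers only the stripping step; re-joining false sentence breaks and applying a custom vocabulary would need separate rules.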
## Requirements
- Apple Silicon Mac (M1 or later), macOS 14+
- ~150 MB of disk space for the app, plus the model you choose (75 MB – 3.1 GB)
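The hardware and OS requirements can be checked from a terminal before installing. The sketch below is a hypothetical helper, not part of the Talkty repository. It takes the architecture and macOS version as arguments so the logic can be tried with fixed values.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helper; not part of the Talkty repository.

# Succeeds when the architecture is arm64 and the macOS major version is 14 or later.
# $1: output of `uname -m`, $2: output of `sw_vers -productVersion` (e.g. 14.5).
meets_requirements() {
  local arch="$1" version="$2"
  local major="${version%%.*}"   # keep the text before the first dot
  [ "$arch" = "arm64" ] && [ "$major" -ge 14 ]
}

# Example with fixed inputs; on a Mac, pass "$(uname -m)" and "$(sw_vers -productVersion)".
if meets_requirements arm64 14.5; then
  echo "supported"
else
  echo "unsupported"
fi
```

A shell running under Rosetta reports `x86_64` from `uname -m`, so a check like this should run in a native terminal.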
## Install
1. Download `Talkty.dmg` from the [latest release](https://github.com/v2matosevic/talkty-mac/releases/latest).
2. Open it and drag **Talkty** into **Applications**.
3. **First launch:** the app is self-signed (not yet notarized), so Gatekeeper
will block it the first time. Either **right-click → Open** and confirm, or run:
```bash
xattr -dr com.apple.quarantine /Applications/Talkty.app
```
4. Look for the **mic glyph in the menu bar**. Grant **Microphone** when prompted;
grant **Accessibility** only if you turn on type-at-cursor. The global hotkey
itself needs no special permission.
> Notarized builds (no Gatekeeper prompt) and a Homebrew cask are on the roadmap.
## Build from source
```bash
brew install cmake # one-time build dependency
Scripts/bootstrap.sh # vendor whisper.cpp (pinned) + build Metal static libs
Scripts/make_app.sh release # build + assemble + sign dist/Talkty.app
open dist/Talkty.app
```
For stable Microphone/Accessibility grants across rebuilds, run
`Scripts/dev_identity.sh` once (creates a local self-signed code-signing identity);
`make_app.sh` then signs with it automatically. `make_app.sh release --install`
also copies the build into `/Applications`. Open `Package.swift` in Xcode to edit.
## Models
| Tier | Model | Size | Notes |
|------|-------|------|-------|
| Fast | tiny.en / base.en | 75 / 142 MB | English, quick notes |
| Balanced | small.en | 466 MB | English everyday |
| Balanced | **large-v3-turbo** ★ | 1.6 GB | 99+ languages, the all-round pick |
| Accurate | medium.en | 1.5 GB | English, high accuracy |
| Accurate | large-v3 | 3.1 GB | 99+ languages, highest accuracy |
Models download from HuggingFace into `~/Library/Application Support/Talkty/Models/`.
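To estimate the total disk footprint for a given model, the sizes in the table can be added to the roughly 150 MB listed under Requirements. This is a minimal sketch: the helper names `model_mb` and `footprint_mb` are made up for illustration, and the GB figures are rounded to 1000 MB per GB.

```bash
#!/usr/bin/env bash
# Sizes are copied from the Models table (GB rounded to 1000 MB per GB).
# Helper names are hypothetical and not part of Talkty.

APP_MB=150   # approximate base footprint from Requirements

# Prints the approximate model size in MB, or fails for an unknown model name.
model_mb() {
  case "$1" in
    tiny.en)        echo 75 ;;
    base.en)        echo 142 ;;
    small.en)       echo 466 ;;
    medium.en)      echo 1500 ;;
    large-v3-turbo) echo 1600 ;;
    large-v3)       echo 3100 ;;
    *)              return 1 ;;
  esac
}

# Prints the base footprint plus the model size, in MB.
footprint_mb() {
  local size
  size="$(model_mb "$1")" || return 1
  echo $(( APP_MB + size ))
}

footprint_mb large-v3-turbo   # prints 1750
```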
### Optional: Neural Engine acceleration
Talkty is built with Core ML, so it can run Whisper's **encoder on the Apple Neural
Engine** (lower power than Metal-only). It's opt-in per model — generate the encoder once:
```bash
Scripts/make_coreml.sh base.en # or large-v3-turbo, large-v3, …
```
That pulls torch/coremltools via `uv` (no Xcode needed) and installs
`ggml-<model>-encoder.mlmodelc` next to the model. whisper.cpp picks it up automatically and
falls back to Metal when it's absent. The decoder always runs on Metal.
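The fallback described above can be pictured as a simple existence check. The sketch below is illustrative and is not how whisper.cpp implements it internally. The file name pattern `ggml-<model>-encoder.mlmodelc` is assumed from whisper.cpp's naming convention, and the function name `encoder_backend` is hypothetical.

```bash
#!/usr/bin/env bash
# Illustrative only: mirrors the described fallback, not whisper.cpp's actual code.
# The ggml-<model>-encoder.mlmodelc name is an assumption based on whisper.cpp's convention.

# $1: models directory, $2: model name (e.g. base.en).
# Prints "coreml" when a compiled encoder sits next to the model, else "metal".
encoder_backend() {
  local dir="$1" model="$2"
  if [ -e "${dir}/ggml-${model}-encoder.mlmodelc" ]; then
    echo "coreml"
  else
    echo "metal"
  fi
}

# Check the default Talkty models directory for a base.en encoder.
encoder_backend "$HOME/Library/Application Support/Talkty/Models" base.en
```

Running this before and after `Scripts/make_coreml.sh base.en` should show whether an encoder was installed for that model.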
## Architecture
Swift 6 + SwiftUI over AppKit, whisper.cpp (Metal, embedded shader) as the only
engine. See `CLAUDE.md` for the developer guide, the subsystem map, and the
hard-won gotchas (clean-`_exit` teardown, first-load Metal compile, link order).
```
Sources/CWhisper     C bridge to whisper.h/ggml.h
Sources/TalktyKit    UI-agnostic logic (engine, audio, post-processing, services, models)
Sources/Talkty       The app (AppKit/SwiftUI shell, overlay, windows, state machine)
Tests/TalktyTests    Zero-dependency test harness (runs under Command Line Tools)
Scripts/             bootstrap, build_whisper, make_app, make_dmg, make_icon, dev_identity
```
## Contributing
Issues and PRs welcome. Run the tests with `.build/debug/TalktyTests` after
`swift build` (no Xcode required; Command Line Tools plus `cmake` are enough).
Keep `TalktyKit` UI-agnostic; the app layer owns all windows. See `CLAUDE.md`.
## License
[MIT](LICENSE) © 2026 Marko Matošević