https://github.com/cursorvoice/cursor-voice
Talk to your Mac — a native macOS voice assistant that sees your screen, drives your Mac, and answers back. Powered by the OpenAI Realtime API. Open source.
accessibility ai macos macos-app menubar-app openai realtime-api swift voice-assistant voice-control
- Host: GitHub
- URL: https://github.com/cursorvoice/cursor-voice
- Owner: cursorvoice
- License: mit
- Created: 2026-06-01T16:11:16.000Z (23 days ago)
- Default Branch: main
- Last Pushed: 2026-06-10T13:13:12.000Z (15 days ago)
- Last Synced: 2026-06-10T13:22:06.534Z (15 days ago)
- Topics: accessibility, ai, macos, macos-app, menubar-app, openai, realtime-api, swift, voice-assistant, voice-control
- Language: Swift
- Homepage: https://cursorvoice.app
- Size: 1.84 MB
- Stars: 12
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
Awesome Lists containing this project
- awesome-swift-macos-apps - Cursor Voice - Push-to-talk voice assistant that can see your screen, control apps, and respond in real time. (Audio)
README
# Cursor Voice
**→ [cursorvoice.app](https://cursorvoice.app)**
A native macOS voice assistant that lives next to your cursor. Press a hotkey, talk to it, and it sees your screen, drives your Mac, and answers back — powered by the OpenAI Realtime API.
```
cursor
┃
▼ ← ⌃⌥/ summon
◯ listening…
◯ "search youtube for lo-fi beats"
◯ opening a link…
```
## What it does
- **Voice-in / voice-out** via `gpt-realtime` (configurable per-session)
- **Sees your screen** — captures the display via ScreenCaptureKit so it can answer about what's in front of you
- **Drives the Mac** — synthesizes mouse and keyboard input via CGEvent; clicks UI elements by name via the Accessibility tree; runs AppleScript and shell commands
- **Web access** — `web_search` (no API key) + `fetch_url` for live information
- **Persistent memory** — remembers facts across sessions
- **Wake word** — opt-in, on-device, listens for "Hey Cursor" via SFSpeechRecognizer
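The capabilities above reach the model as named tools (`web_search`, `fetch_url`, `see_screen`, and `click_element` are the names this README mentions). As a rough sketch of how such calls could be routed, assuming a function-call item that carries `name`, `call_id`, and a JSON string of `arguments`: the `ToolCall`, `ToolDispatcher`, and handler types below are hypothetical and are not taken from `ToolHandler.swift`.

```swift
import Foundation

// Assumed shape of a function-call item: a tool name, a call id, and the
// arguments serialized as a JSON string. Extra keys are ignored on decode.
struct ToolCall: Decodable {
    let name: String
    let callID: String
    let arguments: String

    enum CodingKeys: String, CodingKey {
        case name, arguments
        case callID = "call_id"
    }
}

// Hypothetical handler signature: string arguments in, text result out.
typealias ToolHandler = ([String: String]) -> String

struct ToolDispatcher {
    var handlers: [String: ToolHandler] = [:]

    mutating func register(_ name: String, handler: @escaping ToolHandler) {
        handlers[name] = handler
    }

    /// Decodes one function-call item and runs the matching handler.
    /// Unknown tools produce an error string instead of throwing, so the
    /// model can be told what went wrong.
    func dispatch(_ itemJSON: Data) throws -> (callID: String, output: String) {
        let call = try JSONDecoder().decode(ToolCall.self, from: itemJSON)
        let args = (try? JSONDecoder().decode([String: String].self,
                                              from: Data(call.arguments.utf8))) ?? [:]
        guard let handler = handlers[call.name] else {
            return (call.callID, "error: unknown tool \(call.name)")
        }
        return (call.callID, handler(args))
    }
}
```

A real handler would call into the capability code (screen capture, input synthesis, and so on); returning a plain string keeps the sketch independent of macOS frameworks.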
## Install
### One-line installer (curl)
```bash
curl -fsSL https://raw.githubusercontent.com/cursorvoice/cursor-voice/main/install.sh | bash
```
Downloads the latest release, copies the app into `/Applications`, strips the quarantine attribute, and launches it. Done.
### Homebrew
```bash
# Install
brew tap cursorvoice/cursor-voice
brew install --cask cursor-voice
# Update to the latest release
brew upgrade --cask cursor-voice
```
The cask's `postflight` strips quarantine automatically — no right-click-Open dance.
### Manual
1. Download the DMG from the [latest release](https://github.com/cursorvoice/cursor-voice/releases/latest).
2. Open it, drag **Cursor Voice** into Applications.
3. **First launch**: macOS Gatekeeper will refuse to open it, because the app is ad-hoc signed and not notarized (no paid Developer ID). Fix with one of:
   - **Right-click the app → Open** → confirm in the dialog.
   - Or: `xattr -dr com.apple.quarantine /Applications/CursorVoice.app && open /Applications/CursorVoice.app`
You'll see a small aurora orb appear in the menu bar.
## Setup
1. Click the menu bar orb → **Settings…**
2. Paste your **OpenAI API key** (stored in your Keychain).
3. Pick a **hotkey** (default `⌃⌥/`) and optionally enable the wake word.
4. macOS will prompt for **Microphone**, **Speech Recognition**, **Screen Recording**, and **Accessibility** permissions the first time you press the hotkey. Grant all four; each one unlocks one capability.
> **Important**: macOS only honors Screen Recording and Accessibility on a fresh process launch, so quit and reopen the app after granting them.
## Use
- **Press the hotkey** anywhere — the orb materializes at your cursor.
- **Speak** — it streams audio to the realtime model and speaks back.
- **Interrupt** by talking over it; it stops mid-sentence cleanly.
- **Click outside the orb / press Esc** to dismiss.
- **Wake word** (opt-in): say *"Hey Cursor"* anywhere.
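Interrupting by talking over the assistant is commonly called barge-in. The Architecture section names the two Realtime client events involved, `response.cancel` and `conversation.item.truncate`. Below is a minimal sketch of building those two messages, assuming the client tracks how many samples of the current reply it has already played; the type names and the `bargeInMessages` helper are hypothetical and not taken from the Cursor Voice sources.

```swift
import Foundation

// Stops the response that is currently being generated.
struct ResponseCancel: Encodable {
    let type = "response.cancel"
}

// Tells the server how much of the assistant audio was actually heard,
// so the conversation history matches what the user listened to.
struct ItemTruncate: Encodable {
    let type = "conversation.item.truncate"
    let itemID: String
    let contentIndex: Int
    let audioEndMs: Int

    enum CodingKeys: String, CodingKey {
        case type
        case itemID = "item_id"
        case contentIndex = "content_index"
        case audioEndMs = "audio_end_ms"
    }
}

/// Returns the JSON messages to send, in order, when the user barges in.
/// `playedSamples` is the number of samples of the reply already played
/// (assumed to be tracked by the playback code); 24 kHz matches the audio
/// format listed under Architecture.
func bargeInMessages(itemID: String,
                     playedSamples: Int,
                     sampleRate: Int = 24_000) throws -> [String] {
    let playedMs = playedSamples * 1000 / sampleRate
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]  // stable key order for logging
    let cancel = try encoder.encode(ResponseCancel())
    let truncate = try encoder.encode(
        ItemTruncate(itemID: itemID, contentIndex: 0, audioEndMs: playedMs))
    return [cancel, truncate].map { String(decoding: $0, as: UTF8.self) }
}
```

In this sketch the caller would also stop local playback immediately; the two messages only bring the server's view of the conversation in line with what was heard.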
Example commands:
- "What's on my screen?"
- "Search YouTube for lo-fi beats"
- "Open the Downloads folder"
- "Play Bohemian Rhapsody in Apple Music"
- "Click the Save button"
- "Remember that my main project lives in `~/Code/foo`"
## Models
Pick a Realtime model in **Settings → Advanced**:
| Model | Notes |
| ------------------------ | --------------------------------------- |
| `gpt-realtime` | Default GA model |
| `gpt-realtime-2` | Reasoning, slower, most capable |
| `gpt-realtime-1.5` | Best voice quality |
| `gpt-realtime-mini` | Cheap & fast |
| `gpt-realtime-translate` | Real-time speech-to-speech translation |
Changes apply to the open session immediately (it reconnects).
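Because a model change takes effect through a reconnect, one plausible reading is that the model is fixed when the WebSocket is opened. A minimal sketch under that assumption, using the endpoint listed under Architecture and a `model` query parameter; the `RealtimeModel` enum, its case names, and both helper functions are hypothetical.

```swift
import Foundation

// Model identifiers copied from the table above; the case names are made up.
enum RealtimeModel: String, CaseIterable {
    case standard = "gpt-realtime"
    case reasoning = "gpt-realtime-2"
    case voice = "gpt-realtime-1.5"
    case mini = "gpt-realtime-mini"
    case translate = "gpt-realtime-translate"
}

/// Builds the WebSocket URL for a model.
/// Assumption: the model is chosen with a `model` query parameter.
func realtimeURL(for model: RealtimeModel) -> URL {
    var components = URLComponents(string: "wss://api.openai.com/v1/realtime")!
    components.queryItems = [URLQueryItem(name: "model", value: model.rawValue)]
    return components.url!
}

/// A settings change needs a reconnect only when the model actually differs
/// from the one the open session was started with.
func needsReconnect(current: RealtimeModel, selected: RealtimeModel) -> Bool {
    current != selected
}
```

The resulting URL would be handed to `URLSession.webSocketTask(with:)` together with an `Authorization` header carrying the API key.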
## Permissions
Cursor Voice asks for, in order:
1. **Microphone** — to capture your voice
2. **Speech Recognition** — only if wake word is enabled (on-device)
3. **Screen Recording** — for `see_screen` and the auto-attached screenshots after each action
4. **Accessibility** — for `click_element` (AX-tree clicking) and mouse/keyboard synthesis
You can see live status in **Settings → Permissions** with deep-link buttons to the relevant System Settings pane.
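The deep-link buttons could be backed by a mapping like the one sketched here. The `x-apple.systempreferences:` URLs and their `Privacy_*` anchors are a widely used convention rather than documented API, so treat them as assumptions that may change between macOS releases; the `Permission` enum itself is hypothetical.

```swift
import Foundation

// One case per permission listed above. The raw value is the assumed
// anchor of the matching Privacy & Security pane.
enum Permission: String, CaseIterable {
    case microphone = "Privacy_Microphone"
    case speechRecognition = "Privacy_SpeechRecognition"
    case screenRecording = "Privacy_ScreenCapture"
    case accessibility = "Privacy_Accessibility"

    /// URL that is expected to open System Settings at the right pane.
    var settingsURL: URL {
        URL(string: "x-apple.systempreferences:com.apple.preference.security?\(rawValue)")!
    }
}

// Print every deep link in the order the permissions are requested.
for permission in Permission.allCases {
    print(permission, permission.settingsURL.absoluteString)
}
```

On macOS a button action would pass `settingsURL` to `NSWorkspace.shared.open(_:)`.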
## Privacy
- API key lives only in your local Keychain.
- Audio is streamed to OpenAI's Realtime API while the orb is active. No audio leaves your Mac when the orb is dismissed.
- Wake word listening is **on-device** — audio is not transmitted unless the phrase matches.
- Memory is stored locally at `~/Library/Application Support/CursorVoice/memory.json`.
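The README gives the location of the memory file but not its format. As an illustration only, here is a small file-backed store that assumes a JSON array of entries; `MemoryEntry`, its fields, and this `MemoryStore` type are hypothetical and may not match the real `MemoryStore` in `Capabilities/`.

```swift
import Foundation

// Hypothetical schema: one remembered fact plus the time it was saved.
struct MemoryEntry: Codable {
    let fact: String
    let savedAt: Date
}

struct MemoryStore {
    let fileURL: URL

    /// Location matching the path given in the Privacy section.
    static func defaultURL() -> URL {
        FileManager.default.homeDirectoryForCurrentUser
            .appendingPathComponent("Library/Application Support/CursorVoice/memory.json")
    }

    /// Returns all stored entries, or an empty list if the file does not exist yet.
    func load() throws -> [MemoryEntry] {
        guard FileManager.default.fileExists(atPath: fileURL.path) else { return [] }
        let data = try Data(contentsOf: fileURL)
        return try JSONDecoder().decode([MemoryEntry].self, from: data)
    }

    /// Appends a fact and rewrites the file atomically, so a crash in the
    /// middle of a write cannot leave a half-written file behind.
    func remember(_ fact: String) throws {
        var entries = try load()
        entries.append(MemoryEntry(fact: fact, savedAt: Date()))
        try FileManager.default.createDirectory(
            at: fileURL.deletingLastPathComponent(),
            withIntermediateDirectories: true)
        let data = try JSONEncoder().encode(entries)
        try data.write(to: fileURL, options: .atomic)
    }
}
```

Usage would look like `try MemoryStore(fileURL: MemoryStore.defaultURL()).remember("main project lives in ~/Code/foo")`. Everything stays on disk locally, in line with the privacy notes above.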
## Build from source
Requirements: macOS 14+, Command Line Tools (`xcode-select --install`).
```bash
git clone https://github.com/cursorvoice/cursor-voice.git
cd cursor-voice
./scripts/build.sh
./scripts/dmg.sh
open ./build/CursorVoice.app
```
`build.sh` compiles with `swiftc`, assembles the bundle, generates the `.icns`, writes `Info.plist`, and ad-hoc-signs with the hardened runtime. `dmg.sh` packs the bundle into a drag-to-Applications DMG.
There's no Xcode project required — the codebase is plain Swift sources organised under `Sources/CursorVoice/`.
## Architecture
- `App.swift` / `AppCoordinator.swift` — entry point, lifecycle, orchestrates everything
- `MenuBarExtra` — SwiftUI menu bar item with `SettingsLink`
- `Orb/` — borderless `NSPanel` floating at the cursor; the aurora SwiftUI view with reveal/breath/audio-reactive animations; cursor halo overlay
- `Realtime/RealtimeClient.swift` — `URLSessionWebSocketTask` against `wss://api.openai.com/v1/realtime`; barge-in interruption with `response.cancel` + `conversation.item.truncate`
- `Realtime/AudioEngine.swift` — `AVAudioEngine` capture at 24kHz PCM16, playback via `AVAudioPlayerNode`
- `Realtime/ToolHandler.swift` — dispatch for the tool calls
- `Capabilities/` — `ScreenCapture` (ScreenCaptureKit), `InputSynth` (CGEvent mouse/keyboard), `AXTree` (Accessibility tree introspection), `WebSearch`, `AppleScriptRunner`, `ShellRunner`, `MemoryStore`
- `Hotkey/` — Carbon `RegisterEventHotKey`
- `WakeWord/` — `SFSpeechRecognizer` continuous recognition
- `Settings/` — SwiftUI Settings scene + Keychain
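To make the audio path concrete: `AVAudioEngine` typically delivers floating-point samples, while the capture format listed above is PCM16. A minimal sketch of that conversion, plus wrapping a chunk in the Realtime API's `input_audio_buffer.append` event, follows. It assumes mono audio that has already been resampled to 24 kHz; the function names are hypothetical and not taken from `AudioEngine.swift`.

```swift
import Foundation

/// Converts float samples in [-1, 1] to little-endian 16-bit PCM bytes.
func pcm16Data(from samples: [Float]) -> Data {
    var data = Data(capacity: samples.count * MemoryLayout<Int16>.size)
    for sample in samples {
        // Treat NaN as silence and clamp out-of-range values so the
        // Int16 conversion below cannot trap.
        let safe: Float = sample.isNaN ? 0 : max(-1, min(1, sample))
        let value = Int16((safe * Float(Int16.max)).rounded())
        withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
    }
    return data
}

/// Duration in milliseconds of a mono PCM16 buffer at the given sample rate.
func durationMs(ofPCM16 data: Data, sampleRate: Int = 24_000) -> Int {
    (data.count / MemoryLayout<Int16>.size) * 1000 / sampleRate
}

/// Wraps a chunk of samples in an `input_audio_buffer.append` event,
/// with the PCM16 bytes base64-encoded in the `audio` field.
func appendEvent(for samples: [Float]) throws -> Data {
    let event = [
        "type": "input_audio_buffer.append",
        "audio": pcm16Data(from: samples).base64EncodedString(),
    ]
    return try JSONEncoder().encode(event)
}
```

Scaling by `Int16.max` maps both `1.0` and `-1.0` to symmetric values (32767 and -32767), which avoids a special case for the single extra negative code in two's complement.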
## Caveats
- **App Sandbox is off.** The shell + AppleScript tools and CGEvent posting need this. If you re-enable the sandbox, drop those capabilities.
- **Ad-hoc signature.** No paid Developer ID, so no notarization. Gatekeeper will block on first launch — see install instructions above.
- **Apple Silicon only.** Built for `arm64-apple-macos14.0`.
## Support
Cursor Voice is free and open source.
- **Questions or bug reports:** open a [GitHub issue](https://github.com/cursorvoice/cursor-voice/issues) or email **support@cursorvoice.app**.
- **Sponsorship or partnership:** **sponsor@cursorvoice.app** (or the **Sponsor** button at the top of this repo). Keeping the project online costs a little each year — any support is genuinely appreciated.
Thank you 💜
## License
MIT.