https://github.com/arach/vox
Local-first macOS transcription runtime with Swift services, a Bun CLI, and a TypeScript SDK.
bun developer-tools local-first macos speech-to-text swift transcription typescript
Last synced: 14 days ago
- Host: GitHub
- URL: https://github.com/arach/vox
- Owner: arach
- Created: 2026-03-17T02:22:00.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-06-03T19:17:09.000Z (16 days ago)
- Last Synced: 2026-06-03T20:10:18.117Z (16 days ago)
- Topics: bun, developer-tools, local-first, macos, speech-to-text, swift, transcription, typescript
- Language: Swift
- Homepage: https://voxd.cc/
- Size: 36 MB
- Stars: 5
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Agents: AGENTS.md
README
# Vox
Vox is a local-first voice stack for Apple platforms. This repo brings together the Hudson-powered menu bar app, the Swift runtime, the companion daemon, the TypeScript clients, and the CLI.
- `VoxCore`, `VoxEngine`, `VoxService`, and `VoxBridge`: embeddable Swift packages for macOS and iOS apps.
- `voxd`: Vox Companion, the Swift daemon for web-facing, shared-process, and operator integrations.
- `@voxd/sdk`: TypeScript SDK. Typed JSON-RPC client for Vox Companion integrations.
- `@voxd/client`: Browser SDK. HTTP bridge client for local web integrations.
- `vox`: Node CLI. Health checks, benchmarks, warm-up scheduling, dashboards.
- `Vox.app`: Hudson-based macOS menu bar app, packaged as a signed and notarized DMG by release automation.
Apple apps can embed Vox directly. Bun and Node tools can connect to `voxd` over local WebSocket JSON-RPC. Browser clients can connect through the companion HTTP bridge with `@voxd/client`.
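For readers unfamiliar with the transport, the sketch below shows what a JSON-RPC 2.0 request and response look like on the wire. It follows the public JSON-RPC 2.0 envelope only; the method name `example.transcribeFile` and its params are hypothetical placeholders, not methods that `voxd` is known to expose, and `@voxd/sdk` may well handle this framing for you.

```typescript
// JSON-RPC 2.0 request/response shapes, per the JSON-RPC 2.0 specification.
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
};

type JsonRpcResponse =
  | { jsonrpc: "2.0"; id: number; result: unknown }
  | { jsonrpc: "2.0"; id: number; error: { code: number; message: string } };

let nextId = 1;

// Build a request envelope with a fresh numeric id.
// The caller supplies the method name; this sketch does not know voxd's methods.
function buildRequest(method: string, params?: unknown): JsonRpcRequest {
  const envelope: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method };
  if (params !== undefined) envelope.params = params;
  return envelope;
}

// Unwrap a raw response string: return `result`, or throw on an `error` object.
function unwrapResponse(raw: string): unknown {
  const response = JSON.parse(raw) as JsonRpcResponse;
  if ("error" in response) {
    throw new Error(`JSON-RPC error ${response.error.code}: ${response.error.message}`);
  }
  return response.result;
}

// Hypothetical method name and params, for illustration only.
const exampleRequest = buildRequest("example.transcribeFile", { path: "/path/to/audio.wav" });
console.log(JSON.stringify(exampleRequest));
```

In a real client, the serialized envelope would be sent over the local WebSocket and the reply passed to `unwrapResponse`, matching replies to requests by `id`.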
## Pick a surface
- Build a native Apple app: start with `VoxCore`, `VoxEngine`, `VoxService`, and `VoxBridge`.
- Build a local tool or companion-connected service: start with `voxd` plus `@voxd/sdk`.
- Build a browser integration: start with `@voxd/client` plus the local bridge path.
- Operate or benchmark Vox itself: start with the `vox` CLI.
## Get Going Fast
Requirements:
- macOS 14+ for the Swift transcription packages; macOS 26+ for the Hudson menu app and demo flows that use Apple Intelligence
- Bun 1.2+
- Node 22+
- Swift 6.2+
- `uv` for the MLX-backed demo paths
### a. dev
To work on Vox:
```bash
git clone https://github.com/arach/vox.git
cd vox
bun install
bun run build
bun run test
```
Useful loops:
```bash
bun run site:dev
bun run docs:build
swift build --package-path swift
```
### b. client
To use Vox Companion locally from the CLI or SDK:
```bash
git clone https://github.com/arach/vox.git
cd vox
bun install
bun run build
```
Start the daemon, verify health, then transcribe:
```bash
node packages/cli/dist/index.js daemon start
node packages/cli/dist/index.js doctor
node packages/cli/dist/index.js models preload parakeet:v3
node packages/cli/dist/index.js transcribe file /path/to/audio.wav
```
If you are writing a local client, start here:
- `packages/client/` for the TypeScript SDK (`@voxd/sdk`)
- `packages/web-client/` for the browser SDK (`@voxd/client`)
- `swift/` for direct Apple embed mode
### c. demo
To run the standalone macOS demo app:
```bash
git clone https://github.com/arach/vox.git
cd vox
bun install
swift run --package-path examples/macos-minimal VoxMinimalExample
```
What the demo does:
- records locally from the mic
- transcribes with Parakeet
- replies with Apple Intelligence when available
- falls back to local Qwen 0.6B when Apple Intelligence is not ready
- speaks back with Kokoro through the MLX audio provider
Notes:
- The first run may download Parakeet, Kokoro, or the Qwen fallback model.
- The app will ask for microphone access.
- Apple Intelligence is optional for the demo, not required.
The current standalone demo lives in `examples/macos-minimal/` and is a good reference app for direct embed mode.
## Layout
- `swift/` contains `VoxCore`, `VoxEngine`, `VoxService`, and the `voxd` daemon.
- `packages/client` contains the TypeScript SDK for talking to `voxd` over local WebSocket JSON-RPC.
- `packages/web-client` contains the browser SDK for talking to the local HTTP bridge.
- `packages/cli` contains the `vox` CLI.
- `docs/` contains Dewey source docs.
- `site/` contains the marketing site, docs route, and OG generation.
## Commands
```bash
bun install
bun run dev
bun run build
bun run build:all
bun run test
bun run test:e2e
bun run site:build
bun run site:og
bun run docs:generate
```
## Telemetry
Each transcription or synthesis request appends a tagged sample to `~/.vox/performance.jsonl` with `clientId`, `route`, `modelId`, and `voiceId` when applicable.
This lets you answer a few practical questions: whether the hot model is fast, which integration is regressing, and whether latency comes from inference, audio prep, or cold runtime work.
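As a rough illustration of how those samples could be analyzed, the sketch below parses JSONL text and reports the mean latency per tag. The `clientId`, `route`, `modelId`, and `voiceId` fields come from the description above; the `durationMs` field name, the route values, and the numbers in the sample data are assumptions made up for the example, so check a real line of `performance.jsonl` before relying on them.

```typescript
// Tag fields named in this README; `durationMs` is a hypothetical latency field.
type PerfSample = {
  clientId: string;
  route: string;
  modelId: string;
  voiceId?: string;
  durationMs: number; // assumption: the real field name may differ
};

// Parse JSONL text: one JSON object per non-empty line.
function parseSamples(jsonl: string): PerfSample[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as PerfSample);
}

// Group samples by one tag and report the count and mean latency per group.
function meanLatencyBy(
  samples: PerfSample[],
  key: "clientId" | "route" | "modelId",
): Map<string, { count: number; meanMs: number }> {
  const totals = new Map<string, { count: number; sumMs: number }>();
  for (const sample of samples) {
    const entry = totals.get(sample[key]) ?? { count: 0, sumMs: 0 };
    entry.count += 1;
    entry.sumMs += sample.durationMs;
    totals.set(sample[key], entry);
  }
  const result = new Map<string, { count: number; meanMs: number }>();
  totals.forEach(({ count, sumMs }, tag) => {
    result.set(tag, { count, meanMs: sumMs / count });
  });
  return result;
}

// Made-up sample lines, for illustration only.
const sampleText = [
  '{"clientId":"cli","route":"transcribe.file","modelId":"parakeet:v3","durationMs":120}',
  '{"clientId":"cli","route":"transcribe.file","modelId":"parakeet:v3","durationMs":80}',
  '{"clientId":"web","route":"transcribe.live","modelId":"parakeet:v3","durationMs":200}',
].join("\n");

console.log(meanLatencyBy(parseSamples(sampleText), "clientId"));
```

Grouping by `clientId` points at a regressing integration, while grouping by `modelId` or `route` helps separate model cost from transport or preparation cost.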
### CLI
```bash
vox daemon start
vox doctor
vox models list
vox models install
vox warmup start
vox warmup schedule 500
vox logs daemon --tail 80
vox transcribe status
vox transcribe cancel
vox transcribe file --timestamps /path/to/audio.wav
vox transcribe bench /path/to/audio.wav 5
vox perf dashboard
vox transcribe live --timestamps
```
## Companion Runtime
- Runtime discovery: `~/.vox/runtime.json`
- Latency samples: `~/.vox/performance.jsonl`
- Daemon logs: `~/.vox/logs/voxd.log` (written even when `voxd` is auto-started by the CLI)
`voxd` is Vox Companion: the shared-process and web-facing transport for Vox.
`bun run test:e2e` is an opt-in macOS integration suite that boots `voxd`, preloads the model, synthesizes speech with `say`, and checks `transcribe file` output against keyword expectations.
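A keyword expectation check of this kind can be sketched in a few lines. This is not the suite's actual code: the function name `missingKeywords` and the matching rule (case-insensitive, punctuation ignored, whole words only) are assumptions chosen for the example.

```typescript
// Return the expected keywords that do not appear in the transcript.
// Assumed matching rule: case-insensitive, punctuation ignored, whole words only.
function missingKeywords(transcript: string, keywords: string[]): string[] {
  const normalized = transcript.toLowerCase().replace(/[^a-z0-9\s]/g, " ");
  const words = new Set(normalized.split(/\s+/).filter((word) => word.length > 0));
  return keywords.filter((keyword) => !words.has(keyword.toLowerCase()));
}

// Made-up transcript, for illustration only.
const exampleTranscript = "The quick brown fox jumps over the lazy dog.";
console.log(missingKeywords(exampleTranscript, ["quick", "Fox", "cat"])); // ["cat"]
```

Checking keywords rather than exact strings keeps such a test tolerant of small differences in punctuation, casing, and filler words between model versions.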
Speech synthesis supports Apple system voices locally and OpenAI TTS (`gpt-4o-mini-tts`, `tts-1`, and `tts-1-hd`) when `OPENAI_API_KEY` is configured. The packaged app bundles `voxd` and `voxttsd` so the menu bar app and CLI share the same companion runtime.
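The provider choice described above can be pictured as a small routing function. The sketch below is an assumption about the policy, not the behavior of `voxd` or `voxttsd`: the names `chooseTtsRoute` and `requestedModel` are hypothetical, and falling back to Apple system voices when no key or no matching model is given is a guess at a sensible default.

```typescript
// OpenAI TTS model ids listed in this README.
const OPENAI_TTS_MODELS = ["gpt-4o-mini-tts", "tts-1", "tts-1-hd"] as const;
type OpenAiTtsModel = (typeof OPENAI_TTS_MODELS)[number];

type TtsRoute =
  | { provider: "apple" }
  | { provider: "openai"; model: OpenAiTtsModel };

// Pick a synthesis route from the environment and an optional requested model.
// Assumed policy: use OpenAI only when a key is set AND a listed model is requested;
// otherwise stay local with Apple system voices.
function chooseTtsRoute(
  env: Record<string, string | undefined>,
  requestedModel?: string,
): TtsRoute {
  const hasKey = (env["OPENAI_API_KEY"] ?? "").length > 0;
  const model = OPENAI_TTS_MODELS.find((candidate) => candidate === requestedModel);
  if (hasKey && model !== undefined) {
    return { provider: "openai", model };
  }
  return { provider: "apple" };
}

console.log(chooseTtsRoute({ OPENAI_API_KEY: "placeholder-key" }, "tts-1"));
console.log(chooseTtsRoute({}, "tts-1"));
```

Keeping the local route as the default means a missing or empty `OPENAI_API_KEY` never blocks synthesis in this sketch.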
## Docs and site
- Dewey source docs: `docs/`
- Generated handoff files: `AGENTS.md`, `llms.txt`, `docs.json`, `install.md`
- Website and `/docs` route: `site/`
- OG image template: `site/og-template.html`
## Release automation
- GitHub Pages deploys from `.github/workflows/deploy-pages.yml` to `https://voxd.cc`
- npm publishing runs from `.github/workflows/publish-packages.yml` on `v*` tags and publishes `@voxd/sdk`, `@voxd/client`, and `@voxd/cli`
- DMG builds run from `.github/workflows/release-dmg.yml`. Tag builds create or update the matching GitHub Release and upload `Vox.dmg`.
- Manual DMG releases can be started from the workflow page by supplying a version. The workflow validates the repo versions, builds the DMG, signs and notarizes it, creates the `v`-prefixed version tag when needed, creates the GitHub Release, and uploads the installer.
Required release secrets:
- `DEVELOPER_ID_APPLICATION_CERT_BASE64`
- `DEVELOPER_ID_APPLICATION_CERT_PASSWORD`
- `KEYCHAIN_PASSWORD`
- `APP_STORE_CONNECT_API_KEY_P8`
- `APP_STORE_CONNECT_KEY_ID` and `APP_STORE_CONNECT_ISSUER_ID` as repository variables, or secrets if needed
- `HUDSON_READ_TOKEN` for the private SwiftPM dependency used by the macOS app bundle
- `NPM_TOKEN` in the `PRODUCTION` environment for package publishing
The DMG workflow also accepts the older Apple ID notarization secrets (`APPLE_ID`, `APPLE_APP_PASSWORD`, and `APPLE_TEAM_ID`) as a fallback, plus the legacy certificate names (`APPLE_SIGNING_CERT_BASE64` and `APPLE_SIGNING_CERT_PASSWORD`).