{"id":51148973,"url":"https://github.com/elien666/diarize","last_synced_at":"2026-06-26T04:30:59.232Z","repository":{"id":358910303,"uuid":"1242173770","full_name":"elien666/diarize","owner":"elien666","description":"On-device speaker diarization and transcription for macOS — CLI, SwiftUI app, and Swift library powered by FluidAudio and GRDB.","archived":false,"fork":false,"pushed_at":"2026-06-15T13:37:46.000Z","size":1389,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-15T14:16:33.805Z","etag":null,"topics":["audio","cli","coreml","fluidaudio","grdb","macos","speaker-diarization","swift","swiftui","transcription"],"latest_commit_sha":null,"homepage":null,"language":"Swift","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/elien666.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-18T07:31:29.000Z","updated_at":"2026-06-15T13:37:42.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/elien666/diarize","commit_stats":null,"previous_names":["elien666/diarize"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/elien666/diarize","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elien666%2Fdiarize","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elien666%2Fdiarize/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elien666%2Fdiarize/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elien666%2Fdiarize/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/elien666","download_url":"https://codeload.github.com/elien666/diarize/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elien666%2Fdiarize/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34803678,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-26T02:00:06.560Z","response_time":106,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio","cli","coreml","fluidaudio","grdb","macos","speaker-diarization","swift","swiftui","transcription"],"created_at":"2026-06-26T04:30:59.124Z","updated_at":"2026-06-26T04:30:59.217Z","avatar_url":"https://github.com/elien666.png","language":"Swift","funding_links":[],"categories":[],"sub_categories":[],"readme":"# diarize\n\n**On-device speaker diarization and transcription for macOS — CLI, SwiftUI app, and Swift library.**\n\n`diarize` records audio (microphone, system audio, or both), splits it by speaker, transcribes each segment, and matches voices across recordings so the same person keeps the same identity over time. Everything runs locally on Apple Silicon — no cloud, no API keys.\n\nBuilt on [FluidAudio](https://github.com/FluidInference/FluidAudio) (diarization + ASR via Core ML), [GRDB](https://github.com/groue/GRDB.swift) (SQLite with FTS5 full-text search), and Swift 6.\n\n---\n\n## Features\n\n- **Record \u0026 transcribe in one step** — capture mic + system audio simultaneously (great for meetings), auto-transcribe on stop. → [docs](docs/recording.md)\n- **Stereo channel separation** — when recording mic + system audio, each goes on its own channel (mic = left, system = right) and is diarized independently, so speaker echo never collapses everyone into one voice. → [docs](docs/recording.md#mic--system-audio-together-stereo-separation)\n- **Auto Recording Mode** — detects when a call starts (another app grabs the mic) and records it hands-free, stopping and transcribing on its own. → [docs](docs/auto-recording.md)\n- **Cross-recording speaker matching** — voice embeddings are stored once; the same person is recognized in every future recording. → [docs](docs/transcripts-and-speakers.md#how-speakers-are-recognized)\n- **Manual speaker correction** — rename speakers globally, reassign or split segments, and merge duplicate identities when the diarizer guesses wrong. → [docs](docs/transcripts-and-speakers.md#correcting-speakers)\n- **Synced playback** — play the audio and watch the transcript highlight and auto-scroll; click any timestamp to jump. → [docs](docs/transcripts-and-speakers.md#reading-a-transcript)\n- **Live recording feedback** — per-device level meters, mic selection, and automatic recovery if the input device changes mid-recording. → [docs](docs/recording.md#live-level-meters)\n- **Full-text search** — SQLite FTS5 across every transcript, with snippets and ranking. → [docs](docs/search.md)\n- **Folders \u0026 organization** — group recordings into nested folders with drag-and-drop and inline rename. → [docs](docs/organizing.md)\n- **Privacy-first** — fully on-device; delete raw audio while keeping transcripts (GDPR-friendly), with optional auto-clean of old audio and a menu-bar stealth mode. → [docs](docs/privacy.md)\n- **MCP server for agents** — expose the library to local AI agents over [Model Context Protocol](https://modelcontextprotocol.io): read recordings/speakers, find unprocessed work, mark recordings processed, retry failed analyses, manage titles and folders, and assess + correct diarization quality (reassign mis-attributed segments, name/merge speakers, split turns) — all on-device. → [docs](docs/mcp.md)\n- **Markdown + JSON output** — transcripts are written as readable Markdown and queryable JSON.\n- **Local archive** — recordings, transcripts, and the speaker database live under `~/Library/Application Support/diarize/` (configurable).\n- **Two front-ends + an agent interface** — a scriptable CLI (`diarize`) and a native SwiftUI app (`diarize-app`), plus an MCP server, all backed by the same `DiarizeCore` library.\n\n📖 **New here?** Start with the [User Guide](docs/README.md).\n\n## Requirements\n\n- macOS 14 (Sonoma) or newer\n- Apple Silicon (M1+) recommended — Core ML models run on the Neural Engine\n- Swift 6 / Xcode 16\n- Microphone permission (for `record`); Screen Recording permission (for system-audio capture)\n\n## Install\n\n```sh\ngit clone https://github.com/elien666/diarize.git\ncd diarize\nswift build -c release\ncp .build/release/diarize /usr/local/bin/   # or anywhere on $PATH\n```\n\nTo build the SwiftUI app:\n\n```sh\n./Scripts/build-app.sh\nopen build/Diarize.app\n```\n\n## CLI quick start\n\n```sh\n# Transcribe an existing audio file (mp3, wav, m4a, …)\ndiarize transcribe meeting.m4a --lang en --title \"Q2 planning\"\n\n# Record mic + system audio, auto-transcribe on stop (Ctrl-C)\ndiarize record --title \"1:1 with Sam\"\n\n# Search across every transcript\ndiarize search \"roadmap\"\n\n# Manage the speaker library\ndiarize speakers list\ndiarize speakers label spk_a1b2c3 \"Sam\"\ndiarize speakers merge spk_a1b2c3 spk_d4e5f6\n\n# Inspect or reprocess the archive\ndiarize archive list\ndiarize archive reprocess \u003crecording-id\u003e\n\n# Show / change config\ndiarize config show\ndiarize config set default.language en\n\n# Serve the library to local AI agents (Model Context Protocol)\ndiarize mcp\n```\n\nAll commands accept `--help` for full options. Full command reference: [docs/cli.md](docs/cli.md).\n\n## Documentation\n\nUser-facing guides live in [`docs/`](docs/README.md):\n\n| Guide | What it covers |\n| --- | --- |\n| [Getting Started](docs/getting-started.md) | Install, permissions, first recording |\n| [Recording](docs/recording.md) | Sources, mic selection, level meters, stereo separation |\n| [Auto Recording Mode](docs/auto-recording.md) | Hands-free call capture |\n| [Transcripts \u0026 Speakers](docs/transcripts-and-speakers.md) | Reading transcripts and correcting speakers |\n| [Organizing Recordings](docs/organizing.md) | Folders, drag-and-drop, renaming |\n| [Search](docs/search.md) | Full-text search across transcripts |\n| [Privacy \u0026 Data](docs/privacy.md) | On-device processing, audio deletion, stealth mode |\n| [Settings](docs/settings.md) | Language, matching threshold, archive, maintenance |\n| [CLI Reference](docs/cli.md) | Every `diarize` command and option |\n| [MCP Server](docs/mcp.md) | Expose the library to local AI agents (tools, setup, safety) |\n\n## Configuration\n\nResolution order (highest wins): **CLI flag → env var → `~/.config/diarize/config.json` → default**.\n\n| Key                    | Env var                          | Default                                            |\n| ---------------------- | -------------------------------- | -------------------------------------------------- |\n| `archive.path`         | `DIARIZE_ARCHIVE_PATH`           | `~/Library/Application Support/diarize/archive`    |\n| `default.language`     | `DIARIZE_LANG_DEFAULT`           | `auto` (also: `de`, `en`)                          |\n| `similarity.threshold` | `DIARIZE_SIMILARITY_THRESHOLD`   | `0.6` (cosine similarity for speaker matching)     |\n\n## Project layout\n\n```\nSources/\n  DiarizeCore/    Library: audio I/O, diarization, ASR, storage, search\n    Audio/        Recorder, mixer, loader, WAV writer\n    Pipeline/     Diarization, transcription, speaker matching, calibration\n    Storage/      GRDB models, migrations, speaker store\n    Render/       Markdown + JSON renderers\n    MCP/          Model Context Protocol server (tools, resources) for AI agents\n  DiarizeCLI/     `diarize` executable (ArgumentParser)\n  DiarizeApp/     `diarize-app` SwiftUI app (sidebar/folders, recording detail,\n                  search, auto-recording mode, permissions, privacy cleanup, menu bar)\nResources/icon/   App icon (SVG + .icns)\nScripts/          Build helpers (app bundle, icon, code signing)\nTests/            DiarizeCore unit tests\n```\n\n## How it works\n\n1. **Capture** — `AudioRecorder` taps the microphone via `AVAudioEngine` and system audio via a `ScreenCaptureKit` / CoreAudio process tap; `AudioMixer` writes a WAV. With both sources active it writes **stereo** (mic = left, system = right) so the two can be diarized in isolation; a single source is written mono.\n2. **Diarize** — FluidAudio segments the waveform by speaker and emits an embedding per segment. For stereo recordings each channel is diarized independently and merged with `local` / `remote` prefixes, avoiding echo-induced speaker confusion.\n3. **Match** — `SpeakerMatcher` compares each new embedding against the SQLite speaker library (cosine similarity ≥ threshold) and either reuses an existing speaker ID or mints a new one.\n4. **Transcribe** — each segment is fed to FluidAudio's ASR model in the chosen language.\n5. **Persist** — `SpeakerStore` writes recording, segments, and transcript text into SQLite (with FTS5); Markdown + JSON renderers produce human-readable artifacts under the archive.\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n\n## Acknowledgements\n\n- [FluidAudio](https://github.com/FluidInference/FluidAudio) — Core ML diarization and ASR\n- [GRDB.swift](https://github.com/groue/GRDB.swift) — SQLite toolkit\n- [swift-argument-parser](https://github.com/apple/swift-argument-parser) — CLI\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felien666%2Fdiarize","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Felien666%2Fdiarize","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felien666%2Fdiarize/lists"}