{"id":51150974,"url":"https://github.com/chendrizzy/claude-tts","last_synced_at":"2026-06-26T06:01:04.327Z","repository":{"id":367450084,"uuid":"1280234693","full_name":"chendrizzy/claude-tts","owner":"chendrizzy","description":"Hear your coding agent. claude-tts speaks a curated, filtered stream of a Claude Code agent's work — the status pivots, the errors, the final answers — and stays quiet through the noise.","archived":false,"fork":false,"pushed_at":"2026-06-26T04:17:28.000Z","size":1616,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-26T04:19:20.249Z","etag":null,"topics":["accessibility","ai-agent","claude-code","claude-code-plugin","kokoro","local-llm","text-to-speech","tts"],"latest_commit_sha":null,"homepage":"https://github.com/chendrizzy/claude-tts","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chendrizzy.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"ko_fi":"O5O71VZXTJ"}},"created_at":"2026-06-25T11:47:24.000Z","updated_at":"2026-06-26T04:17:07.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/chendrizzy/claude-tts","commit_stats":null,"previous_names":["chendrizzy/claude-tts"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/chendrizzy/claude-tts","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chendrizzy%2Fclaude-tts","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chendrizzy%2Fclaude-tts/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chendrizzy%2Fclaude-tts/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chendrizzy%2Fclaude-tts/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chendrizzy","download_url":"https://codeload.github.com/chendrizzy/claude-tts/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chendrizzy%2Fclaude-tts/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34805072,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-26T02:00:06.560Z","response_time":106,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accessibility","ai-agent","claude-code","claude-code-plugin","kokoro","local-llm","text-to-speech","tts"],"created_at":"2026-06-26T06:01:00.914Z","updated_at":"2026-06-26T06:01:04.299Z","avatar_url":"https://github.com/chendrizzy.png","language":"Python","funding_links":["https://ko-fi.com/O5O71VZXTJ"],"categories":[],"sub_categories":[],"readme":"# claude-tts\n\n[![tests](https://github.com/chendrizzy/claude-tts/actions/workflows/test.yml/badge.svg)](https://github.com/chendrizzy/claude-tts/actions/workflows/test.yml)\n[![license: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)\n[![python: 3.11+](https://img.shields.io/badge/python-3.11%2B-blue.svg)](pyproject.toml)\n[![local-first](https://img.shields.io/badge/local--first-token--free-success.svg)](#no-llm-fallback)\n\n**Hear your coding agent.** claude-tts speaks a *curated, filtered* stream of a\n[Claude Code](https://docs.anthropic.com/en/docs/claude-code) agent's work — the\nstatus pivots, the errors, the final answers — and stays quiet through the noise.\nA local LLM judges what's worth saying and summarizes the long bits; a TTS engine\nsynthesizes it; Claude Code hooks drive the whole thing. Local-first and\ntoken-free by default.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/media/demo.gif\" alt=\"claude-tts deciding, in real time, what to speak from a Claude Code session: Read/Edit stay quiet; test results, an error, and the final answer are spoken aloud.\" width=\"100%\"\u003e\n\u003c/p\u003e\n\n\u003e ▶ **[Watch the ~14s clip with sound](docs/media/demo.mp4)** · **[audio only](docs/media/sample.mp3)** — the exact lines marked SPEAK above, voiced by the default `edge-tts` engine.\n\nThe frame above isn't a mockup: it's `tests/fixtures/event_corpus.jsonl` replayed\nthrough the real classifier (`scripts/demo_gif.py`). The same corpus gates CI, so\nthe demo can't drift from what the daemon actually does.\n\n## What it says — and what it doesn't\n\nThe value isn't the voice, it's the *judgment*. Default verdict is **silence**;\nonly four kinds of event earn speech.\n\n| A Claude Code event… | Verdict | …becomes |\n|----------------------|---------|----------|\n| `Read`/`Edit`/`Write` succeeds | 🔇 quiet | — |\n| `git status`, a file listing, fenced code, a repeat | 🔇 quiet | — |\n| `pytest` → `23 passed, 4 failed in 12.3s` | 🔊 **status** | *\"In the tests: 23 passed, 4 failed.\"* |\n| a command writes to stderr | 🔊 **error** *(pre-empts)* | *\"cat: /nonexistent: No such file or directory\"* |\n| \"★ Insight: the timeout fires before the handshake…\" | 🔊 **insight** | *(spoken; long ones summarized)* |\n| the assistant's end-of-turn answer | 🔊 **final answer** | *\"Done. The bug was a missing await on the queue.put call.\"* |\n\nAnd it cleans markup *before* it speaks, so you never hear punctuation read out:\n\n| In the agent's raw output | Spoken |\n|---------------------------|--------|\n| `` Run `pytest -q` now `` | Run pytest -q now |\n| `Fixed the **race** in` `queue_manager.py` | Fixed the race in queue_manager.py |\n| `the value 2**8 equals 256` | the value 2\\*\\*8 equals 256 *(math kept, not \"bold\")* |\n| `checked out agent-a1b2c3d4e5 worktree` | checked out worktree *(hash dropped)* |\n| `## Summary` · `[the docs](https://…)` | Summary · the docs |\n\n\u003e Replaying 6,695 real spoken excerpts through the normalizer, markdown leaked\n\u003e into speech in **23.8%** of them before this cleaner and **0.0%** after — a\n\u003e figure asserted on every run by `make verify` (`tests/test_shadow_replay.py`).\n\n## How it works\n\n```\nClaude Code hooks ──▶ unix socket ──▶ daemon\n                                        │\n        ContentRouter (classify · judge · summarize)   ← the filter brain\n                                        │\n        QueueManager ▶ Orchestrator ▶ Generate ▶ Playback\n                          (LLM provider seam)  (TTS engine)  (OS audio)\n```\n\nThe filter brain decides **what** to surface and **how** to phrase it. Synthesis\nand playback are swappable behind three seams. The assistant's end-of-turn answer\nis classified as **prose**, so ordinary markdown in it (a symbol run, a\ndiff-stat- or commit-shaped line) doesn't trip the stdout-noise patterns that gate\nraw tool output and would otherwise veto the whole summary; whole-message shapes,\ndedup, and `system-reminder` blocks still drop. For the full picture — the\nclassification ladder, backpressure tiers, error pre-emption, the markdown→speech\nchokepoint — see **[docs/ARCHITECTURE.md](docs/ARCHITECTURE.md)**.\n\n- **LLM provider seam** (`daemon/providers/`) — `judge(text) → speak?` and\n  `summarize(text) → str`. Ships `ollama` (local default), `openai_compat`\n  (any OpenAI-compatible `base_url`: LM Studio, llama.cpp, vLLM, Groq, …), and\n  `null` (a deterministic, no-LLM floor — see below).\n- **TTS engine seam** (`daemon/engines/`, `daemon/pipeline/`) — `edge-tts`\n  (cross-platform Azure voices), `say`/`espeak` (zero-dependency system engine),\n  `kokoro` (local MLX, Apple Silicon), and `voicebox` (local app via REST).\n- **Platform seam** (`daemon/platforms/`) — macOS `afplay` + `launchd`; Linux\n  auto-detected player + `systemd --user`; Windows `ffplay` (run daemon manually).\n\n### No-LLM fallback\n\nWith `llm_provider.type = \"null\"`, the system still works on deterministic rules:\nit speaks structured signals (test counts, errors, status) and drops noise,\nsummarizing by truncation. The LLM is an *intelligence upgrade*, not a hard\ndependency.\n\n## Install\n\nclaude-tts is a Claude Code plugin. Once it's added as a marketplace source:\n\n```\n/plugin marketplace add chendrizzy/claude-tts\n/tts:setup\n```\n\n`/tts:setup` detects your platform, picks an engine and LLM backend, calibrates\nthe backend against a bundled mini-eval, installs the background service\n(`launchd` / `systemd --user`), and writes your config. Re-runnable and\nidempotent; `/tts:doctor` re-checks health anytime.\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eManual setup\u003c/b\u003e (development, or running without the plugin)\u003c/summary\u003e\n\n```bash\ngit clone https://github.com/chendrizzy/claude-tts\ncd claude-tts\nuv sync --extra edge          # base deps + the edge-tts engine\ncp config.example.json ~/.config/claude-tts/config.json   # edit to taste\n```\n\nWire the hooks in `hooks/` into your Claude Code settings (the registry is\n`hooks/hooks.json`); the `SessionStart` hook launches the daemon automatically.\nVerify it's alive: the daemon binds the socket (see\n[Configuration](#configuration)) and a test utterance plays. The **kokoro**\nengine additionally needs an `mlx-audio` interpreter pointed to by `$MLX_PYTHON`\n(see [`.env.example`](.env.example)).\n\u003c/details\u003e\n\n## Commands\n\n| Command | What it does |\n|---------|--------------|\n| `/tts:setup` | First-run setup: engine, LLM backend, calibration, service, config |\n| `/tts:status` | Daemon socket, active engine + model, recent log tail (read-only) |\n| `/tts:log` | Replay what the daemon has actually spoken — newest first, with timestamps + category (read-only) |\n| `/tts:doctor` | Disk, daemon/socket, deps, backend reachability — PASS/WARN + fixes |\n| `/tts:voice` | Pick a speech engine and voice, then restart to apply |\n| `/tts:uninstall` | Stop and remove the service/daemon (optionally the config) |\n\n## Requirements\n\n- **Python ≥ 3.11** and [`uv`](https://docs.astral.sh/uv/).\n- **macOS** (`afplay` + `launchd`) or **Linux** (auto-detected audio player +\n  `systemd --user`). On Windows, run the daemon manually or use WSL2/Docker.\n- For the default **LLM provider**: a local [Ollama](https://ollama.com) with a\n  small model, e.g. `ollama pull qwen2.5-coder:1.5b` — or any OpenAI-compatible\n  server, or no LLM at all.\n- For an **engine**: `edge-tts` (the `edge` extra, needs internet) or — on\n  Apple Silicon — `kokoro` via a separate `mlx-audio` interpreter.\n\n## Configuration\n\nCopy [`config.example.json`](config.example.json) (every key is annotated inline)\nand edit. Every block is optional; the daemon embeds safe defaults. Common knobs:\n`voice.engine`, `llm_provider.type`, `summarizer.model`,\n`filtering.max_response_length`. Full reference, defaults, and environment\nvariables: **[docs/CONFIGURATION.md](docs/CONFIGURATION.md)**.\n\n## Spoken-output log \u0026 statusline\n\nEvery utterance the daemon speaks is appended to a bounded, per-session JSONL at\n`~/.claude/logs/tts/spoken/\u003csession\u003e.jsonl` (`daemon/spoken_log.py`, capped at the\nmost recent 500 lines). It's best-effort — a logging failure never silences\nspeech. **`/tts:log`** prints it newest-first with timestamps and category (pass\na count, or `--session \u003cid\u003e` to target one session).\n\n**Sub-agents and background agents** each get their own `session_id` from Claude\nCode, so each writes its own spoken-log file. With\n`statusline.include_subagent_in_main` enabled, `/tts:log` (no `--session`) merges\nsibling-agent lines that overlap this session's time span **and ran in the same\nproject (cwd)**, each tagged by source. Sub-agents and background agents inherit\ntheir parent's cwd, so they fold in; an *unrelated* session running concurrently\nfrom another directory is excluded (each spoken-log entry now records its cwd).\n\n**Statusline 🔊 segment.** The repo ships the spoken-log *data* plus a\n`statusline` config block — but **not** the rendering wrapper that draws the 🔊\nsegment (that's a personal/global statusline file, outside this repo). To compose\nyour own, read the latest line with `spoken_log.read_latest()` (or tail the\nJSONL) and append it to your existing statusline. The config keys (in\n[`config.example.json`](config.example.json)) are **view-only** — the daemon\nalways logs raw per-session truth; they only change what the statusline and\n`/tts:log` *show*:\n\n| Key | Default | Effect |\n|-----|---------|--------|\n| `statusline.subagent_aware` | `true` | let the statusline follow whichever agent spoke most recently **in the same project (cwd)** — a live sub-agent / background-agent surfaces (marked 🔊⤷). Scoped by cwd, so an unrelated concurrent session in another directory is never followed. `false` → strictly this session's own output. |\n| `statusline.active_window_s` | `90` | how recently (seconds) a same-cwd agent must have spoken for the line to pivot to it. |\n| `statusline.include_subagent_in_main` | `false` | merge sibling-agent lines into `/tts:log`'s default view — **same-cwd siblings only**, so it's safe to enable even with multiple projects open. |\n\n\u003e **Note (v0.1.5):** sub-agent *following* is **cwd-scoped**. Each spoken-log\n\u003e entry records the project dir it was spoken in; sub-agents and background\n\u003e agents inherit their parent's cwd, so they're followed/merged, while an\n\u003e unrelated session in another directory never is. (The 🔊 segment itself is\n\u003e drawn by your own statusline wrapper — see above — which reads the persisted\n\u003e `cwd` to scope the pivot.)\n\n**Low disk no longer mutes silently.** Before each synthesis write, the daemon\nchecks free space against a fixed internal threshold (~200 MB, not config-tunable);\nbelow it, it evicts cache and — if still low — refuses to synthesize and fires a\n*loud* alert: a desktop notification (`osascript` / `notify-send`) plus a\n`disk_full` spoken-log entry that surfaces on the statusline, instead of failing\nthe write quietly.\n\n## Cursor (editor) support\n\nclaude-tts can also speak for [Cursor](https://cursor.com)'s agent. The repo\nships wrapper hooks that translate Cursor's agent-hook events into the same\ndaemon the Claude Code hooks drive:\n\n| Cursor hook event | Wrapper script |\n|-------------------|----------------|\n| `preToolUse` | `hooks/cursor-pre-tool-use.sh` |\n| `postToolUse` | `hooks/cursor-post-tool-use.sh` |\n| `afterAgentResponse` | `hooks/cursor-after-agent-response.sh` |\n\n`hooks/cursor_normalize.py` maps Cursor's field names (`conversation_id` →\n`session_id`, `tool_output` → `tool_response`, `Shell` → `Bash`, …) into the\nClaude Code hook shape; each wrapper then delegates to the matching Claude Code\nhook with `CLAUDE_TTS_PASSTHROUGH=false`, so Cursor's tool stdout stays clean\n(that gate in `hooks/pre-tool-use.sh` / `post-tool-use.sh` defaults to `true` for\nClaude Code, which chains hooks on stdout).\n\nThese wrappers are **not** registered in `hooks/hooks.json` — Cursor wiring is\nmanual. Point Cursor's hook config at the absolute path of each wrapper, in\n`\u003cproject\u003e/.cursor/hooks.json` (one project) or `~/.cursor/hooks.json` (all\nprojects):\n\n```json\n{\n  \"version\": 1,\n  \"hooks\": {\n    \"preToolUse\": [{ \"command\": \"/abs/path/to/claude-tts/hooks/cursor-pre-tool-use.sh\" }],\n    \"postToolUse\": [{ \"command\": \"/abs/path/to/claude-tts/hooks/cursor-post-tool-use.sh\" }],\n    \"afterAgentResponse\": [{ \"command\": \"/abs/path/to/claude-tts/hooks/cursor-after-agent-response.sh\" }]\n  }\n}\n```\n\n- **macOS:** the after-response wrapper uses GNU `timeout` — install it with\n  `brew install coreutils`.\n- **Python:** `PYTHON_BIN` defaults to a macOS framework `python3`; override it\n  with `CLAUDE_TTS_PYTHON`, and it falls back to `command -v python3`.\n\nEnvironment variables read by the wrappers:\n\n| Var | Default | Effect |\n|-----|---------|--------|\n| `CLAUDE_TTS_ENABLED` | `true` | set to anything else to mute the wrappers |\n| `CLAUDE_TTS_PYTHON` | framework `python3` | interpreter used for normalization + dispatch |\n| `CLAUDE_TTS_PASSTHROUGH` | `true` (wrappers force `false`) | when `false`, the underlying hook does not echo stdin, keeping Cursor's tool stdout clean |\n\n## Project layout\n\n```\ndaemon/              the daemon — socket server, router, async pipeline, seams\n  content_router.py    the filter brain: classify → judge → summarize\n  pipeline/            ingest → process → generate → playback\n  providers/           LLM seam: ollama · openai_compat · null\n  engines/             TTS seam: edge-tts · system (say/espeak)\n  platforms/           OS seam: launchd · systemd · audio players\n  spoken_log.py        bounded per-session JSONL of what was actually spoken\nhooks/               Claude Code hook scripts + hooks.json registry (+ cursor-* wrappers)\ncommands/            slash commands (/tts:setup, status, doctor, voice, log, uninstall)\nskills/tts-setup/    the first-run setup procedure\nscripts/             calibration, manifest sync, the demo generators\ntests/               the `make verify` gate + fixtures (spoken \u0026 event corpora)\ndocs/                ARCHITECTURE · CONFIGURATION · PUBLISH\n```\n\n## Development\n\n`make verify` is the **binding quality gate** — a deterministic, all-sync suite\n(no live daemon, Ollama, or `pytest-asyncio` needed) that fails on markdown\nleaking to speech, classification regressions, and path-humanization bugs.\n\n```bash\nuv sync --extra edge --extra dev\nuv run make verify\n```\n\nCI runs that same gate across macOS + Linux × Python 3.11–3.13; a separate,\ninformational `async-suite` job (`continue-on-error`, Linux + Python 3.12)\nadditionally exercises the async pipeline/playback/queue tests under\n`pytest-asyncio` so async regressions stay visible.\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for the test layout and how to add a\nprovider or engine, and [docs/PUBLISH.md](docs/PUBLISH.md) for the release\nprocess. The demo media is regenerated with\n`uv run --with pillow python scripts/demo_gif.py` (and `demo_audio.py`).\n\n## License\n\n[MIT](LICENSE) © chendrizzy\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchendrizzy%2Fclaude-tts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchendrizzy%2Fclaude-tts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchendrizzy%2Fclaude-tts/lists"}