{"id":31581019,"url":"https://github.com/nickpending/clarvis","last_synced_at":"2025-10-05T21:52:10.310Z","repository":{"id":315107169,"uuid":"1058124166","full_name":"nickpending/clarvis","owner":"nickpending","description":"Jarvis-style voice notifications for Claude Code that transforms AI assistant messages into spoken updates through configurable TTS integration.","archived":false,"fork":false,"pushed_at":"2025-09-16T18:18:45.000Z","size":69,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-16T19:58:29.620Z","etag":null,"topics":["claude-code","elevenlabs","tts","typescript"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nickpending.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-16T16:48:30.000Z","updated_at":"2025-09-16T18:18:48.000Z","dependencies_parsed_at":"2025-09-16T19:58:31.461Z","dependency_job_id":"a996509c-5584-42aa-b1e2-5f2610ca997e","html_url":"https://github.com/nickpending/clarvis","commit_stats":null,"previous_names":["nickpending/clarvis"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/nickpending/clarvis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nickpending%2Fclarvis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nickpending%2Fclarvis/tags","releases_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/nickpending%2Fclarvis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nickpending%2Fclarvis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nickpending","download_url":"https://codeload.github.com/nickpending/clarvis/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nickpending%2Fclarvis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278526242,"owners_count":26001325,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-05T02:00:06.059Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["claude-code","elevenlabs","tts","typescript"],"created_at":"2025-10-05T21:52:09.104Z","updated_at":"2025-10-05T21:52:10.301Z","avatar_url":"https://github.com/nickpending.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# clarvis\n\n\u003cdiv align=\"center\"\u003e\n  \n  **Jarvis-style voice notifications for Claude Code**\n  \n  [GitHub](https://github.com/nickpending/clarvis) | [Issues](https://github.com/nickpending/clarvis/issues) | [lspeak](https://github.com/nickpending/lspeak)\n\n  [![Status](https://img.shields.io/badge/Status-Alpha-orange?style=flat)](#status-alpha)\n  
[![TypeScript](https://img.shields.io/badge/TypeScript-5.8+-3178C6?style=flat\u0026logo=typescript)](https://typescriptlang.org)\n  [![Bun](https://img.shields.io/badge/Bun-1.2+-FBDB78?style=flat\u0026logo=bun)](https://bun.sh)\n  [![License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)\n\n\u003c/div\u003e\n\n---\n\n**clarvis** gives Claude Code a voice. When your AI pair programmer completes tasks, encounters errors, or needs your attention, you'll hear Jarvis-style notifications through your speakers.\n\n```bash\n# Claude Code finishes implementing authentication\n\"Sir, I have completed project auth. JWT implementation successful.\"\n\n# Claude encounters an error\n\"Sir, I have encountered an error with project checkout-flow. Database migration failed.\"\n\n# Claude needs your decision\n\"Sir, I need your input on project authentication. Email or SMS for password reset?\"\n```\n\n## What is clarvis?\n\nclarvis is a TypeScript/Bun hook processor that transforms Claude Code's text updates into concise voice notifications. It's the bridge between your AI assistant and your ears:\n\n- **Processes Claude Code hooks** to capture assistant messages\n- **Summarizes with LLMs** to create Jarvis-style updates (1-3 sentences based on mode)\n- **Speaks through lspeak** with first-person JARVIS personality (\"I have completed...\")\n- **Configurable per-project** via TOML with mode-based behavior\n- **Smart labeling** - \"project\" for development, \"topic\" for other modes\n\nThink of it as **ambient awareness for AI pair programming**. 
You can:\n- Work in another window while Claude runs tests\n- Get notified when tasks complete or fail\n- Hear decisions that need your attention\n- Stay in flow without watching the terminal\n\n## Status: Alpha\n\n**This is early software that works but has rough edges.** It's been in daily use for development work, handling Claude Code voice notifications reliably, but expect quirks and configuration challenges.\n\n## Known Issues \u0026 Limitations\n\n**Current limitations in v0.1.0:**\n\n- **Configuration complexity** - Requires manual setup of API keys and config.toml\n- **Claude instruction required** - Must manually tell Claude to output metadata lines (not automatic)\n- **First-run delay** - 27-second delay on first lspeak use while ML models load\n- **Multiple dependencies** - Requires Bun, lspeak, OpenAI API key (ElevenLabs optional for better voice)\n\n**What works well:**\n- Reliable voice notifications for development workflow\n- Multiple TTS providers (ElevenLabs, system TTS)\n- Mode-based verbosity control\n- Clean error handling with fallback audio\n\n## Installation\n\n### Prerequisites\n\n```bash\n# Install Bun (JavaScript runtime)\ncurl -fsSL https://bun.sh/install | bash\n\n# Install lspeak (TTS engine)\nuv tool install git+https://github.com/nickpending/lspeak.git\n\n# Install pnpm (package manager)\nnpm install -g pnpm\n```\n\n### Install clarvis\n\n**Manual Install (Currently Required):**\n\n```bash\n# Clone repository\ngit clone https://github.com/nickpending/clarvis.git\ncd clarvis\n\n# Run installer\nchmod +x install.sh\n./install.sh\n```\n\n**Development Install:**\n\n```bash\n# Clone and link for development\ngit clone https://github.com/nickpending/clarvis.git\ncd clarvis\npnpm install\nbun run build\nsudo ln -sf \"$(pwd)/dist/index.js\" /usr/local/bin/clarvis\n```\n\n### Configure Claude Code\n\nClaude Code hooks allow you to run commands when specific events occur. 
The \"Stop\" hook triggers when Claude finishes a message, making it perfect for voice notifications.\n\nAdd to `~/.claude/settings.json`:\n\n```json\n{\n  \"hooks\": {\n    \"Stop\": [{\n      \"matcher\": \"\",\n      \"hooks\": [{\n        \"type\": \"command\",\n        \"command\": \"cat | clarvis\"\n      }]\n    }]\n  }\n}\n```\n\n**How it works:**\n- `\"Stop\"` - Triggers when Claude stops speaking (message complete)\n- `\"matcher\": \"\"` - Empty matcher means it runs on every message\n- `\"command\": \"cat | clarvis\"` - Pipes the hook data (JSON) to clarvis\n\n**Testing the hook:**\n1. Open Claude Code after saving settings.json\n2. Ask Claude a simple question\n3. You should hear a voice notification when Claude responds\n\n**Temporarily disable voice:**\n```bash\n# Disable for current terminal session only\nexport CLARVIS_VOICE=off\n\n# Re-enable by unsetting or opening new terminal\nunset CLARVIS_VOICE\n```\n\n## Configuration\n\nCreate `~/.config/clarvis/config.toml`:\n\n```toml\n# Mode definitions - control verbosity\n[modes.default]\nstyle = \"terse\"  # 1 sentence, 5-10 words\n\n[modes.development]\nstyle = \"brief\"  # 2 short sentences\n\n[modes.writing]\nstyle = \"normal\"  # 3 natural sentences\n\n[modes.research]\nstyle = \"normal\"  # 3 natural sentences\n\n[modes.conversation]\nstyle = \"normal\"  # 3 natural sentences\n\n# LLM provider (for summarization)\n[llm]\nprovider = \"openai\"  # or \"ollama\"\nmodel = \"gpt-4o-mini\"  # or local model\napiKey = \"sk-...\"  # Required for OpenAI\n\n# JARVIS base instruction (applied to all modes)\nbase_instruction = \"\"\"\nYou are J.A.R.V.I.S., providing status updates about work in progress.\n\nSPEAK AS JARVIS IN FIRST PERSON:\n- If you see \"Project: api\" → \"Sir, I have [status] project api\" \n- If you see \"Topic: documentation\" → \"Sir, I have [status] the documentation\"\n- Always use \"I\" - you are JARVIS doing the work\n\nSPEECH FORMATTING for TTS:\nSpell out: API→A P I, JWT→J W T, 
URL→U R L, HTTP→H T T P\nPronounce: JSON→jason, SQL→sequel, OAuth→oh-auth\nNumbers: 8080→eight zero eight zero\n\"\"\"\n\n[llm.prompts]\nterse = \"Follow all steps above. One sentence, 5-10 words maximum. Just the core status.\"\nbrief = \"Follow all steps above. EXACTLY 2 short sentences. First: Status in 5-8 words. Second: One key detail in 8-12 words. Keep both sentences brief for speech.\"\nnormal = \"Follow all steps above. 3 sentences that flow naturally. Include status, key details, and outcome/impact. Keep it conversational but concise.\"\nfull = \"Follow all steps above. Full JARVIS response with complete details, no length limit. Preserve ALL content but make it speakable.\"\nbypass = \"This style bypasses LLM processing entirely - raw Claude output goes directly to TTS.\"\n\n# Voice configuration\n[voice]\nprovider = \"elevenlabs\"  # or \"system\" for free TTS\napi_key = \"sk_...\"  # ElevenLabs API key\nvoice_id = \"YOUR_VOICE_ID_HERE\"  # Create a JARVIS-style voice in ElevenLabs\ncache_threshold = 0.90\n\n# Notes:\n# - For ElevenLabs: Create a custom voice at https://elevenlabs.io/ and use its ID\n# - For system TTS: Uses your default system voice (ignore voice_id)\n# - System TTS uses whatever voice you've selected in System Preferences\n```\n\n## Quick Start\n\nOnce installed and configured:\n\n### Step 1: Configure Claude to Output Metadata\n\nFor clarvis to work, Claude needs to output metadata lines that specify the mode and project. 
Add this instruction to your Claude conversations:\n\n\u003e **Claude, please start each of your responses with a metadata line in this format:**\n\u003e `clarvis:[mode:MODE project:PROJECT_NAME]`\n\u003e\n\u003e **Available modes:**\n\u003e - `default` - Minimal updates (1 sentence)\n\u003e - `development` - Brief technical updates (2 sentences)\n\u003e - `writing` - Detailed progress for documentation work (3 sentences)\n\u003e - `research` - Detailed analysis summaries (3 sentences)\n\u003e - `conversation` - Natural discussion flow (3 sentences)\n\u003e\n\u003e **Example:** `clarvis:[mode:development project:auth-system]`\n\n### Step 2: Use Claude Code Normally\n\n1. **Start Claude Code** and work normally\n2. **Claude speaks** when tasks complete via clarvis → lspeak\n3. **Verbosity is controlled** by the metadata Claude outputs:\n\n```markdown\nclarvis:[mode:development project:api]\nLet's implement the authentication system...\n```\n\nThe metadata line controls:\n- `mode`: Selects configuration mode and verbosity:\n  - `default` → terse (1 sentence, 5-10 words)\n  - `development` → brief (2 short sentences) \n  - `writing`/`research`/`conversation` → normal (3 natural sentences)\n- `project`: Names the work being done:\n  - Development mode: \"Sir, I have completed **project api**\"\n  - Other modes: \"Sir, I have finished **the documentation**\"\n\n## Voice Personality\n\nclarvis speaks as JARVIS in first person, as if he's your AI assistant doing the work:\n\n- **First-person speech**: \"Sir, I have completed...\" not \"Sir, project is complete\"\n- **Smart labeling**: Uses \"project\" for development, natural phrasing for other modes\n- **Speech formatting**: Automatically formats technical terms for TTS (API → \"A P I\", JSON → \"jason\")\n- **Contextual responses**: Different phrasing for errors, completions, questions, and findings\n\n## How It Works\n\nclarvis processes Claude Code hook events through this pipeline:\n\n```\n┌─────────────┐     
┌──────────┐     ┌────────────┐\n│ Claude Code │────▶│   Hook   │────▶│ Transcript │\n│    Event    │     │  Parser  │     │ Extractor  │\n└─────────────┘     └──────────┘     └────────────┘\n                                            │\n                                      Parse Metadata\n                                      (mode, project)\n                                            │\n                                    ┌───────▼────────┐\n                                    │  LLM Summary   │\n                                    │ (OpenAI/Ollama)│\n                                    └───────┬────────┘\n                                            │\n                                      Jarvis-style\n                                       2 sentences\n                                            │\n                                    ┌───────▼────────┐\n                                    │     lspeak     │\n                                    │  (TTS + Cache) │\n                                    └────────────────┘\n                                            │\n                                         🔊 Audio\n```\n\n### Architecture Components\n\n1. **Hook Parser** - Reads JSON from stdin with timeout protection\n2. **Transcript Extractor** - Gets last assistant message from JSONL\n3. **Metadata Parser** - Extracts `clarvis:[...]` control data  \n4. **LLM Summarizer** - Creates Jarvis-style summaries\n5. 
**Speaker** - Calls lspeak with voice configuration\n\n### Processing Modes\n\n- **silent**: No output at all\n- **terse**: 1 sentence summary\n- **brief**: 2 sentence summary (default)\n- **normal**: 3 sentence paragraph\n- **full**: Complete JARVIS response, no length limit\n- **bypass**: Pass-through without summarization\n\n## Real-World Usage\n\n### Project-Specific Modes\n\nControl verbosity per project type:\n\n```markdown\n# Quick bug fix - minimal interruption (brief style)\nclarvis:[mode:development project:bugfix]\nFix the null pointer exception in auth...\n\n# Research work - more detail (normal style)\nclarvis:[mode:research project:payment-system]\nImplement Stripe payment integration...\n\n# Writing/docs - full detail (normal style)\nclarvis:[mode:writing project:docs]\nUpdate the API documentation...\n```\n\n### Error Notifications\n\nErrors always speak (never silent) using system TTS fallback:\n\n```typescript\n// Even if config is broken, you'll hear:\n\"Sir, processing failed.\"\n```\n\n### Multi-Session Support\n\nEach project/topic identified in first-person updates:\n- \"Sir, I have completed project **auth**.\"\n- \"Sir, I have encountered an error with project **API**.\"\n- \"Sir, I need your input on the **architecture** discussion.\"\n\n## Performance\n\nWith Bun runtime and lspeak caching:\n\n- **Startup**: \u003c100ms (20x faster than Python)\n- **Hook processing**: \u003c2 seconds total\n- **Cached phrases**: Instant via lspeak\n- **New phrases**: 1-2 seconds for LLM + TTS\n\n## Requirements\n\n- Bun 1.2+\n- Node.js 20+ (for pnpm)\n- lspeak installed\n- OpenAI API key (or Ollama running)\n- ElevenLabs API key (optional, for premium voices)\n\n## Architecture Decisions\n\n**Why TypeScript/Bun?**  \n20x faster startup than Python. Critical for \u003c2 second hook processing limit.\n\n**Why metadata in transcript?**  \nClaude Code hooks don't pass message metadata. 
Embedding it in the message text is the only way.\n\n**Why delegate to lspeak?**  \nSemantic caching and queue management are complex. lspeak solves this perfectly.\n\n**Why LLM summarization?**  \nClaude's responses are too verbose for speech. Jarvis-style summaries are perfect for audio.\n\n## Troubleshooting\n\n**\"Sir, processing failed\" constantly**\n- Check that `~/.config/clarvis/config.toml` exists\n- Verify the LLM provider is accessible\n- Check that API keys are valid\n\n**No voice output**\n- Verify lspeak is installed: `which lspeak`\n- Test lspeak directly: `echo \"test\" | lspeak`\n- Check mode isn't set to \"silent\"\n\n**Slow processing**\n- First call loads lspeak models (27 seconds)\n- Subsequent calls should be fast: cached phrases play instantly, new phrases take 1-2 seconds\n- Check Ollama server if using local LLM\n\n## Roadmap\n\n**v0.2.0** (Next release):\n- [ ] Automatic Claude instruction injection (no manual metadata setup)\n- [ ] Simplified config with sensible defaults\n- [ ] Better error messages with specific workarounds\n- [ ] Config validation and helpful setup wizard\n\n**v0.3.0** (Future):\n- [ ] Support for different AI assistants beyond Claude Code\n- [ ] Improved metadata extraction patterns\n- [ ] Voice customization per project/mode\n\n## Contributing\n\nPRs welcome! 
Core principles:\n\n- Keep it fast - respect the 2-second hook limit\n- Configuration over code - everything in TOML\n- Delegate complexity - use lspeak for TTS/caching\n- Clear errors - always provide voice feedback\n\n## License\n\nMIT - See [LICENSE](LICENSE)\n\n## Credits\n\nBuilt to give Claude Code a voice through:\n- **lspeak** - Semantic caching and TTS orchestration\n- **Claude Code** - The AI pair programmer worth talking to\n- **Bun** - Blazing fast JavaScript runtime\n- **The original Bash prototype** - 300 lines that proved this works","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnickpending%2Fclarvis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnickpending%2Fclarvis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnickpending%2Fclarvis/lists"}