{"id":34847560,"url":"https://github.com/hackjutsu/vibe-speech","last_synced_at":"2026-05-06T14:31:43.226Z","repository":{"id":326329617,"uuid":"1105053890","full_name":"hackjutsu/vibe-speech","owner":"hackjutsu","description":"Local near real-time voice AI assistant","archived":false,"fork":false,"pushed_at":"2025-12-08T00:27:11.000Z","size":66,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-12-08T12:01:17.742Z","etag":null,"topics":["ai","aiassistant","chatbot","llm","ollama","python","whisper"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hackjutsu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-27T04:26:09.000Z","updated_at":"2025-12-08T00:27:15.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/hackjutsu/vibe-speech","commit_stats":null,"previous_names":["hackjutsu/vibe-speech"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/hackjutsu/vibe-speech","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hackjutsu%2Fvibe-speech","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hackjutsu%2Fvibe-speech/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hackjutsu%2Fvibe-speech/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/hackjutsu%2Fvibe-speech/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hackjutsu","download_url":"https://codeload.github.com/hackjutsu/vibe-speech/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hackjutsu%2Fvibe-speech/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32698098,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-06T08:33:17.875Z","status":"ssl_error","status_checked_at":"2026-05-06T08:33:17.221Z","response_time":117,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","aiassistant","chatbot","llm","ollama","python","whisper"],"created_at":"2025-12-25T18:52:04.729Z","updated_at":"2026-05-06T14:31:43.220Z","avatar_url":"https://github.com/hackjutsu.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Vibe Speech (WIP)\n\nCross-platform, local-first voice helper that listens to the mic, runs Whisper locally, and feeds the transcript to a local LLM assistant. The assistant replies with your configured personality and the response is spoken aloud via `xsst2`. 
Designed for macOS and Windows, with a fallback automation layer that works anywhere `pyautogui` does.\n\n## Current status\n- Push-to-talk hotkey with tail padding; buffers while held, then transcribes once and routes the transcript to an LLM assistant.\n- Faster-Whisper backend (configurable model; defaults to `large-v3-turbo` unless you change it). Beam size, language, and an optional initial prompt are configurable.\n- Optional rewriter (Ollama or llama.cpp) for grammar polish before the assistant sees the text.\n- Assistant replies are generated by a local/Ollama LLM with a customizable personality and spoken through `xsst2`.\n- Assistant prompts can include a configurable window of recent conversation history (`assistant.history_length`).\n- Spinner/colored timing logs so you can see when transcription/rewriting/assistant work is running.\n\n## Architecture (plain text)\n```\nMic -\u003e AudioCapture (chunk/tail) -\u003e WhisperEngine (transcribe)\n    -\u003e Processor (cleanup/correct + optional Rewriter)\n    -\u003e Assistant (LLM; uses conversation history window)\n    -\u003e SpeechSynthesizer (TTS) + Automation (typing)\n```\n\n## Project layout\n- `src/vibe_speech/cli.py` – entrypoint (`vibe-speech` script) with `serve` and `doctor`.\n- `src/vibe_speech/config.py` – config models and loader.\n- `src/vibe_speech/runtime.py` – audio capture, hotkey handling, buffering, transcription, output, and logging.\n- `src/vibe_speech/whisper_engine.py` – Whisper wrapper (faster-whisper).\n- `src/vibe_speech/automation.py` – text output automation (pyautogui).\n- `src/vibe_speech/processor.py` – processing modes (`raw`, cleanup, optional correction/rewriter).\n- `src/vibe_speech/rewriter.py` – optional grammar rewriter (Ollama/local llama.cpp).\n- `config.sample.yaml` – defaults for local development.\n\n## Quick start\n1) Python 3.11+, `ffmpeg` on PATH.\n2) Install: `python -m venv .venv \u0026\u0026 source .venv/bin/activate \u0026\u0026 pip install -e .`\n3) 
Copy and edit config: `cp config.sample.yaml config.yaml`\n   - Set `audio.device_name` to your mic.\n   - Choose a Whisper model (`whisper.model_size`), a beam size, and an `initial_prompt` if desired.\n   - Set `whisper.remote_url` to offload transcription to your remote `/transcribe` service (leave empty to use local Whisper).\n   - Set the assistant provider/model/personality/history length in `assistant.*`; adjust `speech.*` if your `xsst2` path or args differ.\n   - Enable or disable the optional rewriter as needed.\n4) Run: `vibe-speech --config config.yaml serve` (use `--dry-run` to log without speaking).\n5) Hold `ctrl+shift+space` (default) while speaking; release to transcribe, send to the assistant, and hear the reply. The spinner shows work in progress; logs include timing and the raw and final text.\n\nVirtualenv activation (if you created `.venv` above):\n- macOS/Linux (bash/zsh): `source .venv/bin/activate`\n- Windows PowerShell: `.venv\\Scripts\\Activate.ps1`\n- Windows cmd: `.venv\\Scripts\\activate.bat`\n\n### Prompt shape (with history)\n```\nassistant.system_prompt\nPersonality: \u003cassistant.personality\u003e\n\n[last N user/assistant turns, up to assistant.history_length]\nUser: \u003cprevious user\u003e\nAssistant: \u003cprevious reply\u003e\n\nUser: \u003ccurrent user text\u003e\nAssistant:\n```\n\n## Debug logging\n- CLI flags: `vibe-speech --log-level DEBUG --config config.yaml serve` (or `python -m vibe_speech.cli --log-level DEBUG --config config.yaml serve`).\n- Config file: set `log_level: DEBUG` in `config.yaml` and run normally.\n\n## Platform notes\n- macOS: grant Accessibility permission to your terminal/editor so typing works, and grant mic permission to the terminal. 
Tail padding helps avoid clipping; adjust it in config.\n- Windows/Linux: relies on `pyautogui` for typing; focus targeting is not yet implemented.\n\n## Troubleshooting\n- Mic errors (AUHAL -50, etc.): set `audio.device_name` to a valid input from `sounddevice.query_devices()`, and ensure mic permission is granted.\n- Model downloads: set `whisper.offline: false` for the first run so the model is cached; then flip it to `true` for offline use.\n- Accuracy vs speed: use a smaller model and a beam size of 1–3 for speed; use a larger model and a beam size of 5 for accuracy.\n\n## Notes\n- Streaming/partials are not implemented; the app buffers until hotkey release (with optional tail capture).\n- The rewriter can change phrasing; set `processing.mode: raw` and `rewriter.enabled: false` for unaltered Whisper output.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhackjutsu%2Fvibe-speech","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhackjutsu%2Fvibe-speech","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhackjutsu%2Fvibe-speech/lists"}