{"id":47699038,"url":"https://github.com/mcerqua/openvoiceui","last_synced_at":"2026-05-30T17:00:45.792Z","repository":{"id":340878056,"uuid":"1166415977","full_name":"MCERQUA/OpenVoiceUI","owner":"MCERQUA","description":"Voice-powered AI assistant platform — connect any LLM, any TTS, with a live web canvas, music generation, and agent orchestration using openclaw. Install: npx openvoiceui setup","archived":false,"fork":false,"pushed_at":"2026-05-23T00:38:05.000Z","size":10675,"stargazers_count":51,"open_issues_count":21,"forks_count":10,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-05-26T02:47:33.388Z","etag":null,"topics":["ai-agent","ai-assistant","ai-canvas","ai-voice","docker","flask","llm","music-generation","openclaw","pinokio","self-hosted","speech-to-text","stt","text-to-speech","tts","voice-agent","voice-ai","voice-assistant","voice-interface"],"latest_commit_sha":null,"homepage":"https://openvoiceui.com","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MCERQUA.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-25T07:44:53.000Z","updated_at":"2026-05-19T04:53:55.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/MCERQUA/OpenVoiceUI","commit_stats":null,"previous_names":["mcerqua/openvoiceui-public"],"tags_count":14,"template":false,"template_full_name":null,"purl":"pkg:github/MCERQUA/OpenVoiceUI","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MCERQUA%2FOpenVoiceUI","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MCERQUA%2FOpenVoiceUI/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MCERQUA%2FOpenVoiceUI/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MCERQUA%2FOpenVoiceUI/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MCERQUA","download_url":"https://codeload.github.com/MCERQUA/OpenVoiceUI/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MCERQUA%2FOpenVoiceUI/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33700863,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-30T02:00:06.278Z","response_time":92,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agent","ai-assistant","ai-canvas","ai-voice","docker","flask","llm","music-generation","openclaw","pinokio","self-hosted","speech-to-text","stt","text-to-speech","tts","voice-agent","voice-ai","voice-assistant","voice-interface"],"created_at":"2026-04-02T17:00:44.015Z","updated_at":"2026-05-30T17:00:45.786Z","avatar_url":"https://github.com/MCERQUA.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/banner.jpg\" alt=\"OpenVoiceUI Banner\" width=\"100%\" /\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003eOpenVoiceUI\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\u003cstrong\u003eThe open-source voice AI that actually does work.\u003c/strong\u003e\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://www.npmjs.com/package/openvoiceui\"\u003e\u003cimg src=\"https://img.shields.io/npm/v/openvoiceui?style=flat-square\u0026color=3b82f6\" alt=\"npm version\" /\u003e\u003c/a\u003e\n  \u003ca href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-MIT-blue?style=flat-square\" alt=\"MIT License\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/MCERQUA/OpenVoiceUI/stargazers\"\u003e\u003cimg src=\"https://img.shields.io/github/stars/MCERQUA/OpenVoiceUI?style=flat-square\u0026color=06b6d4\" alt=\"GitHub Stars\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://openvoiceui.com\"\u003e\u003cimg src=\"https://img.shields.io/badge/website-openvoiceui.com-0f172a?style=flat-square\" alt=\"Website\" /\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  Install, open \u003ccode\u003elocalhost:5001\u003c/code\u003e, say \u003cem\u003e\"build me a dashboard\"\u003c/em\u003e, and watch it render live.\n\u003c/p\u003e\n\n---\n\n\u003e **[Watch the demo](https://openvoiceui.com)** -- see voice-to-canvas in action\n\n---\n\n## Install\n\n**Prerequisite: [Docker](https://docs.docker.com/get-docker/) must be installed and running for all install methods.**\n\n### Pinokio (one-click)\n\nDownload [Pinokio](https://pinokio.co) if you don't have it, then search **\"OpenVoiceUI\"** in the app store and click **Install**.\n\n### npm\n\n```bash\nnpx openvoiceui setup     # interactive wizard — walks you through API keys + builds Docker images\nnpx openvoiceui start     # starts everything\n```\n\n### Docker\n\n```bash\ngit clone https://github.com/MCERQUA/OpenVoiceUI.git\ncd OpenVoiceUI\ncp .env.example .env        # edit with your API keys\ndocker compose up\n```\n\nOpen **localhost:5001** and start talking.\n\n---\n\n## What is OpenVoiceUI?\n\nOpenVoiceUI is a hands-free, AI-controlled computer. You talk — it builds. Live web apps, dashboards, games, full websites — rendered in real time while you watch. No mouse, no keyboard, no typing prompts into a chat box.\n\nIt runs on [OpenClaw](https://openclaw.org) and works with any LLM. The AI agent can build and display apps mid-conversation, switch between projects with a voice command, generate music on the fly, delegate work to parallel sub-agents, and remember everything across sessions. It uses any [Claude Code](https://docs.anthropic.com/en/docs/claude-code) or [OpenClaw](https://openclaw.org) skill — and the community can build and share more through the plugin system.\n\nSelf-hosted. Your hardware, your data. MIT licensed, forever free.\n\n## Core Features\n\n- **Hands-Free AI Computer** — Talk and watch it work. The AI builds apps, switches between projects, runs tasks, and displays results on a live visual canvas — all without touching a mouse or keyboard.\n- **Live Canvas** — AI renders real HTML pages mid-conversation: dashboards, tools, galleries, reports, full web apps. Not text responses — real interactive pages you can use.\n- **AI Music Generation** — Generate songs on the fly with your voice using Suno. Full music player with playlist management built in.\n- **Custom Animated Interface** — Choose from animated face modes (eye-face avatar, reactive halo-smoke orb) or install community-built faces through plugins. Build your own — the face system is fully extensible.\n- **Sub-Agents** — Delegate multiple tasks to parallel AI workers simultaneously and get results back.\n- **Long-Term Memory** — Optional context engine plugin curates knowledge every turn. Persists across sessions in human-readable markdown.\n- **Desktop OS Interface** — Themed desktop environment with window management (Windows XP, macOS, Ubuntu, Win95, Win 3.1).\n- **Admin Dashboard** — Mobile-responsive. Agent profiles, provider config, workspace file browser, plugin management, system health. Everything editable live.\n- **Self-Hosted** — Your hardware, your data. No vendor lock-in, no monthly fees.\n\n## And More\n\n- Image generation (FLUX.1, Stable Diffusion 3.5)\n- Video creation (Remotion Studio)\n- Voice cloning (Qwen3-TTS via fal.ai)\n- Cron jobs for scheduled automation\n- File explorer with drag-and-drop\n- Agent profiles — switch personas, voices, and LLM providers from the admin panel\n\n---\n\n## Install Details\n\n### Option 1: Pinokio (one-click)\n\n1. Install [Pinokio](https://pinokio.co) if you don't have it\n2. Search **\"OpenVoiceUI\"** in the Pinokio app store\n3. Click **Install**, then **Start**\n\nPinokio handles Docker, dependencies, and configuration automatically.\n\n### Option 2: npm\n\nRequires **Node.js 20+**, **Python 3.10+**, and **Docker**.\n\n```bash\nnpx openvoiceui setup     # interactive wizard — configures LLM, TTS, API keys, builds Docker images\nnpx openvoiceui start     # starts OpenClaw gateway + Supertonic TTS + voice UI\n```\n\nThe setup wizard walks you through choosing an LLM provider, TTS provider, and entering API keys. Configuration is saved to `.env` and `openclaw-data/`.\n\n```bash\nnpx openvoiceui stop      # stop all services\nnpx openvoiceui status    # check what's running\nnpx openvoiceui logs      # tail service logs\n```\n\n### Option 3: Docker\n\nRequires **Docker** and **Docker Compose**.\n\n```bash\ngit clone https://github.com/MCERQUA/OpenVoiceUI.git\ncd OpenVoiceUI\ncp .env.example .env\n```\n\nEdit `.env` with your API keys (at minimum: an LLM provider key and optionally a TTS key). Then:\n\n```bash\ndocker compose up -d\n```\n\nThis starts three containers:\n\n| Container | Port | Purpose |\n|-----------|------|---------|\n| `openclaw` | 18791 | LLM gateway — routes to your chosen LLM provider |\n| `supertonic` | (internal) | Free local TTS — no API key needed |\n| `openvoiceui` | 5001 | Voice UI + Canvas + Admin dashboard |\n\nOpen **http://localhost:5001** to use the voice interface, or **http://localhost:5001/admin** for the admin dashboard.\n\nTo stop: `docker compose down`\n\n### Option 4: VPS / Production\n\nFor running on an Ubuntu server with nginx and systemd:\n\n```bash\ngit clone https://github.com/MCERQUA/OpenVoiceUI.git\ncd OpenVoiceUI\ncp .env.example .env               # edit with your API keys\nsudo bash deploy/setup-sudo.sh     # creates dirs, installs systemd service\nbash deploy/setup-nginx.sh         # generates nginx config (edit domain)\n```\n\nSee [`deploy/`](deploy/) for the full production setup including SSL, nginx reverse proxy, and systemd service files.\n\n---\n\n## Configuration\n\nAll configuration is in `.env`. Copy `.env.example` to `.env` and fill in your values.\n\n**Required:**\n- An LLM provider API key (OpenAI, Anthropic, Groq, Z.AI, or any OpenClaw-compatible provider)\n- `CLAWDBOT_AUTH_TOKEN` — set during `npx openvoiceui setup` or in OpenClaw's setup wizard\n\n**Optional but recommended:**\n- `GROQ_API_KEY` — enables Groq Orpheus TTS (fast, high quality, free tier)\n- `SUNO_API_KEY` — enables AI music generation\n- `CLERK_PUBLISHABLE_KEY` — enables login/auth (for multi-user or public deployments)\n\nSee [`.env.example`](.env.example) for all available options with descriptions.\n\n---\n\n## Works With Any Provider\n\n**LLM**\n\n| Provider | Status |\n|----------|--------|\n| OpenClaw Gateway | Built-in — routes to OpenAI, Anthropic, Groq, Z.AI, and more |\n| Z.AI (GLM-5-turbo) | Built-in |\n| Groq (Llama, Qwen) | Via OpenClaw |\n| Google Gemini | Via OpenClaw |\n| MiniMax | Via OpenClaw |\n| Ollama (local) | Via adapter |\n| Any LLM | Drop-in gateway plugin |\n\n**Text-to-Speech**\n\n| Provider | Status |\n|----------|--------|\n| Supertonic (local) | Free, ships with Docker setup |\n| Groq Orpheus | Fast cloud TTS, free tier |\n| Resemble AI | Premium cloned voices |\n| Qwen3-TTS (fal.ai) | Voice cloning |\n| Hume EVI | Emotion-aware |\n| ElevenLabs | High quality, many voices |\n\n**Speech-to-Text**\n\n| Provider | Status |\n|----------|--------|\n| Web Speech API | Free, browser-native (default) |\n| Deepgram | Streaming, accurate |\n| Groq Whisper | Fast cloud transcription |\n\n---\n\n## Admin Dashboard\n\nAccess at **localhost:5001/admin**. Mobile-responsive.\n\n- **Profiles** — View and activate agent personas\n- **Agent Editor** — Edit name, voice, LLM provider, system prompt, features, and agent workspace files. 4 tabs: Profile, System Prompt, Features, Agent Files\n- **Plugins** — Install and manage face packs, gateways, and extensions\n- **Canvas Pages** — Toggle public/private, lock pages, delete with archive\n- **Workspace Files** — Browse and edit agent workspace. Audio playback, image preview built in.\n- **Music (Suno)** — View all generated songs, play inline, archive tracks\n- **Provider Config** — Select LLM, TTS, STT providers. Saves to active profile.\n- **Health and Stats** — CPU, RAM, disk, gateway status, session reset\n- **Connector Tests** — 12 automated endpoint diagnostics\n\n---\n\n## Use Cases\n\n**Small Business** — AI receptionist, appointment scheduler, report builder. Talk to your AI and get a live dashboard of today's leads, reviews, and tasks.\n\n**Digital Agencies** — Deploy custom AI assistants per client. Multi-tenant ready. Each client gets their own voice-powered workspace.\n\n**Developers** — Fork it, extend it, deploy it anywhere. MIT licensed. Build custom plugins, gateway adapters, and canvas pages on top of a voice-first platform.\n\n---\n\n## How It's Different\n\n| | OpenVoiceUI | Typical Voice AI |\n|---|---|---|\n| **Source** | Open source (MIT) | Closed source |\n| **Canvas UI** | Live HTML rendering | Text/audio only |\n| **Skills** | Any Claude Code or OpenClaw skill | API endpoints |\n| **Music** | AI music generation (Suno) | None |\n| **Memory** | Plugin-based long-term context | Session only |\n| **Admin** | Full dashboard, mobile-ready | Config files |\n| **Plugins** | Community face packs, pages, workflows | None |\n| **Hosting** | Self-hosted, your data | Vendor cloud only |\n| **Pricing** | Free forever | Per-minute billing |\n\n---\n\n## Tech Stack\n\n| Layer | Technology |\n|-------|-----------|\n| Backend | Python / Flask |\n| Frontend | Vanilla JS (ES modules, no framework) |\n| Canvas | Fullscreen iframe + SSE |\n| STT | Web Speech API, Deepgram, Groq Whisper |\n| TTS | Supertonic, Groq Orpheus, Resemble, Qwen3-TTS |\n| LLM | Any provider via OpenClaw gateway |\n| Memory | Context engine plugin (markdown knowledge base) |\n| Auth | Clerk (optional) |\n| Deploy | npm, Docker, Pinokio, VPS/systemd |\n\n---\n\n## Plugins\n\nOpenVoiceUI has a plugin system for community-built extensions. Plugins can include animated face packs, canvas pages, workflow dashboards, gateway adapters, or any combination.\n\n| Plugin | Type | Description |\n|--------|------|-------------|\n| [**BHB Animated Characters**](https://github.com/MCERQUA/openvoiceui-plugins) | Face Pack | Animated BigHead Billionaires character avatars with lip-sync, mood expressions, and show lore. By [BHaleyart](https://github.com/BHALEYART) |\n| [**Hermes Agent**](https://github.com/MCERQUA/openvoiceui-plugins/tree/main/hermes-agent) | Gateway | Self-improving AI agent ([Hermes v0.13.0 / `nousresearch/hermes-agent:v2026.5.7`](https://github.com/NousResearch/hermes-agent/releases/tag/v2026.5.7), pegged — never `:latest`) with auto-generated skills, deep memory search, autonomous tasks, multi-agent Kanban, goal-locking, video analysis, voice cloning. Adds OpenClaw+Hermes hybrid and Hermes-only modes |\n| **SEO Platform** | Canvas Page | Full SEO dashboard powered by DataForSEO — keyword research, rank tracking, backlink analysis, site audits, AI visibility, and local SEO |\n| **Twenty CRM** | Canvas Page | Connect to a Twenty CRM instance for contact, company, deal, and task management with embedded CRM view and setup wizard |\n\n**Build your own.** Face packs, canvas pages, workflow dashboards, gateway adapters ([template](plugins/README.md)), or STT/TTS adapters ([template](src/adapters/_template.js)). See the [plugins repo](https://github.com/MCERQUA/openvoiceui-plugins) for submission guidelines.\n\n---\n\n## Documentation\n\n- [Introduction](docs/intro.md)\n- [Environment Variables](.env.example)\n- [Plugin Development](plugins/README.md)\n- [Contributing](CONTRIBUTING.md)\n- [Website](https://openvoiceui.com)\n\n## Contributing\n\nWe welcome contributions — especially plugins. Build a face pack, a canvas page, a workflow dashboard, or a full extension and submit it to the [plugins repo](https://github.com/MCERQUA/openvoiceui-plugins). See [CONTRIBUTING.md](CONTRIBUTING.md) for code contribution guidelines and [openvoiceui.com](https://openvoiceui.com) for full documentation.\n\n## License\n\n[MIT](LICENSE)\n\n---\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://openvoiceui.com\"\u003eWebsite\u003c/a\u003e \u0026nbsp;\u0026middot;\u0026nbsp;\n  \u003ca href=\"https://github.com/MCERQUA/OpenVoiceUI\"\u003eGitHub\u003c/a\u003e \u0026nbsp;\u0026middot;\u0026nbsp;\n  \u003ca href=\"https://www.npmjs.com/package/openvoiceui\"\u003enpm\u003c/a\u003e \u0026nbsp;\u0026middot;\u0026nbsp;\n  \u003ca href=\"https://github.com/MCERQUA/openvoiceui-plugins\"\u003ePlugins\u003c/a\u003e\n\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmcerqua%2Fopenvoiceui","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmcerqua%2Fopenvoiceui","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmcerqua%2Fopenvoiceui/lists"}