{"id":50483299,"url":"https://github.com/gabrimatic/realtime-operator","last_synced_at":"2026-06-01T19:30:51.869Z","repository":{"id":357468537,"uuid":"1237100612","full_name":"gabrimatic/realtime-operator","owner":"gabrimatic","description":"Local system-control voice agent built with the OpenAI Realtime API","archived":false,"fork":false,"pushed_at":"2026-05-12T22:35:16.000Z","size":44,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-13T00:09:13.784Z","etag":null,"topics":["local-first","nodejs","openai","realtime-api","system-automation","voice-agent","webrtc"],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gabrimatic.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-12T21:55:31.000Z","updated_at":"2026-05-12T22:34:20.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/gabrimatic/realtime-operator","commit_stats":null,"previous_names":["gabrimatic/realtime-operator"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/gabrimatic/realtime-operator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gabrimatic%2Frealtime-operator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gabrimatic%2Frealtime-operator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gabrimatic%2Frealtime-operator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gabrimatic%2Frealtime-operator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gabrimatic","download_url":"https://codeload.github.com/gabrimatic/realtime-operator/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gabrimatic%2Frealtime-operator/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33790679,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-01T02:00:06.963Z","response_time":115,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["local-first","nodejs","openai","realtime-api","system-automation","voice-agent","webrtc"],"created_at":"2026-06-01T19:30:50.891Z","updated_at":"2026-06-01T19:30:51.860Z","avatar_url":"https://github.com/gabrimatic.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Realtime Operator\n\n[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)\n[![Runtime: Node 22+](https://img.shields.io/badge/node-22%2B-blue.svg)]()\n[![OpenAI Realtime](https://img.shields.io/badge/OpenAI-Realtime-green.svg)](https://platform.openai.com/docs/guides/realtime)\n[![Local tools](https://img.shields.io/badge/local-tools-lightgrey.svg)]()\n\nRealtime Operator is a local voice agent built on the OpenAI Realtime API.\n\nYou speak in the browser. The browser connects to a Realtime model over WebRTC. When the model needs to do something on your computer, it calls local function tools through the Node server: inspect the system, list files, search a project, read a safe text file, fetch a URL, run a bounded command, open a link, show a notification, or use the clipboard after confirmation.\n\nThe interesting part is the bridge.\n\nMost voice demos stop at conversation. This one gives the Realtime model hands, but keeps those hands local, narrow, logged, and approval-gated. The standard OpenAI API key stays on the server. The browser only gets a short-lived Realtime client secret.\n\nThe default model is `gpt-realtime-2`, with `marin` as the default voice. If your account uses a different Realtime model alias, change `realtime.model` in `config.json`.\n\n## Local Runtime\n\n| Path | Where it runs |\n|------|---------------|\n| Browser microphone and audio playback | Your browser |\n| Realtime voice conversation | OpenAI Realtime over WebRTC |\n| API key and client-secret minting | Local Node server |\n| Function tools | Local Node server |\n| Commands, files, clipboard, notifications | Your own machine |\n| Logs and transcripts | `~/.realtime-operator/` |\n\nThe server follows the OpenAI Realtime WebRTC pattern: the browser gets a short-lived client secret, opens a peer connection, and uses the `oai-events` data channel for conversation and function-call events.\n\n## At A Glance\n\n| Surface | What it does |\n|---------|--------------|\n| Voice UI | Starts a browser Realtime voice session, handles microphone setup, mute, disconnect, typed messages, and local status. |\n| Realtime session | Uses audio output, semantic VAD, input transcription, reasoning, and function tools. |\n| Local tools | System status, bounded commands, directory listing, file metadata, redacted file reads, file search, URL fetches, open URL/file, notifications, and clipboard. |\n| Safety layer | Token-backed local API, allowed roots, sensitive-path detection, output redaction, risky-action confirmation, bounded command timeouts, and local logs. |\n| Tests | Node syntax checks, API smoke test, Playwright UI smoke test, and a repository secret scan. |\n\n## Quick Start\n\nRequirements: Node.js 22+, an OpenAI API key, and a browser with microphone and WebRTC support.\n\n```bash\ngit clone https://github.com/gabrimatic/realtime-operator.git\ncd realtime-operator\nnpm install\ncp config.example.json config.json\n```\n\nPut your API key in the environment:\n\n```bash\nexport OPENAI_API_KEY=\"...\"\n```\n\nOr store it in the default local file:\n\n```bash\nmkdir -p ~/.realtime-operator\nprintf '%s\\n' \"...\" \u003e ~/.realtime-operator/openai-api-key.secret.local\nchmod 600 ~/.realtime-operator/openai-api-key.secret.local\n```\n\nStart the server:\n\n```bash\n./start.sh\n```\n\nOpen:\n\n```text\nhttp://127.0.0.1:49376/\n```\n\nThen press `Start Talking`.\n\n## Features\n\n- **Realtime voice agent**: browser microphone to OpenAI Realtime, with audio responses back through WebRTC.\n- **Local function tools**: the model can inspect and act through the local server when a spoken request needs real machine context.\n- **Bounded command runner**: supports exact `argv`, shell `command`, or multiline `script`, with timeouts and risky-action gates.\n- **File tools**: list directories, inspect metadata, read redacted text files, and search with ripgrep under configured allowed roots.\n- **Network helper**: fetch a bounded URL response for local API checks or public GET requests.\n- **Desktop actions**: open a URL/file, show macOS notifications, and read or set clipboard only after confirmation.\n- **Approval gates**: destructive, publishing, credential-related, service-changing, payment, messaging, and sensitive commands return `approval_required`.\n- **Redaction**: keys, bearer tokens, GitHub tokens, credentials, and sensitive output are redacted before logs or tool output.\n- **Phone pairing path**: private-network access is off by default. If enabled, pair with a local code instead of putting the standard API key in the browser.\n\n## Why Function Tools\n\nOpenAI Realtime supports function calling during a live conversation. The model emits tool-call arguments, the client executes custom code, then sends tool output back into the conversation and asks the model to respond.\n\nThat fits this project well because the machine-control part should stay local. Remote MCP is useful when the tool server is reachable from OpenAI. Local shell, files, clipboard, and desktop actions are private machine boundaries, so this project exposes them as Realtime function tools behind a local server and approval layer.\n\nRelevant OpenAI docs:\n\n- [Realtime API overview](https://platform.openai.com/docs/guides/realtime)\n- [Realtime API with WebRTC](https://platform.openai.com/docs/guides/realtime-webrtc)\n- [Realtime conversations and function calling](https://platform.openai.com/docs/guides/realtime-model-capabilities)\n- [Realtime client secrets](https://platform.openai.com/docs/api-reference/realtime-sessions/create-realtime-client-secret)\n\n## Configuration\n\nCopy `config.example.json` to `config.json`. The default local data directory is:\n\n```text\n~/.realtime-operator/\n```\n\nImportant settings:\n\n| Setting | Default | Notes |\n|---------|---------|-------|\n| `realtime.model` | `gpt-realtime-2` | Change this if your account uses another Realtime model alias. |\n| `realtime.voice` | `marin` | Any supported Realtime voice can be configured. |\n| `system.allowedRoots` | `[\"~\"]` | File and command working directories must stay inside these roots. |\n| `network.trustPrivateClients` | `false` | Keep false unless you understand the LAN trust tradeoff. |\n| `safety.requireConfirmationForRiskyTasks` | `true` | Keep true for real use. |\n| `logging.includeToolPayloads` | `false` | Leave false unless you need deeper debugging. |\n\nMore detail is in [docs/configuration.md](docs/configuration.md).\n\n## Safety Model\n\nRealtime Operator is powerful because it can touch the local machine. The safety model is intentionally practical:\n\n- The standard OpenAI API key never goes to the browser.\n- The local API is token-backed.\n- Private LAN clients are not trusted by default.\n- Tools can only work under configured allowed roots.\n- Sensitive-looking files require confirmation before reading.\n- Risky commands require a confirmation challenge before they run.\n- Command output and logs are redacted.\n- Commands are bounded by timeout and output size.\n- Clipboard tools require confirmation every time.\n\nThis is still a local system-control app. Read the code and configuration before pointing it at important directories.\n\n## Development\n\n```bash\nnpm install\nnpm test\n```\n\nIndividual checks:\n\n```bash\nnpm run check\nnpm run smoke\nnpm run ui:smoke\nnpm run secret:scan\n```\n\n`npm run ui:smoke` starts a throwaway local server with temp config and state, then runs Playwright against it. It does not use your real API key or local data directory.\n\n## Project Status\n\nThis is an early open-source release. The core path is intentionally small: Realtime voice, local tools, safety gates, and documentation that explains the architecture. The next useful work is packaging, richer tool presets, and a cleaner install flow.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgabrimatic%2Frealtime-operator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgabrimatic%2Frealtime-operator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgabrimatic%2Frealtime-operator/lists"}