{"id":48643916,"url":"https://github.com/RunanywhereAI/RCLI","last_synced_at":"2026-04-25T18:00:38.910Z","repository":{"id":342124513,"uuid":"1172340808","full_name":"RunanywhereAI/RCLI","owner":"RunanywhereAI","description":"Talk to your Mac, query your docs, no cloud required. On-device voice AI + RAG","archived":false,"fork":false,"pushed_at":"2026-03-15T19:39:16.000Z","size":9877,"stargazers_count":1187,"open_issues_count":7,"forks_count":58,"subscribers_count":3,"default_branch":"main","last_synced_at":"2026-03-15T19:58:13.314Z","etag":null,"topics":["ai-assistant","apple-silicon","kitten-tts","kokoro-tts","lfm2","llama-cpp","llm","local-ai","metal","on-device-ai","parakeet","qwen3","rag","speech-to-text","text-to-speech","tool-calling","voice-assistant"],"latest_commit_sha":null,"homepage":"https://github.com/RunanywhereAI/runanywhere-sdks","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RunanywhereAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-04T07:41:17.000Z","updated_at":"2026-03-15T18:47:32.000Z","dependencies_parsed_at":"2026-03-15T10:04:04.471Z","dependency_job_id":null,"html_url":"https://github.com/RunanywhereAI/RCLI","commit_stats":null,"previous_names":["runanywhereai/rcli"],"tags_count":23,"template":false,"template_full_name":null,"purl":"pkg:github/RunanywhereAI/RCLI","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RunanywhereAI%2FRCLI","tags_u
rl":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RunanywhereAI%2FRCLI/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RunanywhereAI%2FRCLI/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RunanywhereAI%2FRCLI/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RunanywhereAI","download_url":"https://codeload.github.com/RunanywhereAI/RCLI/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RunanywhereAI%2FRCLI/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32271243,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-25T09:15:33.318Z","status":"ssl_error","status_checked_at":"2026-04-25T09:15:31.997Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-assistant","apple-silicon","kitten-tts","kokoro-tts","lfm2","llama-cpp","llm","local-ai","metal","on-device-ai","parakeet","qwen3","rag","speech-to-text","text-to-speech","tool-calling","voice-assistant"],"created_at":"2026-04-10T00:00:41.116Z","updated_at":"2026-04-25T18:00:38.904Z","avatar_url":"https://github.com/RunanywhereAI.png","language":"C++","funding_links":[],"categories":["C++"],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/rcli_waveform.gif\" alt=\"RCLI Waveform\" 
width=\"700\" /\u003e\n  \u003cbr\u003e\n  \u003cstrong\u003eTalk to your Mac, query your docs, no cloud required.\u003c/strong\u003e\n  \u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://github.com/RunanywhereAI/RCLI\"\u003e\u003cimg src=\"https://img.shields.io/badge/platform-macOS-blue\" alt=\"macOS\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/RunanywhereAI/RCLI\"\u003e\u003cimg src=\"https://img.shields.io/badge/chip-Apple_Silicon-black\" alt=\"Apple Silicon\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/RunanywhereAI/RCLI\"\u003e\u003cimg src=\"https://img.shields.io/badge/inference-100%25_local-green\" alt=\"Local\"\u003e\u003c/a\u003e\n  \u003ca href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-MIT-blue\" alt=\"MIT\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n**RCLI** is an on-device voice AI for macOS. A complete STT + LLM + TTS + VLM pipeline running natively on Apple Silicon — 40 macOS actions via voice, local RAG over your documents, on-device vision (camera \u0026 screen analysis), sub-200ms end-to-end latency. 
No cloud, no API keys.\n\nPowered by [MetalRT](#metalrt-gpu-engine), a proprietary GPU inference engine built by [RunAnywhere, Inc.](https://runanywhere.ai) specifically for Apple Silicon.\n\n## Demo\n\n\u003e Real-time screen recordings on Apple Silicon — no cloud, no edits, no tricks.\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003ctd width=\"50%\" align=\"center\"\u003e\n\u003cstrong\u003eVoice Conversation\u003c/strong\u003e\u003cbr\u003e\n\u003cem\u003eTalk naturally — RCLI listens, understands, and responds on-device.\u003c/em\u003e\u003cbr\u003e\u003cbr\u003e\n\u003ca href=\"https://youtu.be/qeardCENcV0\"\u003e\n\u003cimg src=\"assets/demos/demo1-voice-conversation.gif\" alt=\"Voice Conversation Demo\" width=\"100%\"\u003e\n\u003c/a\u003e\n\u003cbr\u003e\u003csub\u003eClick for full video with audio\u003c/sub\u003e\n\u003c/td\u003e\n\u003ctd width=\"50%\" align=\"center\"\u003e\n\u003cstrong\u003eApp Control\u003c/strong\u003e\u003cbr\u003e\n\u003cem\u003eControl Spotify, adjust volume — 40 macOS actions by voice.\u003c/em\u003e\u003cbr\u003e\u003cbr\u003e\n\u003ca href=\"https://youtu.be/eTYwkgNoaKg\"\u003e\n\u003cimg src=\"assets/demos/demo2-spotify-volume.gif\" alt=\"App Control Demo\" width=\"100%\"\u003e\n\u003c/a\u003e\n\u003cbr\u003e\u003csub\u003eClick for full video with audio\u003c/sub\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd width=\"50%\" align=\"center\"\u003e\n\u003cstrong\u003eModels\u003c/strong\u003e\u003cbr\u003e\n\u003cem\u003eBrowse models, hot-swap LLMs — all from the TUI.\u003c/em\u003e\u003cbr\u003e\u003cbr\u003e\n\u003ca href=\"https://youtu.be/HD1aS37zIGE\"\u003e\n\u003cimg src=\"assets/demos/demo3-benchmarks.gif\" alt=\"Models \u0026 Benchmarks Demo\" width=\"100%\"\u003e\n\u003c/a\u003e\n\u003cbr\u003e\u003csub\u003eClick for full video with audio\u003c/sub\u003e\n\u003c/td\u003e\n\u003ctd width=\"50%\" align=\"center\"\u003e\n\u003cstrong\u003eDocument Intelligence 
(RAG)\u003c/strong\u003e\u003cbr\u003e\n\u003cem\u003eIngest docs, ask questions by voice — ~4ms hybrid retrieval.\u003c/em\u003e\u003cbr\u003e\u003cbr\u003e\n\u003ca href=\"https://youtu.be/8FEfbwS7cQ8\"\u003e\n\u003cimg src=\"assets/demos/demo4-rag-documents.gif\" alt=\"RAG Demo\" width=\"100%\"\u003e\n\u003c/a\u003e\n\u003cbr\u003e\u003csub\u003eClick for full video with audio\u003c/sub\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n## Install\n\n\u003e [IMPORTANT]\n\u003e **Requires macOS 13+ on Apple Silicon. MetalRT engine requires M3 or later.** M1/M2 Macs fall back to llama.cpp automatically.\n\n**One command:**\n\n```bash\ncurl -fsSL https://raw.githubusercontent.com/RunanywhereAI/RCLI/main/install.sh | bash\n```\n\n**Or via Homebrew:**\n\n```bash\nbrew tap RunanywhereAI/rcli https://github.com/RunanywhereAI/RCLI.git\nbrew install rcli\nrcli setup          # required — downloads AI models (~1GB, one-time)\n```\n\n**Upgrade to latest:**\n\n```bash\nbrew update\nbrew upgrade rcli\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eTroubleshooting: SHA256 mismatch or stale version\u003c/strong\u003e\u003c/summary\u003e\n\nIf `brew install` or `brew upgrade` fails with a checksum error:\n\n```bash\n# Force-refresh the tap to pick up the latest formula\ncd $(brew --repo RunanywhereAI/rcli) \u0026\u0026 git fetch origin \u0026\u0026 git reset --hard origin/main\nbrew reinstall rcli\n```\n\nIf that doesn't work, clean re-tap and clear the download cache:\n\n```bash\nbrew untap RunanywhereAI/rcli\nrm -rf \"$(brew --cache)/downloads/\"*rcli*\nbrew tap RunanywhereAI/rcli https://github.com/RunanywhereAI/RCLI.git\nbrew install rcli\nrcli setup\n```\n\n\u003c/details\u003e\n\n## Quick Start\n\n```bash\nrcli                             # interactive TUI (push-to-talk + text)\nrcli listen                      # continuous voice mode\nrcli ask \"open Safari\"           # one-shot command\nrcli ask \"play some jazz on Spotify\"\nrcli vlm 
photo.jpg \"what's in this image?\"  # vision analysis\nrcli camera                      # live camera VLM\nrcli screen                      # screen capture VLM\nrcli metalrt                     # MetalRT GPU engine management\nrcli llamacpp                    # llama.cpp engine management\n```\n\n\n## Benchmarks\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/decode-vs-llamacpp.webp\" alt=\"MetalRT vs llama.cpp decode speed\" width=\"700\" /\u003e\n  \u003cbr\u003e\n  \u003cem\u003eMetalRT decode throughput vs llama.cpp and Apple MLX on Apple M3 Max\u003c/em\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/rtf_comparison.webp\" alt=\"STT and TTS real-time factor comparison\" width=\"700\" /\u003e\n  \u003cbr\u003e\n  \u003cem\u003eSTT and TTS real-time factor — lower is better. MetalRT STT is 714x faster than real-time.\u003c/em\u003e\n\u003c/p\u003e\n\nFor more information:\n\n- https://www.runanywhere.ai/blog/metalrt-fastest-llm-decode-engine-apple-silicon\n- https://www.runanywhere.ai/blog/metalrt-speech-fastest-stt-tts-apple-silicon\n- https://www.runanywhere.ai/blog/fastvoice-on-device-voice-ai-pipeline-apple-silicon\n\n## Features\n\n### Voice Pipeline\n\nA full STT + LLM + TTS pipeline running on Metal GPU with three concurrent threads:\n\n- **VAD** — Silero voice activity detection\n- **STT** — Zipformer streaming + Whisper / Parakeet offline\n- **LLM** — Qwen3 / LFM2 / Qwen3.5 with KV cache continuation and Flash Attention\n- **TTS** — Double-buffered sentence-level synthesis (next sentence renders while current plays)\n- **Tool Calling** — LLM-native tool call formats (Qwen3, LFM2, etc.)\n- **Multi-turn Memory** — Sliding window conversation history with token-budget trimming\n\n### Vision (VLM)\n\nAnalyze images, camera captures, and screen regions using on-device vision-language models. 
VLM runs on the llama.cpp engine via Metal GPU — no cloud.\n\n- **Image Analysis** — `rcli vlm photo.jpg \"describe this\"` for single-image queries\n- **Camera** — Press **V** in the TUI or run `rcli camera` for live camera analysis\n- **Screen Capture** — Press **S** in the TUI or run `rcli screen` to analyze screen regions\n- **Models** — Qwen3 VL 2B, Liquid LFM2 VL 1.6B, SmolVLM 500M — download on demand via `rcli models vlm`\n\n\u003e **Note:** VLM is currently available on the llama.cpp engine. MetalRT VLM support is coming soon.\n\n### 40 macOS Actions\n\nControl your Mac by voice or text. The LLM routes intent to actions executed locally via AppleScript and shell commands.\n\n| Category | Examples |\n|----------|---------|\n| **Productivity** | `create_note`, `create_reminder`, `run_shortcut` |\n| **Communication** | `send_message`, `facetime_call` |\n| **Media** | `play_on_spotify`, `play_apple_music`, `play_pause`, `next_track`, `set_music_volume` |\n| **System** | `open_app`, `quit_app`, `set_volume`, `toggle_dark_mode`, `screenshot`, `lock_screen` |\n| **Web** | `search_web`, `search_youtube`, `open_url`, `open_maps` |\n\nRun `rcli actions` to see all 40, or toggle them on/off in the TUI Actions panel.\n\n\u003e **Tip:** If tool calling feels unreliable, press **X** in the TUI to clear the conversation and reset context. With small LLMs, accumulated context can degrade tool-calling accuracy — a fresh context often fixes it.\n\n### RAG (Local Document Q\u0026A)\n\nIndex local documents, query them by voice. Hybrid vector + BM25 retrieval with ~4ms latency over 5K+ chunks. 
Supports PDF, DOCX, and plain text.\n\n```bash\nrcli rag ingest ~/Documents/notes\nrcli ask --rag ~/Library/RCLI/index \"summarize the project plan\"\n```\n\n### Interactive TUI\n\nA terminal dashboard with push-to-talk, live hardware monitoring, model management, and an actions browser.\n\n| Key | Action |\n|-----|--------|\n| **SPACE** | Push-to-talk |\n| **V** | Camera — capture and analyze with VLM |\n| **S** | Screen — capture and analyze a screen region with VLM |\n| **M** | Models — browse, download, hot-swap LLM/STT/TTS/VLM |\n| **A** | Actions — browse, enable/disable macOS actions |\n| **R** | RAG — ingest documents |\n| **X** | Clear conversation and reset context |\n| **T** | Toggle tool call trace |\n| **ESC** | Stop / close / quit |\n\n## MetalRT GPU Engine\n\nMetalRT is a high-performance GPU inference engine built by [RunAnywhere, Inc.](https://runanywhere.ai) specifically for Apple Silicon. It delivers the fastest on-device inference for LLM, STT, and TTS — up to **550 tok/s** LLM throughput and sub-200ms end-to-end voice latency.\n\n\u003e **Apple M3 or later required.** MetalRT uses Metal 3.1 GPU features available on M3, M3 Pro, M3 Max, M4, and later chips. M1/M2 support is coming soon. On M1/M2, RCLI automatically falls back to the open-source llama.cpp engine.\n\nMetalRT is automatically installed during `rcli setup` (choose \"MetalRT\" or \"Both\"). Or install separately:\n\n```bash\nrcli metalrt install\nrcli metalrt status\n```\n\n**Supported models:** Qwen3 0.6B, Qwen3 4B, Llama 3.2 3B, LFM2.5 1.2B (LLM) · Whisper Tiny/Small/Medium (STT) · Kokoro 82M with 28 voices (TTS)\n\nMetalRT is distributed under a [proprietary license](https://github.com/RunanywhereAI/metalrt-binaries/blob/main/LICENSE). For licensing inquiries: founder@runanywhere.ai\n\n## Supported Models\n\nRCLI supports 20+ models across LLM, STT, TTS, VLM, VAD, and embeddings. All run locally on Apple Silicon. 
Use `rcli models` to browse, download, or switch.\n\n**LLM:** LFM2 1.2B (default), LFM2 350M, LFM2.5 1.2B, LFM2 2.6B, Qwen3 0.6B, Qwen3.5 0.8B/2B/4B, Qwen3 4B\n\n**STT:** Zipformer (streaming), Whisper base.en (offline, default), Parakeet TDT 0.6B (~1.9% WER)\n\n**TTS:** Piper Lessac/Amy, KittenTTS Nano, Matcha LJSpeech, Kokoro English/Multi-lang\n\n**VLM:** Qwen3 VL 2B, Liquid LFM2 VL 1.6B, SmolVLM 500M — on-demand download via `rcli models vlm` (llama.cpp engine only)\n\n**Default install** (`rcli setup`): ~1GB — LFM2 1.2B + Whisper + Piper + Silero VAD + Snowflake embeddings. VLM models are downloaded on demand.\n\n```bash\nrcli models                  # interactive model management\nrcli models vlm              # download/manage VLM models\nrcli upgrade-llm             # guided LLM upgrade\nrcli voices                  # browse and switch TTS voices\nrcli cleanup                 # remove unused models\n```\n\n## Build from Source\n\nCPU-only build using llama.cpp + sherpa-onnx (no MetalRT):\n\n```bash\ngit clone https://github.com/RunanywhereAI/RCLI.git \u0026\u0026 cd RCLI\nbash scripts/setup.sh\nbash scripts/download_models.sh\nmkdir -p build \u0026\u0026 cd build\ncmake .. -DCMAKE_BUILD_TYPE=Release\ncmake --build . -j$(sysctl -n hw.ncpu)\n./rcli\n```\n\nAll dependencies are vendored or CMake-fetched. 
Requires CMake 3.15+ and Apple Clang (C++17).\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eCLI Reference\u003c/strong\u003e\u003c/summary\u003e\n\n```\nrcli                          Interactive TUI (push-to-talk + text + trace)\nrcli listen                   Continuous voice mode\nrcli ask \u003ctext\u003e               One-shot text command\nrcli vlm \u003cimage\u003e [prompt]     Analyze an image with VLM\nrcli camera [prompt]          Live camera capture + VLM analysis\nrcli screen [prompt]          Screen capture + VLM analysis\nrcli actions [name]           List actions or show detail\nrcli rag ingest \u003cdir\u003e         Index documents for RAG\nrcli rag query \u003ctext\u003e         Query indexed documents\nrcli models [llm|stt|tts|vlm] Manage AI models\nrcli voices                   Manage TTS voices\nrcli metalrt                  MetalRT GPU engine management\nrcli llamacpp                 llama.cpp engine management\nrcli setup                    Download default models\nrcli info                     Show engine and model info\n\nOptions:\n  --models \u003cdir\u003e      Models directory (default: ~/Library/RCLI/models)\n  --rag \u003cindex\u003e       Load RAG index for document-grounded answers\n  --gpu-layers \u003cn\u003e    GPU layers for LLM (default: 99 = all)\n  --ctx-size \u003cn\u003e      LLM context size (default: 4096)\n  --no-speak          Text output only (no TTS)\n  --verbose, -v       Debug logs\n```\n\n\u003c/details\u003e\n\n## Contributing\n\nContributions welcome. 
See [CONTRIBUTING.md](CONTRIBUTING.md) for build instructions and how to add new actions, models, or voices.\n\n## License\n\nRCLI is open source under the [MIT License](LICENSE).\n\nMetalRT is proprietary software by [RunAnywhere, Inc.](https://runanywhere.ai), distributed under a separate [license](https://github.com/RunanywhereAI/metalrt-binaries/blob/main/LICENSE).\n\n\u003cp align=\"center\"\u003e\n  Built by \u003ca href=\"https://www.runanywhere.ai\"\u003eRunAnywhere, Inc.\u003c/a\u003e\n\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FRunanywhereAI%2FRCLI","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FRunanywhereAI%2FRCLI","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FRunanywhereAI%2FRCLI/lists"}