{"id":51083669,"url":"https://github.com/sumitaich1998/jarvisvr","last_synced_at":"2026-06-23T20:32:31.866Z","repository":{"id":363837281,"uuid":"1264841898","full_name":"sumitaich1998/jarvisvr","owner":"sumitaich1998","description":"An AI agentic operating system for mixed reality on the Meta Quest 3 — your own J.A.R.V.I.S. Multi-agent orchestration, multimodal perception (sight/hearing/gaze), 42 holographic widgets, 20 LLM providers.","archived":false,"fork":false,"pushed_at":"2026-06-10T14:20:41.000Z","size":7475,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-06-10T16:13:36.988Z","etag":null,"topics":["agent-skills","agentic-ai","ai-agent","holographic-ui","jarvis","llm","meta-quest-3","mixed-reality","multi-agent-systems","multimodal","openxr","passthrough","unity","voice-assistant"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sumitaich1998.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":null},"created_at":"2026-06-10T08:22:01.000Z","updated_at":"2026-06-10T14:33:26.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/sumitaich1998/jarvisvr","commit_stats":null,"previous_names":["sumitaich1998/jarvisvr"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github
/sumitaich1998/jarvisvr","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sumitaich1998%2Fjarvisvr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sumitaich1998%2Fjarvisvr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sumitaich1998%2Fjarvisvr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sumitaich1998%2Fjarvisvr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sumitaich1998","download_url":"https://codeload.github.com/sumitaich1998/jarvisvr/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sumitaich1998%2Fjarvisvr/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34706579,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-23T02:00:07.161Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-skills","agentic-ai","ai-agent","holographic-ui","jarvis","llm","meta-quest-3","mixed-reality","multi-agent-systems","multimodal","openxr","passthrough","unity","voice-assistant"],"created_at":"2026-06-23T20:32:30.036Z","updated_at":"2026-06-23T20:32:31.853Z","avatar_url":"https://github.com/sumitaich1998.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\n\n\n# 
JarvisVR\n\n### Your own **J.A.R.V.I.S.** — an AI agentic operating system for mixed reality, running on the Meta Quest 3 VR headset.\n\n*Speak to the room around you. An LLM agent plans, calls tools, and materializes interactive 3D holograms you can grab with your hands — and it can **see** and **hear** your real world in real time.*\n\n[CI](https://github.com/sumitaich1998/jarvisvr/actions/workflows/ci.yml)\n[License: MIT](./LICENSE)\n[PRs Welcome](./CONTRIBUTING.md)\n[Protocol](./docs/PROTOCOL.md)\n[Python](./agent-backend)\n[Unity](./unity-client)\n[Meta Quest 3](./unity-client)\n\n[Stars](https://github.com/sumitaich1998/jarvisvr/stargazers)\n[Forks](https://github.com/sumitaich1998/jarvisvr/network/members)\n\n**[Features](#-highlights) · [Architecture](#%EF%B8%8F-architecture) · [Quickstart](#-quickstart) · [Docs](#-documentation) · [Contributing](#-contributing)**\n\n---\n\n## What is JarvisVR?\n\nJarvisVR turns the space around you into a programmable, conversational computer. Say *\"Jarvis, show\nme the weather in Tokyo and start a 5-minute timer\"* and a planning **LLM agent** decides what to do,\ncalls tools, and spawns **interactive holograms** into your room (Quest 3 passthrough mixed reality)\nthat you can grab, press, resize, and arrange with your hands.\n\nIt doesn't just listen for commands — it **perceives**. 
JarvisVR streams the Quest 3 **color\npassthrough camera**, **ambient room audio**, and **gaze** to a vision-capable agent, so you can ask\nabout whatever you're looking at or hearing: *\"what is this on my desk?\"*, *\"read this sign and\ntranslate it\"*, *\"what was that sound?\"* — realtime multimodal input (**sight + sound + gaze +\nvoice**).\n\n```\n        🗣️  \"Jarvis, ...\"            🧠 plan + tools            ✋ interact\n   You ───────────────▶ Voice ──▶ Agent Backend ──▶ Holograms ──▶ You\n                         (STT)      (LLM + tools)     (Quest 3 MR)\n   You ◀─────────────── Voice ◀──  (TTS speech)  ◀── render cmds\n        👁️  sees · 👂 hears · 🎯 gaze  ──▶  realtime multimodal perception\n```\n\n\u003e [!NOTE]\n\u003e **Early, honest scaffold.** Every component runs **fully offline** out of the box via deterministic\n\u003e mock providers, and all of them talk over one versioned WebSocket protocol. The Unity client is a\n\u003e complete project you build locally with Unity + the Meta XR SDK — there is **no prebuilt APK**, and\n\u003e the UI imagery below is a **concept mockup**.\n\n## ✨ Highlights\n\n- 👁️ **Color-passthrough vision** — streams the Quest 3 forward RGB camera (Meta Passthrough Camera API) to a vision LLM, pull-based at 1–3 fps.\n- 👂 **Ambient hearing + sound events** — continuous room-audio understanding (overheard speech, soundscape) and event detection (doorbell, alarm, glass break…) beyond the wake word.\n- 🎯 **Gaze \u0026 attention** — eye/head ray tells the agent *what* you're looking at.\n- 🧠 **Agentic tool-calling brain** — an LLM that *plans → calls tools → observes → responds*, not a fixed command parser.\n- 🪟 **42 holographic widgets** — weather orbs, 3D charts, model viewers, maps, smart-home panels, live captions, vision annotations, and more — all schema-validated.\n- 🔌 **20 LLM providers** — OpenAI, Anthropic, Gemini, Groq, local Ollama/LM Studio/vLLM, … with an **install-time key wizard** *and* **in-headset key 
config**.\n- 🎙️ **Voice + barge-in** — wake word \"Jarvis\", streaming STT, natural TTS, and interruptibility (talk over Jarvis to cancel a turn).\n- 🛜 **Offline mock mode** — the entire stack is demoable with **no API key and no hardware**.\n- 📜 **Open protocol** — a clean, versioned WebSocket contract (v1.1) with Python, C#, and TypeScript bindings, so you can swap the model, the voice, or even the rendering engine.\n\n## 🖼️ Gallery\n\n**Concept mockup** — holographic widgets anchored in a real room (illustrative, not a screenshot of a shipped build).\n\n## 🏗️ Architecture\n\nA headset **shell** (rendering + input) and an AI **brain** (reasoning) are fully decoupled across one\nversioned WebSocket protocol, so either side can be replaced.\n\n**Mermaid fallback (component map)**\n\n```mermaid\nflowchart LR\n    subgraph Headset[\"Meta Quest 3 · unity-client/\"]\n        PT[Passthrough MR + holograms]\n        IN[Hands · gaze · mic]\n        WC[WebSocket client]\n    end\n    subgraph Server[\"Host / cloud · Docker\"]\n        VS[voice-service/\u003cbr/\u003ewake word · STT · TTS · ambient hearing]\n        AB[agent-backend/\u003cbr/\u003eLLM agent · planner · tools · perception buffer]\n        HT[holo-tools/\u003cbr/\u003e42-widget catalog · tool schemas]\n    end\n    SP[(shared-protocol/\u003cbr/\u003ePy · C# · TS bindings)]\n\n    IN --\u003e WC\n    WC \u003c-- \"holo.* / agent.* / user.* / perception.*\" --\u003e AB\n    WC \u003c-- \"audio / vision\" --\u003e VS\n    VS \u003c--\u003e AB\n    AB \u003c--\u003e HT\n    SP -. defines messages .- WC\n    SP -. defines messages .- AB\n    SP -. defines messages .- VS\n```\n\nSee **[`ARCHITECTURE.md`](./ARCHITECTURE.md)** for the full system design and data flow.\n\n## 🚀 Quickstart\n\n\u003e Full setup lives in each component's README and in [`infra/`](./infra). 
The whole stack runs\n\u003e **offline** on the deterministic `mock` provider, so you can try it with no API key.\n\n### Option A — Install the brain with the key wizard\n\n```bash\ncd infra \u0026\u0026 make install\n```\n\nThis installs the `agent-backend` and runs an interactive wizard that lets you **pick a provider**\n(OpenAI, Anthropic, Gemini, Groq, Ollama/local, … or **`mock`** for fully offline) and **enter your\nAPI key** (masked input). The key is saved to `agent-backend/.env` (`chmod 600`) and **never\nprinted**. Then run it:\n\n```bash\ncd ../agent-backend \u0026\u0026 source .venv/bin/activate \u0026\u0026 python -m jarvis_backend\n# -\u003e ws://0.0.0.0:8765/jarvis   (offline on the mock provider if you skipped the key)\n```\n\nRe-run the wizard anytime with `jarvis-backend setup`; list providers with `jarvis-backend providers`.\n\n### Option B — Docker (mock brain, no key needed)\n\n```bash\ncd infra \u0026\u0026 docker compose up --build\n```\n\n### Then: open the Quest 3 client\n\nOpen [`unity-client/`](./unity-client) in **Unity 2022 LTS** with the **Meta XR SDK**, point its\n`JarvisConfig` at your backend host, and **Build \u0026 Run** to a Quest 3 (or test in the editor over\nQuest Link). Full steps in the [unity-client README](./unity-client/README.md).\n\n## 🔌 Supported LLM providers\n\nPick any provider at install time *or* hot-swap it from the in-headset Settings panel — keys are\nstored in `.env` (`chmod 600`) and never logged or echoed. Run `jarvis-backend providers` for the\nlive list.\n\n\n|     | Provider               | Notes                                                 |\n| --- | ---------------------- | ----------------------------------------------------- |\n| 🧪  | **Mock**               | Deterministic, offline, no key — the default.         |\n| 🟢  | **OpenAI**             | `gpt-4o`, `gpt-4o-mini`, `o4-mini` (native SDK).      |\n| 🟣  | **Anthropic (Claude)** | `claude-3.5-sonnet`, `claude-3.5-haiku` (native SDK). 
|\n| 🔵  | **Google Gemini**      | OpenAI-compatible endpoint.                           |\n| ☁️  | **Google Vertex AI**   | via LiteLLM (Google ADC).                             |\n| 🟦  | **Azure OpenAI**       | via LiteLLM (set base URL).                           |\n| 🟧  | **AWS Bedrock**        | via LiteLLM (AWS creds).                              |\n| 🟠  | **Mistral AI**         | OpenAI-compatible.                                    |\n| 🔶  | **Cohere**             | via LiteLLM.                                          |\n| ⚡   | **Groq**               | OpenAI-compatible, very fast.                         |\n| 🤝  | **Together AI**        | OpenAI-compatible.                                    |\n| 🧭  | **OpenRouter**         | OpenAI-compatible meta-router.                        |\n| 🐳  | **DeepSeek**           | OpenAI-compatible.                                    |\n| ✖️  | **xAI (Grok)**         | OpenAI-compatible (vision).                           |\n| 🔎  | **Perplexity**         | OpenAI-compatible (online).                           |\n| 🎆  | **Fireworks AI**       | OpenAI-compatible.                                    |\n| 🦙  | **Ollama (local)**     | Local, usually no key.                                |\n| 🖥️ | **LM Studio (local)**  | Local OpenAI-compatible server.                       |\n| 🚄  | **vLLM (self-hosted)** | Self-hosted OpenAI-compatible.                        |\n| 🛠️ | **Custom**             | Any OpenAI-compatible `base_url`.                     
|\n\nEach provider is reached natively, via a generic OpenAI-compatible client, or through the **LiteLLM** universal adapter.\n\n## 🗂️ Repository layout\n\n\n| Path                                    | Component                                              | Role                                              |\n| --------------------------------------- | ------------------------------------------------------ | ------------------------------------------------- |\n| [`unity-client/`](./unity-client)       | Quest 3 Unity MR app (the \"shell\")                     | Renders holograms, captures hands/gaze/camera/mic |\n| [`agent-backend/`](./agent-backend)     | LLM agent orchestration server                         | The \"brain\": plan → tools → perception → render   |\n| [`voice-service/`](./voice-service)     | Wake word + STT + TTS + ambient hearing                | Ears \u0026 mouth                                      |\n| [`holo-tools/`](./holo-tools)           | 42-widget catalog + agent tool schemas                 | What Jarvis can show                              |\n| [`shared-protocol/`](./shared-protocol) | Protocol bindings (Python / C# / TypeScript)           | One contract, three languages                     |\n| [`infra/`](./infra)                     | Docker Compose, dev scripts, mock backend, e2e harness | Glue \u0026 conformance                                |\n| [`docs/`](./docs)                       | Architecture, protocol, features, widget catalog       | Source-of-truth contracts                         |\n\n\n## 🗺️ Roadmap\n\nDriven by [`docs/FEATURES.md`](./docs/FEATURES.md) (priorities **P0** = now, **P1** = next, **P2** = later).\n\n- **✅ Now (P0)** — multimodal perception (vision · hearing · sound events · gaze), agentic planning + tool-calling, the 42-widget catalog, 20 LLM providers with install-time + in-headset key config, voice with barge-in, pull-based privacy-gated perception, and a fully offline mock for every capability.\n- **🔜 Next (P1)** 
— spatial + episodic memory (\"where did I leave my keys?\"), session persistence \u0026 multi-window spatial OS shell, more integrations (smart home, calendar, music, maps, web search, stocks/news), routines/macros, and real-time translation.\n- **🌟 Later (P2)** — multi-user shared sessions, device hand-off, speaker diarization \u0026 voice biometrics, generative 3D asset creation, and a phone-notification mirror.\n\n## 📚 Documentation\n\n- [Documentation hub](./docs/README.md) — **start here**: getting-started, concepts, component deep-dives, guides \u0026 reference.\n- [`ARCHITECTURE.md`](./ARCHITECTURE.md) — system design, components, and data flow.\n- [`docs/PROTOCOL.md`](./docs/PROTOCOL.md) — the WebSocket message contract (the source of truth every component conforms to).\n- [`docs/HOLO_TOOLS.md`](./docs/HOLO_TOOLS.md) — the holographic widget catalog \u0026 agent tool schemas.\n- [`docs/FEATURES.md`](./docs/FEATURES.md) — the full feature set \u0026 roadmap (incl. perception).\n- [`docs/PUBLISHING.md`](./docs/PUBLISHING.md) — the ship checklist for launching this repo publicly.\n- [`CHANGELOG.md`](./CHANGELOG.md) — notable changes, per release.\n\n## 🤝 Contributing\n\nContributions are very welcome! JarvisVR is built **protocol-first**: every change conforms to\n[`docs/PROTOCOL.md`](./docs/PROTOCOL.md), and the [`infra/`](./infra) e2e harness validates the whole\nstack against the mock backend.\n\n- 📖 Read **[`CONTRIBUTING.md`](./CONTRIBUTING.md)** for per-component dev setup, how to run each test suite, and the PR process.\n- 🐛 File a [bug report or feature request](https://github.com/sumitaich1998/jarvisvr/issues/new/choose).\n- 🔒 Found a vulnerability? 
See **[`SECURITY.md`](./SECURITY.md)**.\n- 🙌 Please be kind — we follow the **[Contributor Covenant](./CODE_OF_CONDUCT.md)**.\n\n## 🌐 Community \u0026 support\n\n- 💬 [GitHub Discussions](https://github.com/sumitaich1998/jarvisvr/discussions) — ideas, questions, show \u0026 tell.\n- 🐞 [Issues](https://github.com/sumitaich1998/jarvisvr/issues) — bugs and feature requests.\n- ⭐ Star the repo to follow along!\n\n## 📄 License\n\nReleased under the **[MIT License](./LICENSE)** © 2026 The JarvisVR contributors. Cite it via\n[`CITATION.cff`](./CITATION.cff).\n\n## ⚠️ Disclaimer\n\nJarvisVR is an independent project and is **not affiliated with, endorsed by, or sponsored by** Marvel,\nMeta, or any third party. \"J.A.R.V.I.S.\" is referenced only as cultural inspiration. Product names and\ntrademarks belong to their respective owners.\n\n---\n\n**If JarvisVR sparks your imagination, consider giving it a ⭐ — it genuinely helps.**\n\nBuilt in the open. [Come build the spatial AI future with us.](./CONTRIBUTING.md)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsumitaich1998%2Fjarvisvr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsumitaich1998%2Fjarvisvr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsumitaich1998%2Fjarvisvr/lists"}