{"id":47190363,"url":"https://github.com/huseyinstif/oculos","last_synced_at":"2026-03-13T10:35:51.689Z","repository":{"id":342892125,"uuid":"1171774239","full_name":"huseyinstif/oculos","owner":"huseyinstif","description":"If it's on the screen, it's an API. Control any desktop app via REST + MCP. Rust.","archived":false,"fork":false,"pushed_at":"2026-03-07T22:25:56.000Z","size":1621,"stargazers_count":40,"open_issues_count":1,"forks_count":4,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-03-08T02:38:22.830Z","etag":null,"topics":["accessibility","ai-agents","desktop-automation","mcp","oculos","rest-api","rust","ui-automation"],"latest_commit_sha":null,"homepage":"https://github.com/huseyinstif/oculos","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/huseyinstif.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-03-03T15:41:07.000Z","updated_at":"2026-03-08T02:01:25.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/huseyinstif/oculos","commit_stats":null,"previous_names":["huseyinstif/oculos"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/huseyinstif/oculos","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huseyinstif%2Foculos","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huseyinstif%2Foculos/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huseyinstif%2Foculos/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huseyinstif%2Foculos/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/huseyinstif","download_url":"https://codeload.github.com/huseyinstif/oculos/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huseyinstif%2Foculos/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30465479,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-13T06:34:02.089Z","status":"ssl_error","status_checked_at":"2026-03-13T06:33:49.182Z","response_time":60,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accessibility","ai-agents","desktop-automation","mcp","oculos","rest-api","rust","ui-automation"],"created_at":"2026-03-13T10:35:51.158Z","updated_at":"2026-03-13T10:35:51.670Z","avatar_url":"https://github.com/huseyinstif.png","language":"Rust","readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"static/logo.svg\" width=\"100\" alt=\"OculOS\" /\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003eOculOS\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003eIf it's on the screen, it's an API.\u003c/strong\u003e\u003cbr/\u003e\n  \u003csub\u003eControl any desktop app through JSON. REST API + MCP server. Single binary. Zero dependencies.\u003c/sub\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#quick-start\"\u003eQuick Start\u003c/a\u003e •\n  \u003ca href=\"#how-it-works\"\u003eHow It Works\u003c/a\u003e •\n  \u003ca href=\"#api\"\u003eAPI\u003c/a\u003e •\n  \u003ca href=\"#client-sdks\"\u003eSDKs\u003c/a\u003e •\n  \u003ca href=\"#mcp-setup\"\u003eMCP Setup\u003c/a\u003e •\n  \u003ca href=\"#dashboard\"\u003eDashboard\u003c/a\u003e •\n  \u003ca href=\"./examples\"\u003eExamples\u003c/a\u003e •\n  \u003ca href=\"./openapi.yaml\"\u003eAPI Spec\u003c/a\u003e •\n  \u003ca href=\"./CHANGELOG.md\"\u003eChangelog\u003c/a\u003e •\n  \u003ca href=\"./CONTRIBUTING.md\"\u003eContributing\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-MIT-blue.svg\" alt=\"MIT License\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/huseyinstif/oculos/stargazers\"\u003e\u003cimg src=\"https://img.shields.io/github/stars/huseyinstif/oculos?style=social\" alt=\"GitHub Stars\" /\u003e\u003c/a\u003e\n  \u003cimg src=\"https://img.shields.io/badge/built_with-Rust-dea584.svg\" alt=\"Built with Rust\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/platform-Windows%20%7C%20Linux%20%7C%20macOS-informational\" alt=\"Platforms\" /\u003e\n\u003c/p\u003e\n\n---\n\nOculOS is a lightweight daemon that reads the OS accessibility tree and exposes every button, text field, checkbox, and menu item as a JSON endpoint. It works as a **REST API** for scripts, testing, and CI/CD — and as an **MCP server** for AI agents like Claude, Cursor, and Windsurf.\n\nNo screenshots. No pixel coordinates. No browser extensions. No code injection. No AI required. Just structured JSON.\n\n---\n\n### Demo — Claude Code + OculOS → Calculator (5×5=25)\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"static/demo.gif\" width=\"720\" alt=\"Claude Code using OculOS MCP to open Calculator and compute 5×5\" /\u003e\n\u003c/p\u003e\n\n\u003csub\u003eClaude Code uses OculOS MCP tools to open Calculator, find buttons, click 5 × 5 =, and read the result — fully autonomous.\u003c/sub\u003e\n\n### Claude Code + OculOS → Spotify\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"static/demo-mcp.png\" width=\"720\" alt=\"Claude Code controlling Spotify through OculOS MCP\" /\u003e\n\u003c/p\u003e\n\n\u003csub\u003eClaude Code uses OculOS MCP tools to find Spotify, focus it, search for a song, and play it — fully autonomous.\u003c/sub\u003e\n\n### Web Dashboard\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"static/demo-dashboard.png\" width=\"720\" alt=\"OculOS Dashboard — element tree inspector\" /\u003e\n\u003c/p\u003e\n\n\u003csub\u003eBuilt-in dashboard with window list, interactive element tree, inspector, recorder, and live WebSocket events.\u003c/sub\u003e\n\n---\n\n## Quick Start\n\n```bash\ngit clone https://github.com/huseyinstif/oculos.git\ncd oculos\ncargo build --release\n```\n\n### macOS: Grant Accessibility Permission\n\nOculOS reads the OS accessibility tree, so macOS requires you to grant permission:\n\n1. Open **System Settings → Privacy \u0026 Security → Accessibility**\n2. Click the **lock icon** and enter your password\n3. Click **+** and add your terminal app (Terminal, iTerm2, Windsurf, etc.) or the `oculos` binary itself\n4. Make sure the toggle is **enabled**\n\n\u003e Without this permission, OculOS can list windows but cannot read UI elements or interact with them.\n\n### HTTP mode (API + Dashboard)\n\n```bash\n./target/release/oculos\n# API       → http://127.0.0.1:7878\n# Dashboard → http://127.0.0.1:7878\n```\n\n### MCP mode (for AI agents)\n\n```bash\n./target/release/oculos --mcp\n```\n\n---\n\n## How It Works\n\nOculOS reads the OS accessibility tree and assigns each UI element a session-scoped UUID (`oculos_id`). You use that ID to interact.\n\n```bash\n# 1. List open windows\ncurl http://localhost:7878/windows\n\n# 2. Get the UI tree for a window\ncurl http://localhost:7878/windows/{pid}/tree\n\n# 3. Find a specific element\ncurl \"http://localhost:7878/windows/{pid}/find?q=Submit\u0026type=Button\"\n\n# 4. Click it\ncurl -X POST http://localhost:7878/interact/{id}/click\n\n# 5. Type into a text field\ncurl -X POST http://localhost:7878/interact/{id}/set-text \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"text\":\"hello world\"}'\n```\n\nEvery element includes an `actions` array — the API tells you exactly what you can do:\n\n```json\n{\n  \"oculos_id\": \"a3f8c2d1-...\",\n  \"type\": \"Button\",\n  \"label\": \"Submit\",\n  \"enabled\": true,\n  \"actions\": [\"click\", \"focus\"],\n  \"rect\": { \"x\": 120, \"y\": 340, \"width\": 80, \"height\": 32 }\n}\n```\n\n---\n\n## API\n\n### Discovery\n\n| Endpoint | Description |\n|----------|-------------|\n| `GET /windows` | List all visible windows |\n| `GET /windows/{pid}/tree` | Full UI element tree |\n| `GET /windows/{pid}/find?q=\u0026type=\u0026interactive=` | Search elements |\n| `GET /hwnd/{hwnd}/tree` | Tree by window handle |\n| `GET /hwnd/{hwnd}/find` | Search by window handle |\n\n### Window operations\n\n| Endpoint | Description |\n|----------|-------------|\n| `POST /windows/{pid}/focus` | Bring to foreground |\n| `POST /windows/{pid}/close` | Close gracefully |\n| `GET /windows/{pid}/wait?q=\u0026type=\u0026timeout=` | Wait for element to appear (polls, default 5s) |\n| `GET /windows/{pid}/screenshot` | Capture window as PNG |\n\n### Element interactions\n\n| Endpoint | Body | Description |\n|----------|------|-------------|\n| `POST /interact/{id}/click` | — | Click |\n| `POST /interact/{id}/set-text` | `{\"text\":\"…\"}` | Replace text content |\n| `POST /interact/{id}/send-keys` | `{\"keys\":\"…\"}` | Keyboard input |\n| `POST /interact/{id}/focus` | — | Move focus |\n| `POST /interact/{id}/toggle` | — | Toggle checkbox |\n| `POST /interact/{id}/expand` | — | Expand dropdown / tree |\n| `POST /interact/{id}/collapse` | — | Collapse |\n| `POST /interact/{id}/select` | — | Select list item |\n| `POST /interact/{id}/set-range` | `{\"value\":N}` | Set slider value |\n| `POST /interact/{id}/scroll` | `{\"direction\":\"…\"}` | Scroll container |\n| `POST /interact/{id}/scroll-into-view` | — | Scroll into viewport |\n| `POST /interact/{id}/highlight` | `{\"duration_ms\":N}` | Highlight on screen |\n| `POST /interact/batch` | `{\"actions\":[...]}` | Execute multiple interactions |\n\n### System\n\n| Endpoint | Description |\n|----------|-------------|\n| `GET /health` | Status, version, uptime |\n| `GET /ws` | WebSocket (live action events) |\n\n---\n\n## MCP Setup\n\nWorks with any MCP-compatible client. Add to your config:\n\n```json\n{\n  \"mcpServers\": {\n    \"oculos\": {\n      \"command\": \"/path/to/oculos\",\n      \"args\": [\"--mcp\"]\n    }\n  }\n}\n```\n\n**Tested with:** Claude Code, Claude Desktop, Cursor, Windsurf\n\nFor non-MCP agents (OpenAI, Gemini, custom), paste [`AGENTS.md`](./AGENTS.md) into the system prompt and give the agent HTTP access.\n\n---\n\n## Dashboard\n\nBuilt-in web UI at `http://127.0.0.1:7878`:\n\n- **Window list** — all open windows with focus/close buttons\n- **Element tree** — full interactive UI tree with search and filter\n- **Inspector** — element details, properties, and all available actions\n- **Recorder** — record a sequence of interactions, export as **Python**, **JavaScript**, or **curl**\n- **JSON viewer** — raw element data with copy\n- **WebSocket** — live event indicator, real-time action feed\n- **Shortcuts** — `R` refresh · `/` search · `E` expand · `C` collapse · `H` highlight · `J` JSON\n\n---\n\n## Platform Support\n\n| Platform | Backend | Status |\n|----------|---------|--------|\n| **Windows** | UI Automation (`windows-rs`) | ✅ Full — Win32, WPF, Electron, Qt |\n| **Linux** | AT-SPI2 (`atspi` + `zbus`) | ✅ Working — GTK, Qt, Electron |\n| **macOS** | Accessibility API (`AXUIElement` + CoreGraphics) | ✅ Working — Cocoa, Electron, Qt |\n\n### App Compatibility\n\n| App type | Coverage | Notes |\n|----------|----------|-------|\n| **Win32 / WPF / WinForms** | Excellent | Full deep tree, all interactions |\n| **GTK / Qt** | Excellent | Full tree on all platforms |\n| **Electron** (Spotify, VS Code, Slack, Chrome) | Good | Key interactive elements exposed; tree is shallower than native |\n| **Cocoa** (macOS native) | Good | Standard controls fully exposed |\n| **Custom-drawn / OpenGL / DirectX** | Poor | Minimal or no accessibility tree — games, CAD, etc. |\n\n\u003e **Tip:** Run `curl \"localhost:7878/windows/{pid}/find?interactive=true\"` to see what's available for any app.\n\n---\n\n## Client SDKs\n\nOfficial wrappers for the REST API. Install from source (PyPI/npm packages coming soon):\n\n### Python\n\n```bash\ncd sdk/python\npip install .\n```\n\n```python\nfrom oculos import OculOS\n\nclient = OculOS()\nwindows = client.list_windows()\nclient.click(element_id)\nclient.set_text(element_id, \"hello world\")\n```\n\nSee [`sdk/python`](./sdk/python) for full docs.\n\n### TypeScript\n\n```bash\ncd sdk/typescript\nnpm install\nnpm run build\n```\n\n```typescript\nimport { OculOS } from \"./sdk/typescript/src/index\";\n\nconst client = new OculOS();\nconst windows = await client.listWindows();\nawait client.click(elementId);\nawait client.setText(elementId, \"hello world\");\n```\n\nSee [`sdk/typescript`](./sdk/typescript) for full docs.\n\n---\n\n## CLI\n\n```\noculos [OPTIONS]\n\n  -b, --bind \u003cADDR\u003e       Bind address [default: 127.0.0.1:7878]\n      --static-dir \u003cDIR\u003e  Static files directory [default: static]\n      --log \u003cLEVEL\u003e       Log level: trace/debug/info/warn/error [default: info]\n      --mcp               Run as MCP server over stdin/stdout\n  -h, --help              Print help\n```\n\n---\n\n## How OculOS Differs\n\n| | OculOS | Vision agents | Screen coordinate tools | Browser-only tools |\n|---|---|---|---|---|\n| **Approach** | OS accessibility tree | Screenshots + LLM | Pixel positions | DOM / a11y tree |\n| **Scope** | Any desktop app | Any (with latency) | Any (fragile) | Browser only |\n| **Speed** | Instant | Seconds | Instant | Instant |\n| **Deterministic** | ✅ | ❌ | ✅ | ✅ |\n| **GPU required** | ❌ | ✅ | ❌ | ❌ |\n| **Cloud required** | ❌ | Usually | ❌ | ❌ |\n| **Semantic** | ✅ Labels + types | Varies | ❌ Coordinates | ✅ |\n\n---\n\n## Everything Built So Far\n\n### Core\n- [x] Windows UIA backend (full — Win32, WPF, Electron, Qt)\n- [x] Linux AT-SPI2 backend\n- [x] macOS Accessibility backend (`AXUIElement`, CoreGraphics window enumeration, CGEvent keyboard simulation)\n- [x] REST API server (Axum)\n- [x] MCP server (JSON-RPC 2.0 over stdio)\n- [x] Session-scoped element registry with UUIDs\n- [x] Full keyboard simulation engine\n\n### Dashboard\n- [x] Window list with focus/close\n- [x] Interactive element tree with search/filter\n- [x] Element inspector with all actions\n- [x] API request log\n- [x] JSON viewer with copy\n- [x] Keyboard shortcuts\n\n### Advanced\n- [x] Element highlighting (native GDI overlay)\n- [x] Automation recorder (record + export Python/JS/curl)\n- [x] WebSocket live events\n- [x] Health endpoint (uptime, version, platform)\n\n### Planned\n- [ ] macOS element highlighting (native overlay)\n- [x] Python \u0026 TypeScript client SDKs\n- [x] Batch operations (multiple interactions per request)\n- [x] Conditional waits (`/wait` endpoint with timeout)\n- [x] Screenshot capture (`/screenshot` endpoint)\n- [x] GitHub Actions CI (Windows, Linux, macOS)\n- [x] Docker image for CI/CD\n- [x] OpenAPI spec\n- [x] `--version` CLI flag\n- [ ] Element caching \u0026 diffing (change detection)\n- [ ] PyPI / npm SDK publishing\n\n---\n\n## Contributing\n\nWe welcome contributions! See [CONTRIBUTING.md](./CONTRIBUTING.md) for details.\n\n**Top areas:**\n- **Tests** — cross-app integration tests\n- **macOS highlight** — native overlay for element highlighting\n- **Element caching** — change detection \u0026 diffing\n- **Documentation** — guides, examples, tutorials\n\n---\n\n## License\n\n[MIT](./LICENSE)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhuseyinstif%2Foculos","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhuseyinstif%2Foculos","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhuseyinstif%2Foculos/lists"}