{"id":51193110,"url":"https://github.com/joe223/sootie","last_synced_at":"2026-06-27T17:31:17.657Z","repository":{"id":355251384,"uuid":"1225782507","full_name":"joe223/sootie","owner":"joe223","description":"Sootie is a cross-platform computer-use for AI agents","archived":false,"fork":false,"pushed_at":"2026-06-26T07:30:11.000Z","size":94599,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-26T09:17:16.196Z","etag":null,"topics":["accessibility","agent","ai-agents","automation","claude-code","computer-use","linux","llm-tools","macos","mcp","opencode","recipes","windows"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/joe223.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-04-30T16:26:26.000Z","updated_at":"2026-06-26T07:30:15.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/joe223/sootie","commit_stats":null,"previous_names":["joe223/sootie"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/joe223/sootie","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joe223%2Fsootie","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joe223%2Fsootie/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joe223%2Fsootie/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joe223%2Fsootie/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/joe223","download_url":"https://codeload.github.com/joe223/sootie/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joe223%2Fsootie/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34862627,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-27T02:00:06.362Z","response_time":126,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accessibility","agent","ai-agents","automation","claude-code","computer-use","linux","llm-tools","macos","mcp","opencode","recipes","windows"],"created_at":"2026-06-27T17:31:16.799Z","updated_at":"2026-06-27T17:31:17.641Z","avatar_url":"https://github.com/joe223.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"logo.png\" alt=\"Sootie logo\" width=\"128\"\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003eSootie\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n  Cross-platform computer-use for agents that need the whole desktop, not just one browser tab.\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#quick-start\"\u003eQuick start\u003c/a\u003e\n  ·\n  \u003ca href=\"#tool-surface\"\u003e57 MCP tools\u003c/a\u003e\n  ·\n  \u003ca href=\"#recipes-and-learning\"\u003eRecipes\u003c/a\u003e\n  ·\n  \u003ca href=\"#runtime-check\"\u003eRuntime checks\u003c/a\u003e\n\u003c/p\u003e\n\nSootie is a Rust MCP runtime that gives any MCP-capable agent one computer-use\ncontract across macOS, Linux, and Windows. Use it from OpenCode, Claude Code,\nCodex, Cursor, VS Code, or your own agent runtime.\n\nThe agent keeps calling the same short tools, such as `find`, `click`, and\n`browser_open`, while Sootie chooses the best execution path underneath:\nbrowser DOM through CDP, native OS backends for real desktop state, and vision\ngrounding when structure runs out.\n\nTeach it a workflow once. Save it as a JSON recipe. Run it again from any\nagent.\n\n```bash\nsootie setup\nsootie serve\n```\n\n## Watch Sootie Work\n\n\u003c!-- Demo GIF placeholder:\n     Replace this block with the Safari + Excalidraw flower + recipe-recording GIF.\n     Suggested asset path: docs/assets/sootie-excalidraw-flower-recipe.gif\n--\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cem\u003eDemo GIF placeholder: Safari + Excalidraw draws a colorful flower, then\n  records the workflow as a reusable Sootie recipe.\u003c/em\u003e\n\u003c/p\u003e\n\n## What Makes Sootie Different\n\nAgent frameworks move fast. Desktop automation APIs do not. Sootie makes that\nboundary stable.\n\n- Agent-neutral: any MCP-capable client can call the same Sootie tools.\n- Platform-neutral: macOS, Linux, and Windows share the same public MCP\n  contract while backend-specific mechanics stay below it.\n- Signal-aware: browser CDP first, native platform state second, vision\n  grounding last.\n- Workflow-aware: learning mode records successful desktop actions and recipes\n  replay them later.\n- Evidence-first: `sootie doctor`, structured tool reports, and full-suite smoke\n  docs make runtime readiness inspectable instead of assumed.\n\n## What Agents Can Do\n\n- Inspect the current desktop: apps, windows, URLs, focused elements, visible\n  text, screenshots, and interactive elements.\n- Act on apps and pages: click, type, press keys, hotkeys, scroll, hover,\n  long-press, drag, focus windows, and manage window geometry.\n- Use CDP for browser content when Chrome or Edge exposes a remote debugging\n  endpoint, without adding a separate browser-only tool family.\n- Fall back to vision grounding for described targets, including annotated JPG\n  history under `/tmp/sootie/vision_history/grounding/`.\n- Save and run JSON recipes, and record successful actions through learning\n  mode.\n- Report runtime readiness with `sootie doctor` before an MCP client depends on\n  the desktop session.\n\n## How It Works\n\nSootie runs as an MCP server over stdio and exposes short tool names such as\n`context`, `click`, and `browser_open` with portable argument and response\nshapes. Older `sootie_*` tool names remain accepted as compatibility aliases\nfor direct JSON-RPC callers and saved recipes. Each target is resolved through\nthe strongest available signal:\n\n1. Browser CDP for DOM-backed pages.\n2. Native platform backends for apps, windows, and desktop state.\n3. Vision grounding when structural signals are not enough.\n\nA `vision-only` mode is also available when you want to test or force the visual\ngrounding path directly.\n\n## Install\n\nSootie currently publishes package-manager installs for macOS and Linux amd64.\nWindows users install from source while the package-manager path is being\nfinalized.\n\n| Platform | Install path | Notes |\n| --- | --- | --- |\n| macOS arm64/x64 | Homebrew | Requires a GUI session plus Accessibility and Screen Recording permissions for desktop actions. |\n| Linux amd64 | apt | Requires an interactive X11 desktop for desktop actions. The apt package currently targets amd64. |\n| Linux arm64 | Cargo source install | No public apt package yet. |\n| Windows | Cargo source install | No public package-manager path yet. |\n\nmacOS:\n\n```bash\nbrew install joe223/sootie/sootie\nsootie setup\n```\n\nLinux:\n\n```bash\nsudo install -d -m 0755 /usr/share/keyrings\ncurl -fsSL https://raw.githubusercontent.com/joe223/sootie/apt/sootie-archive-keyring.gpg \\\n  | sudo tee /usr/share/keyrings/sootie-archive-keyring.gpg \u003e/dev/null\nsudo chmod 0644 /usr/share/keyrings/sootie-archive-keyring.gpg\ncurl -fsSL https://raw.githubusercontent.com/joe223/sootie/apt/sootie.sources \\\n  | sudo tee /etc/apt/sources.list.d/sootie.sources \u003e/dev/null\nsudo apt-get update\nsudo apt-get install sootie\nsootie setup\n```\n\nWindows:\n\nThe Windows package-manager path is not finalized yet. Until it is published,\ninstall from source with Cargo:\n\n```powershell\ngit clone https://github.com/joe223/sootie.git\ncd sootie\ncargo install --locked --path crates/sootie-cli\nsootie setup\n```\n\nFrom an existing checkout on any platform, the development install path is:\n\n```bash\ncargo install --locked --path crates/sootie-cli\n```\n\n## Install With An Agent\n\nIf you want another coding agent to install Sootie on your computer, copy this\nprompt into that agent. It tells the agent to choose the best install path for\nyour OS, fall back to source when needed, and verify the result before stopping.\n\n```text\nInstall Sootie on this computer and verify that it works.\n\nRules:\n- Detect the operating system and CPU architecture first.\n- Prefer the official install path for this platform:\n  - macOS arm64/x64: Homebrew, using `brew install joe223/sootie/sootie`\n  - Linux amd64: the apt repository documented in the Sootie README\n  - Linux arm64 or Windows: install from source with Rust/Cargo\n- If the package-manager path is unavailable or fails, clone or update\n  https://github.com/joe223/sootie and run\n  `cargo install --locked --path crates/sootie-cli`.\n- Do not overwrite unrelated user files. Ask before destructive changes,\n  uninstalling existing software, or changing global MCP client settings.\n- Run `sootie setup`. If vision dependencies or the model download are too\n  large, blocked, or unnecessary for browser/desktop-only use, run\n  `sootie setup --skip-sidecar` and report that limitation.\n- Verify the installation with:\n  - `sootie --version`\n  - `sootie doctor --check`\n  - `sootie tools --raw`\n- Confirm that `sootie tools --raw` returns 57 tools and includes\n  `browser_open`.\n- Configure my MCP client to run `sootie serve` only if I explicitly ask you to\n  configure that client.\n- Report the install method, binary path, version, verification results, and any\n  remaining manual permission steps, such as macOS Accessibility or Screen\n  Recording.\n```\n\n## Quick Start\n\nCreate the user config:\n\n```bash\nsootie setup\n```\n\nThis writes `~/.config/sootie.config.toml`, installs the bundled vision sidecar,\ncreates the managed Python environment, downloads the default ShowUI-2B model\nwhen it is missing, and verifies that the sidecar can preload the model. Setup\nprints progress while it works. A successful setup means the next `sootie serve`\nand `sootie sidecar` runs are expected to work: Sootie verifies the desktop\nruntime, MCP initialization, tool listing, sidecar startup, and model preload\nbefore returning success.\n\nVision setup needs a Python 3.10-3.13 interpreter. If your default `python3` is\noutside that range, install a compatible Python first. The first setup run also\nneeds network access to install Python packages and download the ShowUI model,\nplus enough disk and memory to preload that model. If you only need browser CDP\nor native desktop structure and do not need vision grounding yet, use\n`sootie setup --skip-sidecar` and run full setup later.\n\nCLI commands print a readable summary by default. Add `--raw` when a script\nneeds the original JSON payload, for example `sootie setup --raw`.\n\nCheck whether the current desktop session is usable:\n\n```bash\nsootie doctor --check\n```\n\nThen configure your MCP client to start Sootie:\n\n```json\n{\n  \"mcpServers\": {\n    \"sootie\": {\n      \"type\": \"stdio\",\n      \"command\": \"sootie\",\n      \"args\": [\"serve\"]\n    }\n  }\n}\n```\n\nFor local development without installing the binary, run:\n\n```bash\ncargo run -p sootie-cli -- serve\n```\n\n## Runtime Check\n\nBefore connecting an agent, check whether the current desktop session is usable:\n\n```bash\nsootie doctor\nsootie doctor --check\nsootie tools\n```\n\n`sootie doctor` prints a readable readiness summary. `sootie doctor --check`\nexits non-zero when the current session is not ready, which makes it suitable\nfor scripts and smoke runs. Use `sootie doctor --raw` or\n`sootie doctor --check --raw` for the full diagnostic JSON. `sootie tools`\nprints a compact tool list; use `sootie tools --raw` for the MCP tool schema.\n\nDefault serve logs are written under the platform data directory. On macOS this\nis:\n\n```text\n~/Library/Application Support/sootie/logs/YYYY-MM-DD-HH-MM-SS.log\n```\n\n## Tool Surface\n\nSootie exposes 57 MCP tools.\n\n| Area | Tools |\n| --- | --- |\n| Orientation and perception | `context`, `state`, `find`, `read`, `inspect`, `element_at`, `screenshot`, `parse_screen`, `ground`, `annotate` |\n| Actions | `click`, `type`, `press`, `hotkey`, `scroll`, `hover`, `long_press`, `drag`, `focus`, `window`, `wait` |\n| Browser-native CDP | `browser_launch`, `browser_connect`, `browser_pages`, `browser_select_page`, `browser_open`, `browser_observe`, `browser_find`, `browser_click`, `browser_type`, `browser_press`, `browser_scroll`, `browser_wait`, `browser_extract`, `browser_screenshot`, `browser_back`, `browser_forward`, `browser_reload`, `browser_close_page`, `browser_shutdown`, `browser_network`, `browser_console`, `browser_storage`, `browser_cookies`, `browser_downloads`, `browser_upload`, `browser_pdf` |\n| Guarded raw CDP | `cdp_send`, `cdp_subscribe` |\n| Recipes and learning | `recipes`, `run`, `recipe_show`, `recipe_save`, `recipe_delete`, `learn_start`, `learn_stop`, `learn_status` |\n\nEvery tool returns MCP content plus structured content with `success`, `data`,\n`context`, `error`, `suggestion`, and a `report` that includes duration and\ntool-call status. `tools/list` includes MCP annotations so clients can\ndistinguish read-only inspection from mutating desktop actions.\n\nSee [MCP Tools Reference](docs/api/mcp-tools-reference.md) for accepted fields,\ninput envelopes, response shapes, and compatibility behavior.\n\n## Browser Automation\n\nSootie uses CDP internally when a supported browser exposes a debugging\nendpoint:\n\n```bash\nSOOTIE_CDP_PORT=9222 sootie serve\n```\n\nFor browser-only work, `browser_launch` starts a managed headless browser\nby default so pages, screenshots, and extraction do not interrupt the user's\nvisible desktop. Pass `mode: \"normal\"` or `headless: false` when the user needs\nto see or manually help with the browser.\n\nmacOS Chrome example:\n\n```bash\n/Applications/Google\\ Chrome.app/Contents/MacOS/Google\\ Chrome \\\n  --remote-debugging-port=9222 \\\n  --user-data-dir=/tmp/sootie-chrome-profile\n```\n\nLinux Chrome example:\n\n```bash\ngoogle-chrome --remote-debugging-port=9222 --user-data-dir=/tmp/sootie-chrome-profile\n```\n\nOn Windows, launch Chrome or Edge with `--remote-debugging-port=9222`, then run\nSootie with `SOOTIE_CDP_PORT=9222`.\n\nCDP is used through the existing portable tools. If CDP is unavailable or the\ntarget is outside browser content, Sootie falls back to the native desktop\nbackend and screenshots. See [Browser Automation with CDP](docs/guides/browser-cdp.md).\n\n## Vision Grounding\n\nBy default, Sootie tries CDP and the platform backend first, then uses vision as\nthe final target-resolution fallback. `sootie setup` writes the default sidecar\nURL and model path into `~/.config/sootie.config.toml`; environment variables\ncan override the sidecar URL:\n\n```bash\nSOOTIE_VISION_URL=http://127.0.0.1:9876 sootie serve\n```\n\nDefault config shape:\n\n```toml\n[resolution]\nstrategy = \"platform-first\"\n\n[vision]\nurl = \"http://127.0.0.1:9876\"\nenabled = true\nconfidence_threshold = 0.5\ntimeout_ms = 60000\nsidecar_dir = \"/path/to/sootie/vision-sidecar\"\nmodel_path = \"/path/to/sootie/models/ShowUI-2B\"\n```\n\nThe Rust MCP server talks to a local HTTP sidecar that implements `POST /ground`.\n`sootie setup` installs that sidecar, installs the Python dependencies listed in\nthe bundled `requirements.txt` into a Sootie-managed virtual environment,\ndownloads `showlab/ShowUI-2B` into Sootie's data directory when missing, and\nchecks that the model can be preloaded. The first setup may take a while because\nthe model download is large and requires network access. Start the sidecar\nbefore using vision-grounded targets:\n\n```bash\nsootie sidecar\n```\n\nUse `sootie sidecar --preload` when you want startup to load the model before\nthe first grounding request.\n\nIf you do not run a vision sidecar, CDP and native desktop automation still work.\nDisable vision with `SOOTIE_VISION_DISABLED=1` or set `enabled = false` in the\nconfig. Set `resolution.strategy = \"vision-only\"` in\n`~/.config/sootie.config.toml` when you want `ground`, `find`,\n`inspect`, and target-based pointer actions to go directly through the\nvision grounding path.\n\nSuccessful grounding calls write annotated JPG screenshots and JSON metadata to:\n\n```text\n/tmp/sootie/vision_history/grounding/\n```\n\nThe JPG overlays the prompt, returned bounding boxes, prediction values, and\nnumbered labels.\n\n## Platform Backends\n\n| Platform | Current backend surface |\n| --- | --- |\n| macOS | AppKit, Accessibility, CoreGraphics, browser Apple Events where needed, and `screencapture`. Grant Accessibility and Screen Recording permissions to the app or terminal that launches Sootie. |\n| Linux | X11-oriented helpers such as `xprop`, `wmctrl`, `xdotool`, AT-SPI bindings, and common screenshot utilities when installed. |\n| Windows | PowerShell, User32, UI Automation, Windows Forms, and System.Drawing from an interactive desktop session. |\n\nThe public MCP contract stays portable while the Rust backend chooses the\nnative mechanism available on the current host.\n\n## Recipes and Learning\n\nRecipes are JSON documents that can be saved, listed, inspected, deleted, and\nrun through the MCP tool surface. A recipe can encode action steps, wait steps,\nparameter substitution, and legacy recorded step shapes.\n\nLearning mode records successful actions so an agent can turn a real desktop\nworkflow into a reusable recipe.\n\nSee [Recipe Schema](docs/api/recipe-schema.md) for the full format.\n\n## Verification\n\nRun the local gates before trusting a binary:\n\n```bash\ncargo fmt --check\ncargo test --workspace\ncargo clippy --workspace --all-targets -- -D warnings\ncargo build --release\n```\n\nFor runtime evidence, use:\n\n- [Real Runtime Checklist](docs/development/real-runtime-checklist.md)\n- [Runtime Smoke Runbook](docs/development/runtime-smoke-runbook.md)\n- [Verification Matrix](docs/development/verification-matrix.md)\n\nThe runtime checks are intentionally separate from compile-time checks: a\nsuccessful MCP handshake or build does not prove that the active desktop\nsession can actually click, type, see screenshots, or ground visual targets.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjoe223%2Fsootie","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjoe223%2Fsootie","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjoe223%2Fsootie/lists"}