{"id":50100971,"url":"https://github.com/hop-top/ibr","last_synced_at":"2026-05-23T07:16:48.119Z","repository":{"id":349801512,"uuid":"996010038","full_name":"hop-top/ibr","owner":"hop-top","description":"Human instructions translated into X-Path capable of finding the intended data even after a page structure or location change.","archived":false,"fork":false,"pushed_at":"2026-04-21T18:52:14.000Z","size":702,"stargazers_count":1,"open_issues_count":2,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-21T20:45:35.630Z","etag":null,"topics":["agent-browser","agent-browser-ai","agentic-browser","ai-browser","ai-browser-automation","ai-browser-control","browser-automation","browser-automation-toolkit","browser-cookies","browser-instrumentation","browsing","dom-walking","fill-form","fill-form-automatic","scraper","scraping"],"latest_commit_sha":null,"homepage":"https://hop.top/ibr","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hop-top.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-06-04T10:24:09.000Z","updated_at":"2026-04-21T18:52:18.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/hop-top/ibr","commit_stats":null,"previous_names":["hop-top/ibr"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/hop-top/ibr","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hop-top%2Fib
r","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hop-top%2Fibr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hop-top%2Fibr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hop-top%2Fibr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hop-top","download_url":"https://codeload.github.com/hop-top/ibr/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hop-top%2Fibr/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33386367,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-23T04:15:53.637Z","status":"ssl_error","status_checked_at":"2026-05-23T04:15:53.242Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-browser","agent-browser-ai","agentic-browser","ai-browser","ai-browser-automation","ai-browser-control","browser-automation","browser-automation-toolkit","browser-cookies","browser-instrumentation","browsing","dom-walking","fill-form","fill-form-automatic","scraper","scraping"],"created_at":"2026-05-23T07:16:47.219Z","updated_at":"2026-05-23T07:16:48.102Z","avatar_url":"https://github.com/hop-top.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ibr - Intent Browser Runtime\n\nAn AI-powered instruction 
parser that converts human-readable instructions into automated web interactions. Powered by Playwright for browser automation and Vercel AI SDK for multi-provider AI support.\n\n## Features\n\n- **Multi-Provider AI Support**: OpenAI, Anthropic Claude, Google Gemini\n- **Natural Language Instructions**: Describe what you want in plain English\n- **Automatic Element Detection**: AI finds elements based on descriptions\n- **Data Extraction**: Extract structured data from web pages\n- **Conditional Logic**: If-then instructions for dynamic flows\n- **Loop Support**: Repeat actions until conditions are met\n- **Authenticated Sessions**: Inherit browser cookies via `--cookies` flag\n- **Snapshot Diffing**: 85% token reduction in loops via incremental DOM diffs\n- **Tool Runner**: `ibr tool` subcommand — YAML-defined reusable browser workflows with typed params\n- **DOM Inspector**: `ibr snap` subcommand for on-demand page inspection\n- **Daemon Mode**: Optional persistent browser server; warm invocations ~540ms vs ~3800ms cold\n- **Visual Debugging**: `--annotate` / `-a` flag captures annotated PNGs with labeled bounding boxes\n- **Failure Screenshots**: `ANNOTATED_SCREENSHOTS_ON_FAILURE=true` auto-captures on action failure\n- **Dialog Handling**: Auto-accepts browser dialogs; buffers history for inspection\n- **Comprehensive Logging**: Detailed execution logs for debugging\n- **NDJSON Streaming**: `NDJSON_STREAM=true` emits structured browser events for pipeline integration\n\n## Setup\n\n### 1. Clone and Install\n\n```bash\nnpm install @hop/ibr\nnpm run browser:install\n```\n\n### 2. 
Configure Environment\n\n```bash\ncp .env.example .env\n```\n\nEdit `.env` and configure your AI provider:\n\n#### OpenAI (Default)\n```env\nAI_PROVIDER=openai\nOPENAI_API_KEY=your_api_key_here\n```\n\n#### Anthropic Claude\n```env\nAI_PROVIDER=anthropic\nANTHROPIC_API_KEY=your_api_key_here\n```\n\n#### Google Gemini\n```env\nAI_PROVIDER=google\nGOOGLE_GENERATIVE_AI_API_KEY=your_api_key_here\n```\n\n### 3. (Optional) Configure Browser \u0026 AI Behavior\n\n```env\n# Browser display options\nBROWSER_HEADLESS=false          # Show browser window (false) or run headless (true)\nBROWSER_SLOWMO=100              # Slow down actions in milliseconds (helps with debugging)\n\n# AI behavior\nAI_TEMPERATURE=0                # 0 for deterministic, higher for more creative\nAI_MODEL=gpt-4-mini            # Override default model (optional)\n```\n\n## Usage\n\n### Basic Example\n\n```bash\nibr \"url: https://example.com\ninstructions:\n  - click the 'submit' button\n  - extract page title\"\n```\n\n### Authenticated Sessions (`--cookies`)\n\nImport real browser cookies so ibr can reach pages that require a logged-in\nsession. Supports macOS and Linux Chromium-based browsers. Reads directly from\nthe browser's on-disk SQLite\ncookie database; no proxy, no extension, no manual export needed.\n\n**Requires:** `better-sqlite3` (native addon, already listed in `package.json`).\nOn macOS, the first import also requires Keychain access for the target browser.\n\n#### Syntax\n\n```\nibr --cookies \u003cbrowser\u003e[:\u003cdomain1\u003e,\u003cdomain2\u003e,...] 
\"\u003cprompt\u003e\"\n```\n\n| Part | Description |\n|------|-------------|\n| `\u003cbrowser\u003e` | Browser alias (see table below) |\n| `:\u003cdomain1\u003e,\u003cdomain2\u003e` | Optional domain filter — import only these host keys |\n\n#### Supported Browsers\n\n| Alias | Browser | Platforms |\n|-------|---------|-----------|\n| `chrome` | Google Chrome | macOS, Linux |\n| `brave` | Brave | macOS, Linux |\n| `edge` | Microsoft Edge | macOS, Linux |\n| `arc` | Arc | macOS |\n| `comet` | Comet (Perplexity) | macOS |\n| `chromium` | Chromium | Linux |\n\n#### Examples\n\n**All cookies from Chrome — access any auth-gated page:**\n\n```bash\nibr --cookies chrome \"url: https://github.com\ninstructions:\n  - extract repository list\"\n```\n\n**Comet cookies for a single domain:**\n\n```bash\nibr --cookies comet:reddit.com \"url: https://www.reddit.com/r/programming\ninstructions:\n  - extract top 5 post titles\"\n```\n\n**Arc cookies scoped to two domains:**\n\n```bash\nibr --cookies arc:github.com,linear.app \"url: https://linear.app\ninstructions:\n  - list my open issues\"\n```\n\n**Brave with no domain filter (all non-expired cookies):**\n\n```bash\nibr --cookies brave \"url: https://app.example.com\ninstructions:\n  - click 'Dashboard'\"\n```\n\n**Edge for a specific domain:**\n\n```bash\nibr --cookies edge:outlook.com \"url: https://outlook.com\ninstructions:\n  - extract unread message subjects\"\n```\n\n#### Domain Filtering\n\n- `--cookies arc:github.com,linear.app` — import only cookies whose `host_key`\n  matches `github.com` or `linear.app`.\n- `--cookies chrome` (no filter) — import **all non-expired** cookies from\n  Chrome's Default profile.\n- Expired cookies are always excluded regardless of filter.\n\n#### How It Works\n\n1. Resolves the browser's cookie DB path under:\n   - macOS: `~/Library/Application Support/\u003cbrowser\u003e/Default/Cookies`\n   - Linux: `${XDG_CONFIG_HOME:-~/.config}/\u003cbrowser\u003e/Default/Cookies`\n2. 
Retrieves the Safe Storage password:\n   - macOS: `security find-generic-password` from Keychain; **a permission dialog appears on first run, so click \"Allow\"**\n   - Linux: fixed Chromium fallback password `peanuts`\n3. Derives a 16-byte AES key via PBKDF2 (SHA-1, 1003 iterations, salt\n   `saltysalt`).\n4. Decrypts each `v10`-prefixed cookie value with AES-128-CBC.\n5. Injects the resulting cookies into the Playwright browser context via\n   `context.addCookies()` before any navigation.\n\n#### Limitations\n\n- **Windows is not yet supported**.\n- Reads the **Default** profile only; named profiles are not yet supported.\n- The derived key is cached per process, so on macOS subsequent calls for the same\n  browser skip the Keychain dialog.\n\n#### Error Cases\n\n| Error code | Cause | Action |\n|------------|-------|--------|\n| `not_installed` | Browser cookie DB not found on disk | Install the browser or check the alias spelling |\n| `keychain_denied` | User clicked \"Deny\" in the macOS dialog | Re-run and click \"Allow\" |\n| `keychain_timeout` | macOS Keychain dialog not answered within 10 s | Re-run and respond to the dialog promptly |\n| `keychain_not_found` | No macOS Keychain entry for that browser | Browser may not be a Chromium build; check alias |\n| `db_locked` | DB still locked after copy attempt | Close the browser and retry |\n| `db_corrupt` | SQLite DB is corrupt | Reinstall or reset the browser profile |\n\nIf the **DB is locked** (browser is open), ibr automatically copies the DB and\nits WAL/SHM files to `/tmp` and reads from the copy; the original is never\nwritten to. 
The temp files are deleted when the import finishes.\n\n### `--mode` Flag\n\nControl which page-representation mode the AI receives:\n\n| Flag | Mode | When to use |\n|------|------|-------------|\n| _(none)_ | `auto` | Default — quality-based selection |\n| `--mode aria` | ARIA tree | Force ariaSnapshot (semantic, compact) |\n| `--mode dom` | DOM + XPath | Force DomSimplifier (raw structure) |\n\n```bash\n# Force ARIA mode\nibr --mode aria \"url: https://example.com\\ninstructions:\\n  - click submit\"\n\n# Force DOM mode — canvas apps, legacy table-soup, unlabelled SPAs\nibr --mode dom \"url: https://canvas-app.example.com\\ninstructions:\\n  - click submit\"\n```\n\n**Auto-selection logic** (default): after capturing the ARIA snapshot, ibr\nscores its quality and picks the best mode automatically:\n\n- `sparsityRatio \u003e 0.4` → dom (too many unlabelled interactive elements)\n- snapshot \u003e 50 000 chars → dom (too large)\n- snapshot empty or errored → dom\n- otherwise → aria\n\n**Sparsity** = ratio of unnamed interactive elements to all interactive elements.\nAn unnamed interactive element looks like `- button \"\"` or `- link \"\"` in the ARIA\ntree — no accessible name, so the AI cannot reliably target it. 
When more than 40%\nof buttons/links have no name, the snapshot is too sparse to be useful.\n\nLogs report the chosen mode and reason, e.g.:\n\n```\nusing aria mode\nfalling back to dom mode: sparse (0.62)\nfalling back to dom mode: size\nfalling back to dom mode: empty\n```\n\n**Site-type guidance:**\n\n| Mode | Works best on |\n|------|---------------|\n| `aria` | Modern semantic SPAs, accessible sites, form-heavy UIs |\n| `dom` | Canvas-heavy apps, legacy table-soup HTML, Shadow DOM, `aria-hidden`-heavy pages |\n| `auto` | Unknown sites; safe default — quality-checked per page |\n\n### Visual Debugging (`--annotate`)\n\nCapture annotated screenshots that show exactly which elements the AI resolved —\nuseful when a flow behaves unexpectedly and you want visual confirmation.\n\n#### `--annotate` / `-a` flag\n\n```bash\nibr --annotate \"url: https://example.com\\ninstructions:\\n  - click submit\"\nibr -a \"url: https://example.com\\ninstructions:\\n  - click submit\"\n```\n\nAfter each element-resolution step that finds ≥1 element, ibr captures a\nfull-page PNG with red bounding-box overlays labeled `@e1`, `@e2`, etc.\n(DOM elements) or `@c1`, `@c2`, etc. (pseudo-buttons / cursor-interactive\nelements).\n\nOutput path: `/tmp/ibr-annotate-step-\u003cN\u003e-\u003ctimestamp\u003e.png`\n\n#### `ANNOTATED_SCREENSHOTS_ON_FAILURE`\n\n```bash\nANNOTATED_SCREENSHOTS_ON_FAILURE=true ibr \"url: https://example.com\\n...\"\n```\n\nWhen set to `true`, ibr automatically captures an annotated screenshot\nwhenever an action fails — without requiring `--annotate` on every run.\n\nOutput path: `/tmp/ibr-failure-step-\u003cN\u003e-\u003ctimestamp\u003e.png`\n\nThe screenshot is non-fatal: if capture fails (e.g. 
due to a Content\nSecurity Policy), execution continues and a warning is logged.\n\n#### Notes\n\n- Overlays are injected via `page.evaluate` (no image library dependency).\n- Off-screen / hidden elements are silently skipped (no overlay).\n- Overlay `\u003cdiv\u003e`s are always removed after screenshot (even on error).\n- Path validation: only `/tmp` and the current working directory are accepted.\n\n---\n\n### Prompt Format\n\nInstructions use a YAML-like format:\n\n```yaml\nurl: https://example.com/page\ninstructions:\n  - click 'button text' if found       # Conditional action\n  - fill 'email' with user@test.com    # Fill form field\n  - type 'search term' into search box # Type into focused element\n  - press Enter                        # Press keyboard key\n  - extract text: title, price, url    # Extract data\n  - repeatedly:                        # Loop until condition fails\n      - click 'next page' if found\n```\n\n### Instruction Types\n\n#### 1. Click/Fill/Type Actions\n```yaml\n- click 'element description'\n- fill 'field description' with value\n- type 'text' into element\n- press KeyName (Enter, Space, Escape, etc.)\n```\n\n#### 2. Conditional (if found)\n```yaml\n- click 'close banner' if found        # Executes if element exists\n```\n\n#### 3. Loops (repeatedly)\n```yaml\n- repeatedly:\n    - click 'load more' if found       # Continues until condition fails\n    - extract items\n```\n\n#### 4. 
Data Extraction\n```yaml\n- extract: title, price, rating        # Extract text content\n- extract all product names            # Extract list of items\n```\n\n### Real-World Example\n\n```bash\nibr \"url: https://www.example.com/products\ninstructions:\n  - close 'cookie banner' if found\n  - scroll to bottom of page\n  - repeatedly:\n      - click 'load more' if found\n      - wait for new items to load\n  - extract all products: name, price, rating\n  - navigate to first product\n  - extract product details: description, reviews\"\n```\n\n## Daemon Mode (opt-in, fast warm invocations)\n\nBy default every `ibr` call spawns a new Chromium instance (~3800 ms cold-start).\nDaemon mode keeps a browser process alive in the background so subsequent calls\nconnect to it instead (~540 ms warm).\n\n### Enable\n\n```bash\n# Per-session env var\nexport IBR_DAEMON=true\nibr \"url: https://example.com\\ninstructions:\\n  - extract title\"\n\n# Per-invocation flag\nibr --daemon \"url: https://example.com\\ninstructions:\\n  - extract title\"\n\n# Start the server manually (optional — auto-started on first call)\nnpm run server\n```\n\n### How It Works\n\n1. CLI checks `IBR_DAEMON=true` or `--daemon` flag.\n2. Reads `~/.ibr/server.json` (pid, port, bearer token).\n3. Validates process is alive + `/health` responds OK.\n4. If stale/missing: spawns `src/server.js` detached and polls until ready (≤8 s).\n5. `POST /command` with Bearer token; server reuses the warm browser context.\n6. 
Server auto-shuts down after 30 min idle.\n\n### Security\n\n- Localhost-only (`127.0.0.1`); not accessible remotely.\n- Random OS-assigned port per startup.\n- Bearer token is a UUID, stored in `~/.ibr/server.json` (mode `0600`).\n\n### State File\n\n`~/.ibr/server.json` — written atomically; override path with `IBR_STATE_FILE`.\n\n```json\n{ \"pid\": 12345, \"port\": 51234, \"token\": \"\u003cuuid\u003e\", \"startedAt\": 1234567890 }\n```\n\n### Latency Breakdown\n\n| Mode | Time |\n|------|------|\n| Cold (stateless, default) | ~3800 ms |\n| Warm (daemon) | ~540 ms |\n\n### Disable / Stop\n\n```bash\nunset IBR_DAEMON           # revert to stateless for this session\nkill $(jq .pid ~/.ibr/server.json)  # stop the daemon manually\n```\n\n---\n\n## Tool Runner (`ibr tool`)\n\nRun pre-packaged browser workflows defined as YAML files in `tools/`.\nParams use `{{placeholder}}` syntax; defaults applied when omitted.\n\n```\nibr tool \u003cname\u003e [--param key=value ...]\nibr tool --list\n```\n\n### Built-in Tools\n\n| Name | Description | Required Params |\n|------|-------------|----------------|\n| `web-search` | Search the web; extract ranked results | `query` |\n| `web-fetch` | Fetch a URL; extract main content | `url` |\n| `trend-search` | Google Trends interest + related queries | `topic` |\n| `github-search` | GitHub repo/code/issue/user search | `query` |\n| `github-trending` | GitHub trending repos by language/period | _(all optional)_ |\n| `github-starred` | Browse a user's starred repos | `username` |\n\n### Examples\n\n```bash\n# Web search\nibr tool web-search --param query=\"openai agents\"\n\n# Google Trends (defaults: region=US, period=7d)\nibr tool trend-search --param topic=javascript --param region=GB\n\n# GitHub repo search\nibr tool github-search --param query=playwright --param type=repositories\n\n# GitHub trending (all params optional)\nibr tool github-trending --param language=go --param period=weekly\n\n# User's starred repos with keyword 
filter\nibr tool github-starred --param username=sindresorhus --param query=rust\n\n# List available tools\nibr tool --list\n```\n\n### YAML Tool Format\n\nPlace `.yaml` files in `tools/` and they become available as `ibr tool \u003cname\u003e`:\n\n```yaml\nname: my-tool\ndescription: \"Short description\"\nparams:\n  - name: query\n    description: \"Search query\"\n    required: true\n  - name: count\n    description: \"Number of results\"\n    default: \"5\"\nurl: \"https://www.google.com\"\ninstructions:\n  - type {{query}} into the search box and press Enter\n  - extract the top {{count}} results with titles and URLs\n```\n\n`{{param}}` placeholders are interpolated in both `url` and `instructions`.\nMissing required params → non-zero exit with a clear error before the browser starts.\n\n### VCR Test Record Mode\n\nE2E tests for tools use cassette replay by default. To record real cassettes\nfrom live sites (run once, then commit):\n\n```bash\nVCR_RECORD=true OPENAI_API_KEY=sk-... \\\n  node node_modules/vitest/vitest.mjs run test/e2e/cli-tool-vcr.test.js\n```\n\nThe proxy forwards requests to the real AI endpoint, captures responses, and\nwrites updated cassette files to `test/e2e/cassettes/`.\n\n---\n\n## DOM Inspector (`ibr snap`)\n\nInspect the live DOM of any page without writing a full task. 
Outputs simplified DOM JSON\nto stdout; useful for debugging selectors, auditing interactive elements, or feeding\ncontext to other tools.\n\n```\nibr snap \u003curl\u003e [flags]\n```\n\n### Flags\n\n| Flag | Description |\n|------|-------------|\n| `--aria` | Show ariaSnapshot (ARIA YAML tree) instead of DOM JSON |\n| `-i` | Interactive only: dom → xpath-indexed nodes; aria → role+name lines |\n| `-a` | Annotated screenshot → `/tmp/ibr-dom-annotated.png` (dom mode only) |\n| `-d \u003cN\u003e` | Depth limit — truncate tree at depth N (dom mode only) |\n| `-s \u003cselector\u003e` | Scope output to a CSS selector subtree (dom mode only) |\n\nTwo representations are available:\n\n- **DOM tree** (default): XPath-indexed JSON — what the AI sees in dom mode\n- **ARIA snapshot** (`--aria`): Playwright `ariaSnapshot()` YAML — what the AI sees in aria mode\n\nUse `--aria` to verify what the AI reasons over when running with `--mode aria`.\n\n### Examples\n\n```bash\n# Full DOM of a page (dom mode representation)\nibr snap https://example.com\n\n# ARIA snapshot (aria mode representation)\nibr snap --aria https://example.com\n\n# ARIA snapshot — interactive elements only (role + name present)\nibr snap --aria -i https://example.com\n\n# DOM: interactive elements only (reduces noise)\nibr snap https://example.com -i\n\n# DOM: annotated screenshot highlighting all interactive elements\nibr snap https://example.com -a\n\n# DOM: limit tree depth to 4 levels\nibr snap https://example.com -d 4\n\n# DOM: scope to the navigation bar only\nibr snap https://example.com -s \"nav\"\n\n# DOM: interactive elements in sidebar, depth 3, with screenshot\nibr snap https://example.com -i -s \"#sidebar\" -d 3 -a\n```\n\nDOM mode outputs JSON on stdout with header `=== DOM Tree ===`.\nARIA mode outputs YAML on stdout with header `=== ARIA Snapshot ===`.\nWith `-a` (dom), the screenshot path is printed to stderr:\n\n```bash\n# Capture DOM JSON and screenshot in one shot\nibr snap 
https://example.com -i -a \u003e dom.json\n# → stderr: Annotated screenshot: /tmp/ibr-dom-annotated.png\n```\n\n---\n\n## Lightpanda — fast headless mode\n\n[Lightpanda](https://github.com/lightpanda-io/browser) is a Zig-built headless\nbrowser with roughly 9× faster startup and 16× less memory than Chromium. ibr\ncan auto-download it and drive it via Playwright CDP — no manual install.\n\n**One-liner** (auto-downloads stable release on first run, caches under\n`~/.cache/ibr/browsers/lightpanda/`):\n\n```bash\nBROWSER_CHANNEL=lightpanda ibr \"go to example.com and extract the heading\"\n```\n\n**With fallback** (recommended during lightpanda beta — ibr silently retries\non chromium when a scenario hits an unimplemented Web API and records the\nfailure in a capability manifest for future pre-flight warnings):\n\n```bash\nBROWSER_CHANNEL=lightpanda BROWSER_FALLBACK=chromium ibr \"...\"\n```\n\n**Pre-warm the cache in CI** (avoids first-run download latency):\n\n```bash\nibr browser pull lightpanda stable\n```\n\n**Inspect current resolver decision**:\n\n```bash\nibr browser which\n```\n\n**Lifecycle modes**\n\n- **Connect-only** — set `BROWSER_CDP_URL=ws://127.0.0.1:9222` to connect to\n  an already-running CDP server (you manage the lifecycle).\n- **Daemon-owned** — long-running `IBR_DAEMON=true`; the server spawns +\n  reuses the browser across requests.\n- **One-shot** — default CLI mode; spawn + connect + teardown per invocation.\n\nSee `docs/testing-lightpanda.md` for the gated e2e suite and known compat gaps.\n\n### `ibr browser` subcommands\n\n```\nibr browser list                     Show registry + cache state\nibr browser pull [channel] [version] Pre-warm browser cache\nibr browser prune [--older-than]     GC old cache entries\nibr browser which                    Print resolver decision for current env\n```\n\n---\n\n## Snapshot Diffing (Automatic)\n\n**Internal optimization — no user action required.**\n\nIn loops and multi-step tasks, ibr tracks 
snapshots between actions. Instead of\nsending the full page representation to the AI on every step, it sends only what changed.\n\n- **~85% token reduction** in typical loop workflows\n- **dom mode**: compares added, removed, modified nodes by XPath identity\n- **aria mode**: line-based set diff of ariaSnapshot YAML lines; keyed by role+name\n- Falls back to full snapshot automatically when:\n  - Navigation occurs (page URL changes)\n  - \u003e50% of nodes/lines change (large-change threshold)\n  - Snapshot is older than 5 minutes (stale threshold)\n  - Mode mismatch between stored and current snapshot\n\nNo configuration needed. Token savings are logged at DEBUG level per `#findElements`\ncall (`usedDiff`, `diffSummary`).\n\n---\n\n## How It Works\n\n1. **Parse Instructions**: AI converts your natural language prompt into structured JSON\n2. **Navigate**: Opens the URL in a Playwright browser\n3. **Execute**: For each instruction:\n   - Captures page as ARIA accessibility tree (semantic snapshot)\n   - Asks AI to identify elements by `{role, name}` ARIA descriptor\n   - Performs the action (click, fill, extract, etc.)\n   - Tracks token usage across all providers\n4. **Extract Data**: Returns extracted information from the page\n\n### Page Representation — ARIA Accessibility Tree\n\nibr uses Playwright's `ariaSnapshot()` to represent pages to the AI, instead of\nraw DOM/HTML. The ARIA snapshot is a hierarchical accessibility tree: roles, names,\nlabels, and visible text — the same structure used by screen readers.\n\n**Why it matters:**\n- ~38x smaller context (e.g. 
592 kB raw DOM → ~15 kB ARIA snapshot on large pages)\n- Semantically cleaner: no inline styles, script blocks, SVG noise\n- More reliable element targeting: AI returns `{role, name}` descriptors, which\n  Playwright resolves via `getByRole` / `getByLabel` / `getByText`\n\n**Mode selection:** by default ibr scores the ARIA snapshot quality (sparsity\nratio of unlabelled interactive elements) and falls back to `DomSimplifier`\n(XPath-indexed JSON tree) when quality is too low, snapshot is too large, or\nsnapshot is empty. Use `--mode aria|dom` to override. See the `--mode` section\nabove for details.\n\n**Element descriptor format (ARIA path):**\n\n```json\n{\"role\": \"button\", \"name\": \"Sign in\"}\n{\"role\": \"textbox\", \"name\": \"Email\"}\n{\"role\": \"link\", \"name\": \"Learn more\"}\n```\n\n**Element descriptor format (DomSimplifier fallback):**\n\n```json\n{\"x\": 42}\n```\n\nThe switch is internal; prompts you write are unaffected — keep describing\nelements in plain English as always.\n\n## Debugging \u0026 Troubleshooting\n\n### Issue: Action Timeouts\n\n**Symptom**: \"Timeout waiting for element\" error\n\n**Solution**:\n- The element may be behind a modal or banner\n- Add instruction to close/dismiss overlays first\n- Check browser window to see what's blocking the action\n- Use `BROWSER_SLOWMO` to slow down and observe\n\n```bash\nBROWSER_SLOWMO=500 ibr \"...\"\n```\n\n### Issue: Element Not Found\n\n**Symptom**: `Unable to resolve element descriptor: {...}` or action failure\nmentioning `hidden, disabled, or covered by another element`\n\n**Solutions**:\n- Run `ibr snap \u003curl\u003e -i` to list interactive elements and their `@refs`\n- Use the exact role/name from the snapshot in your prompt\n- If the element is a pseudo-button (no ARIA role), use `--mode dom` and\n  reference its index\n- If the element is present but action fails, check for overlays (modal,\n  cookie banner) and add an instruction to dismiss them first\n\n### Issue: JSON 
Parsing Errors\n\n**Symptom**: `Failed to parse task description` or `BAML parser: Unable to extract JSON`\n\n**Solutions**:\n- Set `AI_TEMPERATURE=0` for deterministic outputs\n- Ensure the prompt includes `url:` and `instructions:` fields\n- If using a custom model, verify it returns structured JSON responses\n- Try switching `AI_MODEL` to a model known to follow instructions reliably\n\n### Issue: Infinite Loops\n\n**Symptom**: Script runs forever on a \"repeatedly\" instruction\n\n**Solutions**:\n- This happens when the loop condition never becomes false\n- Make sure the element the loop checks for actually disappears when the work is done\n- The script stops after a safety limit of 100 iterations to prevent hangs\n- Check logs to see which iteration it is stuck on\n\n### Enable Detailed Logging\n\nSee what's happening at each step:\n\n```bash\nDEBUG=* ibr \"...\"\n```\n\nLogs show:\n- Which AI provider and model are being used\n- Prompt and response tokens for cost tracking\n- What elements were found\n- What actions were executed\n- Detailed error messages if anything fails\n\n### Monitor in Real-Time\n\nKeep the browser visible and slow it down:\n\n```bash\nBROWSER_HEADLESS=false BROWSER_SLOWMO=500 ibr \"...\"\n```\n\nNow you can watch exactly what the script is doing and see where it fails.\n\n## Configuration Reference\n\n### AI Configuration\n| Variable | Options | Default | Purpose |\n|----------|---------|---------|---------|\n| `AI_PROVIDER` | openai, anthropic, google | openai | Which AI service to use |\n| `AI_TEMPERATURE` | 0-2 | 0 | Response randomness (0=deterministic) |\n| `AI_MODEL` | Model name | Provider default | Override default model |\n\n### Browser Configuration\n| Variable | Values | Default | Purpose |\n|----------|--------|---------|---------|\n| `BROWSER_HEADLESS` | true/false | false | Run browser headless |\n| `BROWSER_SLOWMO` | milliseconds | 100 | Slow down browser actions |\n| `BROWSER_TIMEOUT` | milliseconds | 30000 | Page load timeout |\n| `BROWSER_CHANNEL` | 
chrome/brave/arc/comet/chromium/msedge/lightpanda | _(chromium)_ | Browser to launch |\n| `BROWSER_EXECUTABLE_PATH` | path | — | Direct binary override; bypasses probe + cache |\n| `BROWSER_CDP_URL` | ws URL | — | Connect to running CDP server; skips spawn |\n| `LIGHTPANDA_WS` | ws URL | — | **Deprecated** alias of `BROWSER_CDP_URL` |\n| `BROWSER_VERSION` | stable/nightly/latest/exact | stable | Version for downloadable browsers |\n| `BROWSER_DOWNLOAD_URL` | URL | — | Mirror / air-gap binary source |\n| `BROWSER_FALLBACK` | channel name | — | Fallback channel on lightpanda failure |\n| `BROWSER_STRICT` | true/false | false | Refuse launch on known-broken capability entries |\n| `BROWSER_REQUIRE_CHECKSUM` | true/false | false | Refuse install without sha256 checksum |\n| `LIGHTPANDA_TELEMETRY` | true/false | false | Opt-in lightpanda upstream telemetry |\n| `OBEY_ROBOTS` | true/false | false | Check robots.txt before automation |\n| `DIALOG_AUTO_ACCEPT` | true/false | true | Auto-accept browser dialogs (alert/confirm/prompt) |\n| `DIALOG_BUFFER_CAPACITY` | number | 50000 | Max dialog events to buffer |\n| `DIALOG_DEFAULT_PROMPT_TEXT` | string | '' | Default text submitted for prompt() dialogs |\n\n### Daemon Configuration\n| Variable | Values | Default | Purpose |\n|----------|--------|---------|---------|\n| `IBR_DAEMON` | true/false | false | Enable persistent browser daemon |\n| `IBR_STATE_FILE` | path | `~/.ibr/server.json` | Daemon state file path |\n\n### Observability\n| Variable | Values | Default | Purpose |\n|----------|--------|---------|---------|\n| `NDJSON_STREAM` | true/false | false | Stream browser events as NDJSON to stdout |\n| `ANNOTATED_SCREENSHOTS_ON_FAILURE` | true/false | false | Auto-capture annotated PNG on action failure |\n\n### API Keys (REQUIRED)\n- `OPENAI_API_KEY` - For OpenAI provider\n- `ANTHROPIC_API_KEY` - For Anthropic provider\n- `GOOGLE_GENERATIVE_AI_API_KEY` - For Google provider\n\nOnly set the key for your selected 
provider.\n\n## Output\n\n### Extracted Data\n```json\n[\n  {\n    \"title\": \"Product Name\",\n    \"price\": \"$99.99\",\n    \"rating\": \"4.5 stars\"\n  }\n]\n```\n\n### Token Usage\n```\nToken usage summary {\n  promptTokens: 1250,\n  completionTokens: 450,\n  totalTokens: 1700\n}\n```\n\n## Common Patterns\n\n### Scrape Paginated Content\n\n```yaml\nurl: https://example.com/products\ninstructions:\n  - repeatedly:\n      - extract all items: name, price\n      - click 'next page' if found\n```\n\n### Fill and Submit Form\n\n```yaml\nurl: https://example.com/contact\ninstructions:\n  - fill 'name' with John Doe\n  - fill 'email' with john@example.com\n  - fill 'message' with Hello World\n  - click 'submit button'\n  - extract confirmation message\n```\n\n### Handle Dynamic Content\n\n```yaml\nurl: https://example.com\ninstructions:\n  - click 'load more' if found\n  - wait for content to load\n  - repeatedly:\n      - scroll down\n      - click 'load more' if found\n      - extract new items\n  - extract final data\n```\n\n## Tips for Best Results\n\n1. **Be Descriptive**: Instead of \"button\", use \"submit button at bottom of form\"\n2. **Consider DOM Changes**: Elements may not be in the same place after actions\n3. **Handle Common Issues**: Banners, popups, logins often block actions\n4. **Test Incrementally**: Start with simple instructions and build up\n5. **Watch Execution**: Use browser window to see what's happening\n6. **Check Logs**: Detailed logs show exactly what failed and why\n7. 
**Use Deterministic AI**: Keep `AI_TEMPERATURE=0` for consistent results\n\n## Limitations\n\n- Requires API key for selected AI provider\n- `--cookies` flag supports macOS and Linux; Windows is not yet supported\n- May struggle with heavily JavaScript-rendered content\n- No built-in retry on transient failures (but logs indicate when/why to retry)\n- Browser automation is slower than direct API calls\n\n## LLM Judge (Extraction Quality)\n\nScores E2E extraction results against fixture ground-truth using an LLM judge (0–10 scale).\n\n```bash\n# Run after E2E tests have written results to test/results/e2e/\nnpm run judge:e2e\n\n# Options\nnpm run judge:e2e -- --threshold 8          # fail if mean \u003c 8 (default 7)\nnpm run judge:e2e -- --validate             # 3-run variance check\nnpm run judge:e2e -- --run-id my-run        # label the report\nnpm run judge:e2e -- --output-dir /tmp/out  # custom results dir\n```\n\nWrites `\u003crunId\u003e-quality.json` and `\u003crunId\u003e-summary.md` to the output dir.\nExits 0 when mean score ≥ threshold; exits 1 when below.\nFixtures without `expectedExtracts` are skipped — CI does not fail.\n\n## Building\n\nRequires Node \u003e=20. Run once to produce `dist/ibr` and `dist/ibr-server`:\n\n    task build\n\nBinaries are self-contained (no Node runtime needed). Native deps (Playwright,\nbetter-sqlite3, @boundaryml/baml) must still exist in `node_modules` alongside\nthe binary; they cannot be embedded in the SEA blob.\n\n## Version \u0026 Upgrade\n\n```bash\nibr version                  # human-readable version string\nibr version --short          # version only — scriptable (e.g. 
in CI checks)\nibr version --json           # JSON: version, node, platform, arch\nibr upgrade                  # check for and install available updates\nibr upgrade --auto           # non-interactive install\nibr upgrade --quiet          # suppress output (use exit code only)\nibr upgrade preamble         # emit agent skill preamble fragment (for AI agent configs)\n```\n\n---\n\n## Related Tools\n\n| Tool | Notes |\n|------|-------|\n| [LightPanda](https://github.com/lightpanda-io/browser) | Headless browser in Zig; 11x faster, 9x less memory than Chrome. CDP-compatible. Beta — CORS gap limits Playwright parity today. Potential future ibr backend for high-throughput scraping workloads. |\n\n## License\n\nISC\n\n## Support\n\nFor issues and feedback, see `.env.example` for configuration help or check logs for detailed error messages.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhop-top%2Fibr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhop-top%2Fibr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhop-top%2Fibr/lists"}