{"id":49756938,"url":"https://github.com/amankumarsingh77/pi-browser-harness","last_synced_at":"2026-05-10T22:56:58.516Z","repository":{"id":354496526,"uuid":"1223903224","full_name":"amankumarsingh77/pi-browser-harness","owner":"amankumarsingh77","description":"Browser control extension for pi — navigate, click, type, screenshot, and extract data from real Chrome via CDP","archived":false,"fork":false,"pushed_at":"2026-05-07T18:48:29.000Z","size":1649,"stargazers_count":1,"open_issues_count":1,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-10T22:56:57.163Z","etag":null,"topics":["browser-automation","cdp","chrome-devtools-protocol","pi","pi-package","web-automation"],"latest_commit_sha":null,"homepage":"https://github.com/amankumarsingh77/pi-browser-harness#readme","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amankumarsingh77.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-28T19:10:43.000Z","updated_at":"2026-05-07T18:48:57.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/amankumarsingh77/pi-browser-harness","commit_stats":null,"previous_names":["amankumarsingh77/pi-browser-harness"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/amankumarsingh77/pi-browser-harness","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amankumarsingh77%2Fpi-browser-harness","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amankumarsingh77%2Fpi-browser-harness/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amankumarsingh77%2Fpi-browser-harness/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amankumarsingh77%2Fpi-browser-harness/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amankumarsingh77","download_url":"https://codeload.github.com/amankumarsingh77/pi-browser-harness/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amankumarsingh77%2Fpi-browser-harness/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32874700,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-10T13:40:02.631Z","status":"ssl_error","status_checked_at":"2026-05-10T13:40:02.145Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["browser-automation","cdp","chrome-devtools-protocol","pi","pi-package","web-automation"],"created_at":"2026-05-10T22:56:57.770Z","updated_at":"2026-05-10T22:56:58.509Z","avatar_url":"https://github.com/amankumarsingh77.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pi-browser-harness\n\n![pi-browser-harness](https://raw.githubusercontent.com/amankumarsingh77/pi-browser-harness/main/assets/hero.png)\n\nFull browser control for pi agents in **your real Chrome** — your sessions, your cookies, your tabs. Drives navigation, structured page reads, network capture, clicks, typing, screenshots, and arbitrary scripts via CDP.\n\n---\n\n## Why pi-browser-harness?\n\n| Capability | pi-browser-harness | Playwright MCP | Stagehand | Puppeteer MCP |\n|---|:---:|:---:|:---:|:---:|\n| **Drives your real Chrome** (logged-in sessions preserved) | ✅ | ❌ launches its own browser | ❌ | ❌ |\n| **Coordinate clicks** that work through iframes, shadow DOM, cross-origin | ✅ | ❌ selector-based | ❌ selector-based | ❌ selector-based |\n| **Inline TUI screenshot rendering** (Kitty/iTerm2/Ghostty/WezTerm) | ✅ | ❌ | ❌ | ❌ |\n| **Accessibility-tree snapshot with click coords `@(x,y)` per element** | ✅ | ✅ tree only, no coords | partial | ❌ |\n| **Network request capture** with filters + body capture | ✅ | ✅ post-hoc list | ❌ | ❌ |\n| **Parallel tool execution** with automatic mutation serialization | ✅ | ❌ | ❌ | ❌ |\n| **Temporary scripts** with daemon + full Node.js | ✅ | ❌ | ❌ | ❌ |\n| **Direct HTTP GET** outside the browser (10–50× faster for APIs) | ✅ | ❌ | ❌ | ❌ |\n| **Tab ownership isolation** — never touches the user's other tabs | ✅ | N/A | N/A | N/A |\n| **Pi-native** — no MCP/JSON-RPC overhead, no extra LLM API keys | ✅ | ❌ MCP roundtrip | ❌ external LLM | ❌ MCP roundtrip |\n| **TypeScript strict mode**, zero `any`, all CDP casts documented | ✅ | unknown | unknown | unknown |\n| Ctrl+O expand/collapse on tool output | ✅ | ❌ | ❌ | ❌ |\n| Compositor-level dispatch (works on every site, no flakey waits) | ✅ | ❌ | ❌ | ❌ |\n\nIf you live in pi and you want an agent driving the same Chrome you're already signed into, this is the only one that fits.\n\n---\n\n## Quick Start\n\n```bash\n# 1. Install\npi install npm:pi-browser-harness\n\n# 2. Enable Chrome remote debugging\n#    Open chrome://inspect/#remote-debugging in Chrome,\n#    tick \"Discover network targets\", click Allow.\n#    Or launch Chrome with --remote-debugging-port=9222\n\n# 3. Connect\n/browser-setup\n```\n\n### Requirements\n\n- pi (latest)\n- Node.js ≥ 22\n- Chrome / Chromium / Edge\n\n---\n\n## Tool hierarchy — read this first\n\n```\nWhat do you need to know?\n\n  ├─ Page structure / what's clickable / labels?\n  │     → browser_snapshot     (DEFAULT — AX tree with @(x,y) per interactive element)\n  │\n  ├─ A specific element's value / attribute / coords?\n  │     → browser_execute_js   (e.g. el.innerText, el.getBoundingClientRect())\n  │\n  ├─ Network behavior on the current page?\n  │     → browser_network_requests\n  │\n  └─ Visual rendering (layout / colors / chart drew correctly)?\n        → browser_screenshot   (LAST RESORT — pixels only)\n```\n\nPass `@(x,y)` from `browser_snapshot` straight to `browser_click`. **No screenshot round-trip needed** to find click targets — the snapshot already has them.\n\n---\n\n## Tools\n\n### Page inspection (use these by default)\n\n| Tool | Purpose |\n|------|---------|\n| `browser_snapshot` | **Default for page inspection.** Returns the CDP accessibility tree with click coords `@(x,y)` for every interactive element. Optional `includeScreenshot:true`. |\n| `browser_execute_js` | Surgical DOM reads — element text, attributes, `getBoundingClientRect()`. Cheapest, most precise. |\n| `browser_network_requests` | List recent network requests on the current tab. Filter by url/method/status/type/recency; optional response-body capture. |\n| `browser_page_info` | URL, title, viewport, scroll position, or pending dialog. |\n| `browser_http_get` | Direct HTTP GET outside the browser — 10-50× faster for APIs. |\n\n### Visual (last resort)\n\n| Tool | Purpose |\n|------|---------|\n| `browser_screenshot` | Capture PNG/JPEG. Use only when you need to verify visual rendering. |\n\n### Navigation\n\n| Tool | Purpose |\n|------|---------|\n| `browser_navigate` / `browser_new_tab` | Navigate or open a tab |\n| `browser_open_urls` | Open multiple URLs in parallel tabs |\n| `browser_go_back` / `browser_go_forward` / `browser_reload` | History navigation |\n| `browser_list_tabs` / `browser_current_tab` / `browser_switch_tab` / `browser_close_tab` | Tab management (only tabs this session opened) |\n\n### Interaction\n\n| Tool | Purpose |\n|------|---------|\n| `browser_click` | Click at viewport coordinates (use `@(x,y)` from `browser_snapshot`) |\n| `browser_type` | Type text into the focused element |\n| `browser_press_key` | Press a key with optional modifiers |\n| `browser_scroll` | Scroll the page by delta pixels |\n\n### Utility\n\n| Tool | Purpose |\n|------|---------|\n| `browser_wait` / `browser_wait_for_load` | Sleep, or wait for `readyState === 'complete'` |\n| `browser_handle_dialog` | Accept or dismiss `alert` / `confirm` / `prompt` |\n| `browser_run_script` | Execute a temporary script with daemon and Node.js access |\n\nThree tools (`browser_snapshot`, `browser_network_requests`, `browser_execute_js`) ship custom TUI rendering with **Ctrl+O** (`app.tools.expand`) to toggle between compact and full output.\n\n---\n\n## Core patterns\n\n### Page inspection\n\n```\nbrowser_snapshot()\n# → AX outline with @(x,y) per button/link/input\n```\n\n### Form filling (no screenshots)\n\n```\nbrowser_snapshot()                    # find input @(x,y) and labels\nbrowser_click({ x, y })               # click using snapshot's @(x,y)\nbrowser_type({ text: \"query\" })\nbrowser_press_key({ key: \"Enter\" })\nbrowser_wait_for_load()\nbrowser_snapshot()                    # verify next state\n```\n\n### Data extraction\n\n```js\n// One value\nbrowser_execute_js({ expression: \"document.querySelector('.price').innerText\" })\n\n// Direct API call outside the browser\nbrowser_http_get({ url: \"https://api.github.com/repos/amankumarsingh77/pi-browser-harness\" })\n\n// Structured arrays\nbrowser_execute_js({ expression: `JSON.stringify(\n  Array.from(document.querySelectorAll('.result')).map(el =\u003e ({\n    title: el.querySelector('h3')?.textContent,\n    link:  el.querySelector('a')?.href,\n  }))\n)` })\n```\n\n### Network debugging\n\n```\nbrowser_navigate({ url: \"https://app.example.com/feed\" })\nbrowser_wait_for_load()\nbrowser_network_requests({\n  urlPattern: \"/api/\",\n  statusFilter: { min: 400 },\n  includeResponseBodies: true\n})\n```\n\n### Research workflow\n\n```\nbrowser_navigate({ url: \"https://google.com/search?q=...\" })\nbrowser_open_urls({ urls: [\"url1\", \"url2\", \"url3\"] })\nbrowser_list_tabs()\nbrowser_switch_tab({ targetId: \"...\" })\nbrowser_wait_for_load()\nbrowser_snapshot()\nbrowser_execute_js({ expression: \"document.querySelector('.content').innerText\" })\n```\n\n### Visual verification (only when pixels matter)\n\n```\nbrowser_click({ x, y })         # got coords from browser_snapshot\nbrowser_snapshot()              # confirm the form transitioned\nbrowser_screenshot()            # ONLY if you need to verify a chart/modal/CSS rendered correctly\n```\n\n### Keyboard modifiers\n\n| Key | Bit |\n|-----|-----|\n| Alt | 1 |\n| Ctrl | 2 |\n| Meta / Cmd | 4 |\n| Shift | 8 |\n\n```\nbrowser_press_key({ key: \"c\", modifiers: 2 })    // Ctrl+C\nbrowser_press_key({ key: \"v\", modifiers: 4 })    // Cmd+V\nbrowser_press_key({ key: \"T\", modifiers: 10 })   // Ctrl+Shift+T\n```\n\n### Dialogs\n\nJS dialogs freeze the page. Check `browser_page_info` first — if it reports a dialog, handle it before anything else:\n\n```\nbrowser_handle_dialog({ action: \"accept\" })       // confirm\nbrowser_handle_dialog({ action: \"dismiss\" })       // cancel\nbrowser_handle_dialog({ action: \"accept\", promptText: \"hello\" })  // prompt\n```\n\n---\n\n## Parallel execution\n\nObservation tools run in parallel by default. Mutation tools (`click`, `type`, `scroll`, `navigate`, `switch_tab`, …) are automatically serialized through a shared mutex — emit them in the same turn and the harness FIFO-queues them.\n\n```\n# All three run concurrently\nbrowser_snapshot()\nbrowser_network_requests({ sinceMs: 5000 })\nbrowser_http_get({ url: \"...\" })\n```\n\n---\n\n## Tab ownership\n\nThe harness never touches tabs you didn't open through it. On first attach it spawns a dedicated Chrome window; subsequent `browser_new_tab` calls open inside that window. `browser_list_tabs` defaults to `scope:\"owned\"` (pass `scope:\"all\"` to see read-only listings of your other tabs); `browser_switch_tab` and `browser_close_tab` refuse non-owned tabs.\n\n---\n\n## Temporary scripts\n\nWhen the built-in tools aren't enough, write a script to disk and run it. Scripts get a `daemon` binding for direct CDP access — much faster than chaining tool calls.\n\n```js\nwrite(\"/tmp/scrape-pages.js\", `\n  const results = [];\n  for (const url of params.urls) {\n    await daemon.session().call(\"Page.navigate\", { url });\n    await new Promise(r =\u003e setTimeout(r, 2000));\n    const title = await daemon.evaluateJs(\"document.title\");\n    results.push({ url, title });\n  }\n  return { content: [{ type: \"text\", text: JSON.stringify(results, null, 2) }] };\n`)\n\nbrowser_run_script({ path: \"/tmp/scrape-pages.js\", params: { urls: [...] } })\n```\n\n**Bindings:** `params`, `daemon`, `require`, `signal`, `onUpdate`, `ctx`, `console`, `fetch`, `JSON`, `Buffer`, `setTimeout`, `clearTimeout`.\n\n`daemon` exposes: `evaluateJs`, `pageInfo`, `listTabs`, `switchTab`, `newTab`, `current`, and `session(targetId?)` for raw CDP via `session.call` / `session.callOnTarget` / `session.callBrowser` / `session.takeDialog`.\n\nScripts are written to disk — auditable and re-runnable.\n\n---\n\n## Commands\n\n| Command | Description |\n|---------|-------------|\n| `/browser-setup` | Connect pi to Chrome (run once) |\n| `/browser-status` | Show daemon health and current page |\n| `/browser-reload-daemon` | Restart the connection |\n\n---\n\n## What NOT to do\n\n- Don't launch your own browser — you're connected to the user's real Chrome\n- Don't type credentials — if you hit an auth wall, stop and ask\n- Don't screenshot to understand the page — `browser_snapshot` is the default\n- Don't screenshot to find click coordinates — `browser_snapshot`'s `@(x,y)` is exact\n- Don't screenshot to read a value — `browser_execute_js` is one round-trip\n- Don't ignore dialogs — check `browser_page_info` first\n\n---\n\n## Architecture\n\n```\npi agent → pi-browser-harness (TypeScript)\n               │ CDP WebSocket\n               ▼\n            Chrome\n```\n\nTemporary scripts run inside the harness process with full daemon and Node.js access.\n\n---\n\n## Troubleshooting\n\n| Symptom | Fix |\n|---------|-----|\n| `DevToolsActivePort not found` | Open `chrome://inspect/#remote-debugging`, tick the checkbox, click Allow |\n| Connection fails after Chrome restart | Run `/browser-reload-daemon` |\n| Page seems loaded but content is missing | SPA — call `browser_snapshot` again, or `browser_execute_js` for a specific element |\n| JS dialog is blocking actions | `browser_page_info` will report it — use `browser_handle_dialog` |\n| Daemon not starting | Run `/browser-setup` to re-run guided setup |\n| Snapshot didn't return `@(x,y)` for a target | Element wasn't recognized as interactive. Fall back to `browser_execute_js` with `getBoundingClientRect()` |\n\n---\n\n## Contributing\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for setup, conventions, and PR process.\n\n---\n\n## Security\n\nThis extension drives your real Chrome. The agent can see open tabs, read page content, submit forms, and act inside authenticated sessions. `browser_run_script` evaluates JavaScript in the pi process with full `require` access — review any temporary scripts before executing them.\n\n---\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famankumarsingh77%2Fpi-browser-harness","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famankumarsingh77%2Fpi-browser-harness","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famankumarsingh77%2Fpi-browser-harness/lists"}