{"id":50796866,"url":"https://github.com/smouj/agent-browser","last_synced_at":"2026-06-12T15:01:57.954Z","repository":{"id":353337320,"uuid":"1218985026","full_name":"smouj/agent-browser","owner":"smouj","description":"Give any AI agent a real browser. REST API + Web Dashboard + Vision AI. Control browsers via Playwright from OpenClaw, Hermes, or any LLM. Built with Next.js and TypeScript.","archived":false,"fork":false,"pushed_at":"2026-05-02T13:37:23.000Z","size":9776,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-02T15:30:01.133Z","etag":null,"topics":["accessibility-tree","agent-framework","ai-agent","browser-automation","computer-vision","headless-browser","hermes","llm","mcp","nextjs","open-source","openclaw","playwright","rest-api","screenshot","typescript","vision-ai","web-automation","web-scraping"],"latest_commit_sha":null,"homepage":"https://smouj.github.io/agent-browser/","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/smouj.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-23T12:15:02.000Z","updated_at":"2026-05-02T13:37:26.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/smouj/agent-browser","commit_stats":null,"previous_names":["smouj/agent-browser"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/smouj/agent-browser","rep
ository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smouj%2Fagent-browser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smouj%2Fagent-browser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smouj%2Fagent-browser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smouj%2Fagent-browser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/smouj","download_url":"https://codeload.github.com/smouj/agent-browser/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/smouj%2Fagent-browser/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34249561,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-12T02:00:06.859Z","response_time":109,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accessibility-tree","agent-framework","ai-agent","browser-automation","computer-vision","headless-browser","hermes","llm","mcp","nextjs","open-source","openclaw","playwright","rest-api","screenshot","typescript","vision-ai","web-automation","web-scraping"],"created_at":"2026-06-12T15:01:50.555Z","updated_at":"2026-06-12T15:01:57.945Z","avatar_url":"https://github.com/smouj.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u00
3cdiv align=\"center\"\u003e\n\n\u003cimg width=\"754\" height=\"754\" alt=\"57401968-e100-42a6-9014-893721b9d23e\" src=\"https://github.com/user-attachments/assets/939750cb-18aa-49e0-ac28-af0a8a0c1d0f\" /\u003e\n\n# AgentBrowser\n\n**AI-Powered Browser Automation Platform**\n\n[![MIT License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)\n[![v1.0.0](https://img.shields.io/badge/version-1.0.0-green.svg)](https://github.com/smouj/agent-browser/releases)\n[![Next.js 16](https://img.shields.io/badge/Next.js-16-black)](https://nextjs.org/)\n[![TypeScript](https://img.shields.io/badge/TypeScript-5.x-3178C6)](https://www.typescriptlang.org/)\n[![Playwright](https://img.shields.io/badge/Playwright-1.59-2EAD33)](https://playwright.dev/)\n[![Tailwind CSS 4](https://img.shields.io/badge/Tailwind_CSS-4-06B6D4)](https://tailwindcss.com/)\n[![Prisma](https://img.shields.io/badge/Prisma-6.x-2D3748)](https://www.prisma.io/)\n\n*Give any AI agent a real browser. REST API + Web Dashboard + Vision AI.*\n\n[Documentation](https://smouj.github.io/agent-browser) \u0026middot; [Quick Start](#-quick-start) \u0026middot; [API Docs](#-rest-api) \u0026middot; [AI Agent Guide](#-using-with-ai-agents) \u0026middot; [Architecture](#-architecture)\n\n\u003cimg src=\"assets/screenshot-wiki.png\" alt=\"AgentBrowser browsing Wikipedia\" width=\"100%\" style=\"border-radius:12px;border:1px solid #333\"\u003e\n\n\u003c/div\u003e\n\n---\n\n## What is AgentBrowser?\n\nAgentBrowser is an open-source browser automation platform designed to be the **hands and eyes of AI agents**. 
It provides a REST API and real-time WebSocket interface for controlling real browser instances — letting LLMs, CLIs, and autonomous agents navigate the web, interact with pages, extract data, and perform complex multi-step tasks.\n\nBuilt on **Next.js 16**, **Playwright**, and **TypeScript**, it ships with a professional web dashboard for real-time monitoring and control, a comprehensive REST API for programmatic access, and a built-in Vision AI system that provides screenshots, simplified DOM, accessibility trees, and interactive element detection.\n\n**Why AgentBrowser?**\n\n- Most browser automation tools are designed for testing. AgentBrowser is designed for **AI agents**.\n- Clean REST API that any LLM can call — no browser-specific knowledge required.\n- Vision AI reduces web pages to structured data that language models can reason about.\n- Session persistence means agents can pick up where they left off.\n- Compatible with **OpenClaw**, **Hermes**, **OpenAI Function Calling**, and any custom agent framework.\n\n---\n\n## Features\n\n### Browser Engine\n- Multi-browser support — **Chromium**, **Firefox**, and **WebKit** via Playwright\n- Session management — Persistent cookies, localStorage, and browser state across requests\n- Headless \u0026 headed modes — Run headless on servers, headed for debugging\n- Configurable viewports — Custom screen sizes and resolutions\n- Proxy support — Route traffic through HTTP/HTTPS proxies with authentication\n\n### 25+ Browser Actions\n- **Navigation** — `navigate`, `goBack`, `goForward`, `reload`\n- **Mouse** — `click`, `dblclick`, `hover`, `rightClick`\n- **Keyboard** — `type`, `press`, `select`\n- **Scrolling** — `scroll` with direction and element targeting\n- **Waiting** — `wait`, `waitForSelector`, `waitForNavigation`\n- **Screenshots** — Full page, element-specific, PNG/JPEG\n- **JavaScript** — `evaluate` arbitrary JS in the page context\n- **Cookies** — `getCookies`, `setCookies`, `clearCookies`\n- **Storage** — 
`getLocalStorage`, `setLocalStorage`, `clearLocalStorage`\n- **Info** — `getUrl`, `getTitle`, `getContent`\n\n### Vision AI\n\n\u003cimg src=\"assets/screenshot-vision.png\" alt=\"Vision AI Panel\" width=\"100%\" style=\"border-radius:12px;border:1px solid #333\"\u003e\n\n- **Screenshots** — Base64 PNG/JPEG screenshots on demand\n- **Simplified DOM** — Cleaned HTML with interactive elements highlighted\n- **Accessibility tree** — Full a11y tree for screen-reader-style understanding\n- **Interactive element detection** — Auto-detects buttons, links, inputs, and more with selectors and coordinates\n- **Page metadata** — Title, description, OG tags, favicon, language\n\n### Developer Experience\n- **REST API** — 8 clean JSON endpoints compatible with any LLM, CLI, or SDK\n- **Real-time WebSocket** — Live updates on actions, screenshots, and session changes via Socket.IO\n- **Web Dashboard** — Professional dark-themed UI with session management, live preview, and action logs\n- **TypeScript** — End-to-end type safety with exported types\n- **SQLite + Prisma** — Zero-config persistence for sessions and action logs\n- **Action logging** — Every browser action is recorded with timing and results\n\n---\n\n## Quick Start\n\n### Prerequisites\n\n- **Node.js** 20.9+ (the minimum version Next.js 16 supports) or **Bun** 1.x\n- **Playwright browsers** (installed automatically via postinstall)\n\n### Install\n\n```bash\n# Clone the repository\ngit clone https://github.com/smouj/agent-browser.git\ncd agent-browser\n\n# Install dependencies\nbun install\n# or: npm install\n\n# Set up the database\nbun run db:push\n# or: npx prisma db push\n\n# Install Playwright browsers (if not already installed)\nbunx playwright install chromium\n# or: npx playwright install chromium\n```\n\n### Run\n\n```bash\n# Development mode (hot reload, port 3000)\nbun run dev\n\n# Production mode\nbun run build\nbun run start\n```\n\nOpen [http://localhost:3000](http://localhost:3000) to access the web dashboard.\n\n### One-Line Setup\n\n```bash\ngit clone 
https://github.com/smouj/agent-browser.git \u0026\u0026 cd agent-browser \u0026\u0026 bun install \u0026\u0026 bun run db:push \u0026\u0026 bunx playwright install chromium \u0026\u0026 bun run dev\n```\n\n---\n\n## AI Agent Integration Guide\n\n### How It Works\n\nAgentBrowser follows a simple **observe-think-act loop**:\n\n```\n1. CREATE SESSION  → POST /api/browser/sessions\n2. NAVIGATE        → POST /sessions/{id}/action  { action: \"navigate\" }\n3. OBSERVE         → POST /sessions/{id}/vision   (screenshot + DOM + elements)\n4. THINK           → LLM analyzes vision data and decides next action\n5. ACT             → POST /sessions/{id}/action  { action: \"click/type/scroll\" }\n6. REPEAT 3-5      → Until task is complete\n7. CLEANUP         → DELETE /sessions/{id}\n```\n\n### OpenClaw Integration\n\nOpenClaw uses tool-calling to interact with external services. Register AgentBrowser as a set of tools:\n\n**1. Start AgentBrowser:**\n```bash\nbun run dev  # http://localhost:3000\n```\n\n**2. Create a tool definition** in your OpenClaw project (`tools/browser.yaml`):\n```yaml\nname: browser_navigate\ndescription: \"Navigate the browser to a URL\"\nendpoint: \"http://localhost:3000/api/browser/sessions/{session_id}/action\"\nmethod: POST\nparameters:\n  session_id:\n    type: string\n    description: \"Active browser session ID\"\n  action:\n    type: string\n    default: \"navigate\"\n  target:\n    type: string\n    description: \"URL to navigate to\"\n```\n\n**3. Register all tools:** `browser_navigate`, `browser_click`, `browser_type`, `browser_vision`, `browser_screenshot`, `browser_scroll`\n\n**4. Create a session at startup** and pass the `session_id` to all tool calls\n\n**5. Use Vision AI** to let the agent \"see\" the page before deciding what to do\n\n### Hermes Integration\n\nHermes supports MCP (Model Context Protocol) servers. 
Add AgentBrowser to your Hermes config:\n\n```yaml\n# hermes.config.yaml\nmcp_servers:\n  agentbrowser:\n    type: \"rest\"\n    base_url: \"http://localhost:3000/api/browser\"\n    tools:\n      - name: \"create_session\"\n        path: \"/sessions\"\n        method: \"POST\"\n      - name: \"execute_action\"\n        path: \"/sessions/{session_id}/action\"\n        method: \"POST\"\n      - name: \"get_vision\"\n        path: \"/sessions/{session_id}/vision\"\n        method: \"POST\"\n      - name: \"close_session\"\n        path: \"/sessions/{session_id}\"\n        method: \"DELETE\"\n```\n\n### OpenAI Function Calling\n\nDefine AgentBrowser as an OpenAI tool:\n\n```python\ntools = [{\n    \"type\": \"function\",\n    \"function\": {\n        \"name\": \"browser_navigate\",\n        \"description\": \"Navigate browser to a URL\",\n        \"parameters\": {\n            \"type\": \"object\",\n            \"properties\": {\n                \"url\": {\"type\": \"string\"}\n            },\n            \"required\": [\"url\"]\n        }\n    }\n}, {\n    \"type\": \"function\",\n    \"function\": {\n        \"name\": \"browser_vision\",\n        \"description\": \"Get AI vision snapshot - screenshot, DOM, interactive elements\",\n        \"parameters\": {\n            \"type\": \"object\",\n            \"properties\": {\n                \"full_page\": {\"type\": \"boolean\", \"default\": False}\n            }\n        }\n    }\n}, {\n    \"type\": \"function\",\n    \"function\": {\n        \"name\": \"browser_click\",\n        \"description\": \"Click an element on the page using a CSS selector\",\n        \"parameters\": {\n            \"type\": \"object\",\n            \"properties\": {\n                \"selector\": {\"type\": \"string\"}\n            },\n            \"required\": [\"selector\"]\n        }\n    }\n}]\n```\n\n### Python Agent Example\n\n```python\nimport requests\n\nBASE = \"http://localhost:3000/api/browser/sessions\"\n\n# 1. 
Create session\ns = requests.post(BASE, json={\n    \"name\": \"python-agent\",\n    \"browserType\": \"chromium\",\n    \"headless\": True\n}).json()\nsid = s[\"id\"]\n\n# 2. Navigate\nrequests.post(f\"{BASE}/{sid}/action\", json={\n    \"action\": \"navigate\",\n    \"target\": \"https://news.ycombinator.com\"\n})\n\n# 3. Get vision (for LLM)\nvision = requests.post(f\"{BASE}/{sid}/vision\", json={}).json()\n\n# 4. Pass interactive elements to your LLM\nfor el in vision[\"interactiveElements\"]:\n    print(f\"{el['type']}: {el['text']} → {el['selector']}\")\n\n# 5. Clean up\nrequests.delete(f\"{BASE}/{sid}\")\n```\n\n---\n\n## REST API\n\nAll endpoints return JSON. Base URL: `http://localhost:3000`\n\n### Sessions\n\n| Method | Endpoint | Description |\n|--------|----------|-------------|\n| `POST` | `/api/browser/sessions` | Create a new browser session |\n| `GET` | `/api/browser/sessions` | List all sessions |\n| `GET` | `/api/browser/sessions/:id` | Get session details |\n| `DELETE` | `/api/browser/sessions/:id` | Close and delete a session |\n| `POST` | `/api/browser/sessions/:id/action` | Execute a browser action |\n| `POST` | `/api/browser/sessions/:id/vision` | Get AI vision snapshot |\n| `GET/POST` | `/api/browser/sessions/:id/cookies` | Get or set cookies |\n| `GET` | `/api/browser/sessions/:id/logs` | Get paginated action logs |\n\n### Create a Session\n\n```bash\ncurl -X POST http://localhost:3000/api/browser/sessions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"name\": \"my-agent-session\",\n    \"browserType\": \"chromium\",\n    \"headless\": true\n  }'\n```\n\n**Request body:**\n\n| Field | Type | Default | Description |\n|-------|------|---------|-------------|\n| `name` | `string` | Auto-generated | Session name |\n| `browserType` | `\"chromium\" \\| \"firefox\" \\| \"webkit\"` | `\"chromium\"` | Browser engine |\n| `headless` | `boolean` | `true` | Run without UI |\n| `proxy` | `object` | `undefined` | Proxy config (`server`, 
`username?`, `password?`) |\n| `viewport` | `object` | `{width: 1280, height: 720}` | Viewport size |\n| `userAgent` | `string` | Browser default | Custom user agent |\n| `locale` | `string` | `\"en-US\"` | Browser locale |\n| `timezone` | `string` | `\"America/New_York\"` | Browser timezone |\n\n### Execute an Action\n\n```bash\n# Navigate\ncurl -X POST http://localhost:3000/api/browser/sessions/{id}/action \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"action\":\"navigate\",\"target\":\"https://example.com\"}'\n\n# Click\ncurl -X POST http://localhost:3000/api/browser/sessions/{id}/action \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"action\":\"click\",\"target\":\"button#submit\"}'\n\n# Type\ncurl -X POST http://localhost:3000/api/browser/sessions/{id}/action \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"action\":\"type\",\"target\":\"input#search\",\"value\":\"hello\",\"options\":{\"pressEnter\":true}}'\n\n# Screenshot\ncurl -X POST http://localhost:3000/api/browser/sessions/{id}/action \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"action\":\"screenshot\",\"options\":{\"fullPage\":true}}'\n```\n\n### Get Vision Snapshot\n\n```bash\ncurl -X POST http://localhost:3000/api/browser/sessions/{id}/vision \\\n  -H \"Content-Type: application/json\" \\\n  -d '{}'\n```\n\nReturns: screenshot (base64), simplified DOM, accessibility tree, interactive elements with selectors, and page metadata.\n\n### Supported Actions\n\n| Category | Action | `target` | `value` | `options` |\n|----------|--------|----------|---------|-----------|\n| **Navigation** | `navigate` | URL | — | `waitUntil`, `timeout` |\n| | `goBack` | — | — | — |\n| | `goForward` | — | — | — |\n| | `reload` | — | — | `waitUntil` |\n| **Mouse** | `click` | CSS selector | — | `button`, `clickCount`, `delay` |\n| | `dblclick` | CSS selector | — | `delay` |\n| | `hover` | CSS selector | — | — |\n| | `rightClick` | CSS selector | — | `delay` |\n| **Keyboard** | `type` | CSS 
selector | Text to type | `clear`, `pressEnter`, `delay` |\n| | `press` | — | Key name | — |\n| | `select` | CSS selector | Option value | — |\n| **Scroll** | `scroll` | Direction | Pixel amount | `element` |\n| **Wait** | `wait` | — | Milliseconds | — |\n| | `waitForSelector` | CSS selector | — | `state`, `timeout` |\n| | `waitForNavigation` | — | — | `waitUntil`, `timeout` |\n| **Capture** | `screenshot` | — | — | `fullPage`, `element`, `quality`, `type` |\n| **JS** | `evaluate` | — | JS expression | — |\n| **Cookies** | `getCookies` | — | — | — |\n| | `setCookies` | — | JSON cookies array | — |\n| | `clearCookies` | — | — | — |\n| **Storage** | `getLocalStorage` | — | — | — |\n| | `setLocalStorage` | — | JSON key-value pairs | — |\n| | `clearLocalStorage` | — | — | — |\n| **Info** | `getUrl` | — | — | — |\n| | `getTitle` | — | — | — |\n| | `getContent` | — | — | — |\n\n---\n\n## WebSocket Events\n\nAgentBrowser emits real-time events via Socket.IO for live dashboards and reactive agent loops.\n\n```javascript\nconst socket = io('http://localhost:3000');\n\nsocket.on('action', (event) =\u003e {\n  console.log(`[${event.sessionId}] ${event.action} → ${event.result.success ? 
'OK' : 'FAIL'}`);\n});\n\nsocket.on('session_update', (event) =\u003e {\n  console.log(`Session ${event.sessionId}: ${event.data.currentUrl}`);\n});\n\nsocket.on('screenshot', (event) =\u003e {\n  const img = Buffer.from(event.screenshot, 'base64');\n});\n```\n\n**Event types:** `session_created`, `session_update`, `session_closed`, `action`, `screenshot`\n\n---\n\n## Configuration\n\n### Environment Variables\n\nCreate a `.env` file in the project root:\n\n```env\n# Database (SQLite)\nDATABASE_URL=\"file:./custom.db\"\n\n# Server\nPORT=3000\nNODE_ENV=\"development\"\n\n# Browser defaults\nDEFAULT_BROWSER_TYPE=\"chromium\"\nDEFAULT_HEADLESS=true\nDEFAULT_VIEWPORT_WIDTH=1280\nDEFAULT_VIEWPORT_HEIGHT=720\n\n# Session limits\nSESSION_TIMEOUT_MS=3600000\nMAX_CONCURRENT_SESSIONS=10\n```\n\n---\n\n## Architecture\n\n```\n┌─────────────────────────────────────────────────────┐\n│                    Clients                          │\n│         LLMs · CLIs · Web Dashboard                │\n└──────────────┬──────────────────┬──────────────────┘\n               │  REST API        │  WebSocket\n┌──────────────▼──────────────────▼──────────────────┐\n│              Next.js API Routes                     │\n│    /api/browser/sessions/*                         │\n└──────────────┬─────────────────────────────────────┘\n               │\n┌──────────────▼─────────────────────────────────────┐\n│           Browser Engine Layer                      │\n│    Session Manager · Action Executor · Vision       │\n└──────────────┬─────────────────────────────────────┘\n               │\n┌──────────────▼─────────────────────────────────────┐\n│              Playwright                             │\n│    Chromium · Firefox · WebKit                     │\n└────────────────────────────────────────────────────┘\n               │\n┌──────────────▼─────────────────────────────────────┐\n│          SQLite (via Prisma)                        │\n│    Sessions · Action Logs                           
│\n└────────────────────────────────────────────────────┘\n```\n\n| Module | Path | Description |\n|--------|------|-------------|\n| API Routes | `src/app/api/browser/sessions/` | REST endpoints for session and action management |\n| Browser Engine | `src/lib/browser/engine.ts` | Singleton session lifecycle manager |\n| Action Executor | `src/lib/browser/actions.ts` | 25+ browser actions with logging |\n| Vision System | `src/lib/browser/vision.ts` | Screenshots, DOM simplification, a11y tree |\n| Types | `src/lib/browser/types.ts` | TypeScript interfaces for all API types |\n| Database | `prisma/schema.prisma` | Session and action log persistence |\n| Dashboard | `src/app/page.tsx` | Web UI with session sidebar and live updates |\n\n---\n\n## Tech Stack\n\n| Technology | Purpose |\n|------------|---------|\n| [Next.js 16](https://nextjs.org/) | Full-stack React framework, API routes |\n| [TypeScript](https://www.typescriptlang.org/) | End-to-end type safety |\n| [Tailwind CSS 4](https://tailwindcss.com/) | Utility-first styling |\n| [shadcn/ui](https://ui.shadcn.com/) | UI component library |\n| [Playwright](https://playwright.dev/) | Browser automation engine |\n| [Prisma](https://www.prisma.io/) | Type-safe database ORM (SQLite) |\n| [Socket.IO](https://socket.io/) | Real-time WebSocket communication |\n| [Zustand](https://zustand.docs.pmnd.rs/) | Client-side state management |\n\n---\n\n## Project Structure\n\n```\nagent-browser/\n├── assets/                         # Logo, screenshots, OG image\n├── docs/                           # GitHub Pages documentation\n│   └── index.html                  # Full documentation site\n├── src/\n│   ├── app/\n│   │   ├── api/browser/sessions/\n│   │   │   ├── route.ts            # POST (create), GET (list)\n│   │   │   └── [id]/\n│   │   │       ├── route.ts        # GET (detail), DELETE (close)\n│   │   │       ├── action/route.ts # POST (execute action)\n│   │   │       ├── vision/route.ts # POST (vision snapshot)\n│   │ 
  │       ├── cookies/route.ts# GET, POST (cookies)\n│   │   │       └── logs/route.ts   # GET (action logs)\n│   │   ├── layout.tsx\n│   │   ├── page.tsx                # Web dashboard\n│   │   └── globals.css\n│   ├── components/ui/              # shadcn/ui components\n│   ├── lib/\n│   │   ├── browser/\n│   │   │   ├── engine.ts           # Session lifecycle manager\n│   │   │   ├── actions.ts          # 25+ action executor\n│   │   │   ├── vision.ts           # Vision AI system\n│   │   │   ├── session.ts          # Session helpers\n│   │   │   ├── types.ts            # TypeScript interfaces\n│   │   │   └── utils.ts            # Utility functions\n│   │   ├── db.ts                   # Prisma client\n│   │   └── utils.ts                # General utilities\n│   └── hooks/                      # React hooks\n├── prisma/\n│   └── schema.prisma               # Database schema\n├── public/                         # Static assets\n└── package.json\n```\n\n---\n\n## Comparison with Alternatives\n\nAgentBrowser vs other AI browser automation tools:\n\n| Feature | AgentBrowser | Browser Use | Stagehand | Playwright MCP |\n|---------|-------------|-------------|----------|----------------|\n| Self-hosted | Yes | Yes | No (cloud) | Yes |\n| REST API | 8 endpoints | No | SDK only | MCP only |\n| Multi-browser | Chromium, Firefox, WebKit | Chromium only | Chromium only | All 3 |\n| Vision AI | Screenshots, DOM, a11y tree, elements | Screenshot only | DOM only | None |\n| Session persistence | Cookies, localStorage, DB | Memory only | No | No |\n| Web Dashboard | Yes (dark theme) | No | No | No |\n| WebSocket events | Yes | No | No | No |\n| 25+ browser actions | Yes | Limited | Limited | Basic |\n| Agent agnostic | Any LLM/CLI | Python only | JS/Python | Any MCP client |\n| License | MIT | MIT | Commercial | MIT |\n| Database | SQLite + Prisma | None | Cloud | None |\n\n**Notable alternatives not in the table:**\n\n- **[LaVague](https://lavague.ai)** — RAG-based, 
research-focused approach with no REST API.\n- **[Skyvern](https://skyvern.com)** — Commercial cloud platform with no self-hosting option, workflow-only.\n- **WebVoyager** — Academic/research project, not production-ready.\n\nAmong the tools compared here, AgentBrowser is the only one that combines a full REST API, a Vision AI system, session persistence, and a web dashboard in a single open-source package.\n\n---\n\n## Contributing\n\nContributions are welcome! Please read our [Contributing Guidelines](CONTRIBUTING.md) before submitting a pull request.\n\n1. Fork the repository\n2. Create your feature branch (`git checkout -b feature/amazing-feature`)\n3. Commit your changes (`git commit -m 'Add amazing feature'`)\n4. Push to the branch (`git push origin feature/amazing-feature`)\n5. Open a Pull Request\n\n---\n\n## License\n\nThis project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details. Copyright 2026.\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n**Built with Next.js, Playwright, and TypeScript**\n\nMade for AI agents, by developers who build with AI agents.\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsmouj%2Fagent-browser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsmouj%2Fagent-browser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsmouj%2Fagent-browser/lists"}