{"id":35122623,"url":"https://github.com/abhinav-nigam/agent-browser","last_synced_at":"2026-01-12T08:21:02.613Z","repository":{"id":330925049,"uuid":"1123827767","full_name":"abhinav-nigam/agent-browser","owner":"abhinav-nigam","description":"Browser automation for AI agents - control browsers via simple CLI commands","archived":false,"fork":false,"pushed_at":"2025-12-29T09:24:53.000Z","size":739,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-01T00:07:27.043Z","etag":null,"topics":["ai-agents","automation","browser-automation","cli","playwright","python","testing","web-testing"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/abhinav-nigam.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-27T17:56:43.000Z","updated_at":"2025-12-29T09:24:56.000Z","dependencies_parsed_at":"2026-01-01T07:09:20.545Z","dependency_job_id":null,"html_url":"https://github.com/abhinav-nigam/agent-browser","commit_stats":null,"previous_names":["abhinav-nigam/agent-browser"],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/abhinav-nigam/agent-browser","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abhinav-nigam%2Fagent-browser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abhinav-nigam%2Fagent-browser/tags","
releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abhinav-nigam%2Fagent-browser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abhinav-nigam%2Fagent-browser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/abhinav-nigam","download_url":"https://codeload.github.com/abhinav-nigam/agent-browser/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abhinav-nigam%2Fagent-browser/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28337590,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-12T06:09:07.588Z","status":"ssl_error","status_checked_at":"2026-01-12T06:05:18.301Z","response_time":98,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","automation","browser-automation","cli","playwright","python","testing","web-testing"],"created_at":"2025-12-28T00:50:25.261Z","updated_at":"2026-01-12T08:21:02.604Z","avatar_url":"https://github.com/abhinav-nigam.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# agent-browser\n\nBrowser automation for AI agents. 
Control browsers via MCP (Model Context Protocol) or CLI.\n\n[![PyPI version](https://badge.fury.io/py/ai-agent-browser.svg)](https://badge.fury.io/py/ai-agent-browser)\n[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)\n[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)\n\n## Installation\n\n```bash\npip install ai-agent-browser\nplaywright install chromium\n```\n\n## Quick Start by AI Tool\n\nMost AI coding assistants support MCP (Model Context Protocol). Add agent-browser to your tool's config and the AI handles everything automatically.\n\n### Claude Code\n\n```bash\nclaude mcp add agent-browser -- agent-browser-mcp --allow-private\n```\n\nOr manually edit `~/.claude/claude_desktop_config.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"agent-browser\": {\n      \"command\": \"agent-browser-mcp\",\n      \"args\": [\"--allow-private\"]\n    }\n  }\n}\n```\n\n### Claude Desktop\n\nEdit `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `%APPDATA%\\Claude\\claude_desktop_config.json` (Windows):\n\n```json\n{\n  \"mcpServers\": {\n    \"agent-browser\": {\n      \"command\": \"agent-browser-mcp\",\n      \"args\": [\"--allow-private\"]\n    }\n  }\n}\n```\n\n### Cursor\n\nEdit `~/.cursor/mcp.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"agent-browser\": {\n      \"command\": \"agent-browser-mcp\",\n      \"args\": [\"--allow-private\"]\n    }\n  }\n}\n```\n\n### Windsurf\n\nEdit `~/.codeium/windsurf/mcp_config.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"agent-browser\": {\n      \"command\": \"agent-browser-mcp\",\n      \"args\": [\"--allow-private\"]\n    }\n  }\n}\n```\n\n### VS Code + Cline\n\nOpen Cline settings and add to MCP Servers:\n\n```json\n{\n  \"agent-browser\": {\n    \"command\": \"agent-browser-mcp\",\n    \"args\": [\"--allow-private\"]\n  }\n}\n```\n\n### gemini-cli\n\nEdit 
`~/.gemini/settings.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"agent-browser\": {\n      \"command\": \"agent-browser-mcp\",\n      \"args\": [\"--allow-private\"]\n    }\n  }\n}\n```\n\n### OpenAI Codex CLI\n\n```bash\ncodex --mcp-config mcp.json\n```\n\nCreate `mcp.json` in your project:\n\n```json\n{\n  \"mcpServers\": {\n    \"agent-browser\": {\n      \"command\": \"agent-browser-mcp\",\n      \"args\": [\"--allow-private\"]\n    }\n  }\n}\n```\n\n### Aider (CLI Mode)\n\nAider doesn't support MCP yet. Use CLI mode instead:\n\n```bash\n# Add to your aider config or prompt:\n# \"You can control a browser using agent-browser CLI commands\"\n\n# In one terminal, start the browser:\nagent-browser start http://localhost:3000 --session dev\n\n# Aider can then run commands like:\nagent-browser cmd screenshot --session dev\nagent-browser cmd click \"#login\" --session dev\nagent-browser cmd fill \"#email\" \"test@example.com\" --session dev\n```\n\n### Other MCP Clients\n\nFor any MCP-compatible client, the server command is:\n\n```bash\nagent-browser-mcp [OPTIONS]\n\nOptions:\n  --allow-private  Allow localhost/private IPs (for local development)\n  --visible        Show browser window (for debugging)\n```\n\n## What Can It Do?\n\nagent-browser provides **74 browser automation tools** organized into categories:\n\n| Category | Tools | Examples |\n|----------|-------|----------|\n| **Navigation** | 5 | `goto`, `back`, `forward`, `reload`, `get_url` |\n| **Interactions** | 9 | `click`, `fill`, `type`, `select`, `press`, `hover`, `upload` |\n| **Waiting** | 6 | `wait_for`, `wait_for_text`, `wait_for_url`, `wait_for_change` |\n| **Data Extraction** | 6 | `screenshot`, `text`, `value`, `attr`, `count`, `evaluate` |\n| **Assertions** | 3 | `assert_visible`, `assert_text`, `assert_url` |\n| **Page State** | 5 | `scroll`, `viewport`, `cookies`, `storage`, `clear` |\n| **Debugging** | 3 | `console`, `network`, `dialog` |\n| **Agent Utilities** | 7 | `page_state`, 
`validate_selector`, `suggest_next_actions`, `browser_status` |\n| **Perception** | 3 | `get_page_markdown`, `get_accessibility_tree`, `find_relative` |\n| **Advanced** | 3 | `highlight`, `mock_network`, `clear_mocks` |\n| **Video Production** | 24 | `start_recording`, `generate_voiceover`, `spotlight`, `add_text_overlay`, `camera_zoom` |\n\n**For AI agents**: See [AGENT.md](AGENT.md) for a concise reference with selector syntax, common patterns, and tool safety levels.\n\n## Feature Showcase\n\n| **Research \u0026 Extraction** | **Cross-Page Audits** | **Data Operations** |\n| :--- | :--- | :--- |\n| ![Claude Demo](demo_claude_research.gif) | ![Gemini Demo](demo_gemini_audit.gif) | ![Interpreter Demo](demo_interpreter_data.gif) |\n| *Autonomous web research* | *Multi-page analysis* | *Table scraping to structured data* |\n\n## Common Use Cases\n\n### Test a Local Web App\n\n```\nYou: \"Test the login flow on my React app at localhost:3000\"\n\nAI: I'll test the login flow for you.\n    [navigates to localhost:3000]\n    [takes screenshot of login page]\n    [fills email and password fields]\n    [clicks submit button]\n    [waits for dashboard to load]\n    [asserts URL contains /dashboard]\n    [takes screenshot of result]\n\n    Login flow works correctly. 
The user is redirected to /dashboard after successful login.\n```\n\n### Scrape Data from a Website\n\n```\nYou: \"Extract the product names and prices from this page\"\n\nAI: I'll extract the product data.\n    [uses get_page_markdown to read structured content]\n    [or uses evaluate to run JavaScript extraction]\n\n    Found 24 products:\n    - Product A: $29.99\n    - Product B: $49.99\n    ...\n```\n\n### Debug a Visual Issue\n\n```\nYou: \"The submit button isn't visible on mobile viewport\"\n\nAI: Let me check that.\n    [sets viewport to 375x667 (iPhone)]\n    [navigates to the page]\n    [uses validate_selector to check button exists]\n    [uses highlight to show button location]\n    [takes screenshot]\n\n    The button exists but is positioned off-screen. Here's a screenshot\n    with the button highlighted. The CSS needs adjustment for mobile.\n```\n\n### Create a Marketing Video\n\n```\nYou: \"Create a demo video of our checkout flow with voiceover\"\n\nAI: I'll create a marketing video with narration.\n    [generates voiceovers for each scene]\n    [starts recording at 1920x1080]\n    [navigates to checkout page]\n    [moves cursor smoothly to cart button]\n    [adds annotation \"Click to checkout\"]\n    [camera zooms into form fields]\n    [fills form with human-like typing]\n    [stops recording]\n    [merges video with voiceover audio]\n\n    Created checkout_demo.mp4 (45 seconds) with synchronized narration.\n```\n\n## Cinematic Engine (Video Production)\n\nCreate marketing-grade videos with AI-controlled browser recordings, voiceovers, and post-production.\n\n### Installation\n\n```bash\npip install ai-agent-browser[video]\n```\n\n**Requirements:**\n- `ffmpeg` installed (for video processing) - https://ffmpeg.org/\n- `ELEVENLABS_API_KEY` for high-quality voiceover (recommended)\n- `OPENAI_API_KEY` for OpenAI TTS (alternative)\n- `JAMENDO_CLIENT_ID` for royalty-free music - https://devportal.jamendo.com/\n\n### Capabilities\n\n| Phase | Tools | 
Description |\n|-------|-------|-------------|\n| **Voice \u0026 Timing** | `generate_voiceover`, `get_audio_duration` | Generate TTS audio, get timing for sync |\n| **Recording** | `start_recording`, `stop_recording`, `recording_status` | Capture video with virtual cursor |\n| **Annotations** | `annotate`, `clear_annotations` | Floating text callouts |\n| **Spotlight Effects** | `spotlight`, `clear_spotlight` | Ring highlights, spotlight dimming, focus effects |\n| **Camera** | `camera_zoom`, `camera_pan`, `camera_reset` | Ken Burns-style zoom/pan effects |\n| **Post-Production** | `check_environment`, `get_video_duration` + ffmpeg | Use ffmpeg via shell (see examples below) |\n| **Stock Music** | `list_stock_music`, `download_stock_music` | Royalty-free music from Jamendo |\n| **Polish** | `smooth_scroll`, `type_human`, `set_presentation_mode` | Human-like interactions |\n\n### Complete Workflow Example\n\n```python\n# ============================================\n# PHASE 1: PREPARATION\n# ============================================\n\n# Check environment\ncheck_environment()  # Verify ffmpeg, API keys\n\n# Generate voiceover FIRST (timing drives everything)\nvo = generate_voiceover(\n    text=\"Welcome to our product. 
Watch as we explore the key features.\",\n    voice=\"H2JKG8QcEaH9iUMauArc\",   # Abhinav - warm, natural\n    provider=\"elevenlabs\",\n    stability=0.35,                 # More expressive (less robotic)\n    similarity_boost=0.6,           # Balanced clarity\n    style=0.3                       # Some emotion\n)\nvo_duration = get_audio_duration(vo[\"data\"][\"path\"])  # ~8 seconds\n\n# Find background music\ntracks = list_stock_music(query=\"corporate inspiring\", instrumental=True, speed=\"medium\")\nmusic = download_stock_music(url=tracks[\"data\"][\"tracks\"][0][\"download_url\"])\n\n# ============================================\n# PHASE 2: RECORDING\n# ============================================\n\n# Start recording at 1080p\nstart_recording(width=1920, height=1080)\nset_presentation_mode(enabled=True)  # Hide scrollbars\n\n# Navigate and add welcome annotation\ngoto(\"https://example.com\")\nwait(500)\nannotate(\"Welcome!\", style=\"dark\", position=\"top-right\")\nwait(2000)\n\n# Spotlight the main heading with focus effect\nspotlight(selector=\"h1\", style=\"focus\", color=\"#3b82f6\", dim_opacity=0.7)\nwait(3000)\nclear_spotlight()\n\n# Camera zoom on heading\ncamera_zoom(selector=\"h1\", level=1.5, duration_ms=1000)\nwait(1500)\ncamera_reset(duration_ms=800)\nwait(500)\n\n# Smooth scroll and highlight content\nclear_annotations()\nsmooth_scroll(direction=\"down\", amount=300, duration_ms=1000)\nwait(500)\n\n# Ring highlight on paragraph\nspotlight(selector=\"p\", style=\"ring\", color=\"#10b981\", pulse_ms=1200)\nannotate(\"Key information here\", style=\"light\", position=\"right\")\nwait(2000)\n\n# Cleanup and stop\nclear_spotlight()\nclear_annotations()\nstop_result = stop_recording()\n\n# ============================================\n# PHASE 3: POST-PRODUCTION (use ffmpeg via shell)\n# ============================================\n# NOTE: Use ffmpeg directly to avoid MCP timeout issues with long video operations.\n# check_environment() provides 
full ffmpeg command examples!\n\nraw_video = stop_result[\"data\"][\"path\"]  # WebM from recording\n\n# Run these commands in your shell/terminal:\n\n# 1. Convert WebM to MP4 (required for most editing)\n# ffmpeg -i recording.webm -c:v libx264 -preset fast -crf 23 -c:a aac output.mp4\n\n# 2. Merge voiceover (starts at 1 second)\n# ffmpeg -i output.mp4 -i voiceover.mp3 -filter_complex \"[1:a]adelay=1000|1000[a1]\" -map 0:v -map \"[a1]\" -c:v copy output_with_voice.mp4\n\n# 3. Add background music (15% volume)\n# ffmpeg -i output_with_voice.mp4 -i music.mp3 -filter_complex \"[1:a]volume=0.15[bg];[0:a][bg]amix=inputs=2:duration=first[aout]\" -map 0:v -map \"[aout]\" -c:v copy output_with_music.mp4\n\n# 4. Add title overlay (centered, 0-3 seconds with fade)\n# ffmpeg -i output_with_music.mp4 -vf \"drawtext=text='Product Demo':fontsize=72:fontcolor=white:x=(w-text_w)/2:y=(h-text_h)/2:enable='between(t,0,3)'\" -c:a copy final.mp4\n\n# Result: Professional 1080p video with voiceover, music, and title\n```\n\n### Spotlight Effects\n\nDraw attention to elements with cinematic highlighting:\n\n```python\n# Ring: Glowing pulsing border\nspotlight(selector=\"button.cta\", style=\"ring\", color=\"#3b82f6\", pulse_ms=1500)\n\n# Spotlight: Dims everything except the element\nspotlight(selector=\"#hero-title\", style=\"spotlight\", dim_opacity=0.7)\n\n# Focus: Ring + spotlight combined (maximum impact)\nspotlight(selector=\".feature-card\", style=\"focus\", color=\"#10b981\", dim_opacity=0.6)\n\n# Clear all effects\nclear_spotlight()\n```\n\n### Text Overlays (ffmpeg)\n\nAdd titles, captions, and annotations in post-production using ffmpeg:\n\n```bash\n# Centered title (visible 0-4 seconds)\nffmpeg -i input.mp4 -vf \"drawtext=text='Welcome to Our Demo':fontsize=64:fontcolor=white:x=(w-text_w)/2:y=(h-text_h)/2:enable='between(t,0,4)'\" -c:a copy output.mp4\n\n# Top-positioned caption\nffmpeg -i input.mp4 -vf \"drawtext=text='Step 
1':fontsize=48:fontcolor=white:x=(w-text_w)/2:y=50\" -c:a copy output.mp4\n```\n\n### Video Concatenation (ffmpeg)\n\nJoin multiple clips using ffmpeg:\n\n```bash\n# Create a list file\necho \"file 'scene1.mp4'\" \u003e list.txt\necho \"file 'scene2.mp4'\" \u003e\u003e list.txt\necho \"file 'scene3.mp4'\" \u003e\u003e list.txt\n\n# Concatenate\nffmpeg -f concat -safe 0 -i list.txt -c copy combined.mp4\n```\n\n### Virtual Cursor\n\nThe recording includes a virtual cursor with smooth, human-like movement:\n\n```javascript\n// Cursor is controlled via JavaScript injection\nwindow.__agentCursor.moveTo(x, y, duration_ms)  // Smooth move\nwindow.__agentCursor.click(x, y)                 // Click with ripple effect\n```\n\nThe cursor uses cubic-bezier easing for natural motion, not robotic linear movement.\n\n### Best Practices\n\n1. **Generate voiceover first** - Audio duration drives video pacing\n2. **Use `check_environment()`** - Get ffmpeg command examples and verify setup\n3. **Convert WebM to MP4** - `ffmpeg -i recording.webm -c:v libx264 -preset fast output.mp4`\n4. **Use `-c:v copy`** - Skip video re-encoding when possible (much faster)\n5. **Use presentation mode** - Hides scrollbars for cleaner visuals\n6. **Wait after effects** - Let animations complete before next action\n7. **Layer effects** - Combine spotlight + annotation for maximum impact\n8. **Keep music subtle** - Use `volume=0.15` in ffmpeg for background music\n9. **Use shell for ffmpeg** - Avoids MCP timeout issues with long operations\n10. 
**Add titles in post** - Text overlays are more flexible than annotations\n\nSee `examples/cinematic_full_demo.py` for a complete working example.\n\n## Security Features\n\nagent-browser is designed for safe use with AI agents:\n\n- **SSRF Protection**: Blocks dangerous schemes (`file://`, `javascript:`, `data:`) and private IPs by default\n- **DNS Rebinding Protection**: Resolved IPs are validated against private ranges\n- **Cloud Metadata Protection**: Blocks AWS/GCP metadata endpoints (169.254.169.254)\n- **Path Sandboxing**: File operations restricted to working directory\n- **Credential Rejection**: URLs with embedded `user:pass` are blocked\n- **Sensitive Field Masking**: Password fields masked in `page_state` output\n\nUse `--allow-private` only when testing local development servers.\n\n## Advanced Configuration\n\n### Dual Instances (Headless + Visible)\n\nRun two browser instances - one for speed, one for debugging:\n\n```json\n{\n  \"mcpServers\": {\n    \"agent-browser\": {\n      \"command\": \"agent-browser-mcp\",\n      \"args\": [\"--allow-private\"]\n    },\n    \"agent-browser-visible\": {\n      \"command\": \"agent-browser-mcp\",\n      \"args\": [\"--allow-private\", \"--visible\"]\n    }\n  }\n}\n```\n\n### Configuration Options\n\n| Use Case | Args |\n|----------|------|\n| Production (SSRF protected) | `[]` |\n| Local development | `[\"--allow-private\"]` |\n| Debugging (visible browser) | `[\"--allow-private\", \"--visible\"]` |\n\n## CLI Mode\n\nFor tools that don't support MCP, or for scripting:\n\n### Basic Usage\n\n```bash\n# Terminal 1: Start browser (blocks while running)\nagent-browser start http://localhost:8080\n\n# Terminal 2: Send commands\nagent-browser cmd screenshot home\nagent-browser cmd click \"#submit\"\nagent-browser cmd fill \"#email\" \"test@example.com\"\nagent-browser cmd assert_visible \".success\"\n\n# When done\nagent-browser stop\n```\n\n### Session Management\n\nRun multiple browsers 
concurrently:\n\n```bash\n# Start separate sessions\nagent-browser start http://localhost:3000 --session app1\nagent-browser start http://localhost:4000 --session app2\n\n# Commands target specific sessions\nagent-browser cmd screenshot --session app1\nagent-browser cmd click \"#btn\" --session app2\n\n# Stop individually\nagent-browser stop --session app1\n```\n\n### Interactive Mode\n\nREPL for manual testing:\n\n```bash\nagent-browser interact http://localhost:8080\n\n\u003e screenshot initial\n\u003e click #login\n\u003e fill #email \"test@example.com\"\n\u003e assert_visible .dashboard\n\u003e quit\n```\n\n### CLI Command Reference\n\n\u003cdetails\u003e\n\u003csummary\u003eClick to expand full CLI reference\u003c/summary\u003e\n\n#### Browser Control\n| Command | Description |\n|---------|-------------|\n| `start \u003curl\u003e` | Start browser session |\n| `start \u003curl\u003e --visible` | Start with visible window |\n| `stop` | Close browser |\n| `status` | Check if browser running |\n\n#### Navigation\n| Command | Description |\n|---------|-------------|\n| `cmd goto \u003curl\u003e` | Navigate to URL |\n| `cmd back` | Go back |\n| `cmd forward` | Go forward |\n| `cmd reload` | Reload page |\n\n#### Interactions\n| Command | Description |\n|---------|-------------|\n| `cmd click \u003cselector\u003e` | Click element |\n| `cmd fill \u003cselector\u003e \u003ctext\u003e` | Fill input |\n| `cmd type \u003cselector\u003e \u003ctext\u003e` | Type with key events |\n| `cmd select \u003cselector\u003e \u003cvalue\u003e` | Select dropdown |\n| `cmd press \u003ckey\u003e` | Press key (Enter, Tab, etc.) 
|\n| `cmd scroll \u003cdirection\u003e` | Scroll (up/down/top/bottom) |\n\n#### Screenshots \u0026 Data\n| Command | Description |\n|---------|-------------|\n| `cmd screenshot [name]` | Take screenshot |\n| `cmd text \u003cselector\u003e` | Get text content |\n| `cmd value \u003cselector\u003e` | Get input value |\n| `cmd count \u003cselector\u003e` | Count elements |\n\n#### Assertions\n| Command | Description |\n|---------|-------------|\n| `cmd assert_visible \u003csel\u003e` | Check visibility |\n| `cmd assert_text \u003csel\u003e \u003ctext\u003e` | Check text content |\n| `cmd assert_url \u003cpattern\u003e` | Check URL |\n\n#### Debugging\n| Command | Description |\n|---------|-------------|\n| `cmd console` | View JS console |\n| `cmd network` | View network log |\n| `cmd wait \u003cms\u003e` | Wait milliseconds |\n| `cmd wait_for \u003cselector\u003e` | Wait for element |\n\n\u003c/details\u003e\n\n## Architecture\n\n```\n┌─────────────────┐      MCP/JSON-RPC       ┌─────────────────┐\n│   AI Assistant  │ ◄──────────────────────►│  agent-browser  │\n│ (Claude, Cursor,│                         │   MCP Server    │\n│  Gemini, etc.)  │                         │                 │\n└─────────────────┘                         └────────┬────────┘\n                                                     │\n                                                     ▼\n                                            ┌─────────────────┐\n                                            │   Playwright    │\n                                            │   (Chromium)    │\n                                            └─────────────────┘\n```\n\nThe MCP server manages browser lifecycle automatically. 
For CLI mode, a file-based IPC system coordinates between the CLI process and a persistent browser process.\n\n## Troubleshooting\n\n| Problem | Solution |\n|---------|----------|\n| \"Private IP blocked\" | Add `--allow-private` for localhost testing |\n| \"Element not found\" | Use `validate_selector` to check selector |\n| \"Timeout waiting\" | Increase timeout or use `wait_for` first |\n| Browser not responding | Check `browser_status` or restart |\n| MCP not connecting | Verify config path and restart AI tool |\n\n## Python API\n\n```python\nfrom agent_browser import BrowserDriver\n\ndriver = BrowserDriver(session_id=\"test\")\nresult = driver.send_command(\"screenshot home\")\nprint(result)\n```\n\n## Contributing\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.\n\n## License\n\nGNU General Public License v3.0 - see [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabhinav-nigam%2Fagent-browser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fabhinav-nigam%2Fagent-browser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabhinav-nigam%2Fagent-browser/lists"}