{"id":50730816,"url":"https://github.com/rodbland2021/agent-screenshot","last_synced_at":"2026-06-10T08:31:50.813Z","repository":{"id":345195921,"uuid":"1184874250","full_name":"rodbland2021/agent-screenshot","owner":"rodbland2021","description":"Playwright screenshot tool with Vision-optimised tiling + desktop screen capture for AI agents","archived":false,"fork":false,"pushed_at":"2026-03-18T05:34:33.000Z","size":29,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-03-18T18:56:54.082Z","etag":null,"topics":["ai-agents","claude-code","claude-vision","mcp","playwright","screen-capture","screenshot","vision"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rodbland2021.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-18T02:34:26.000Z","updated_at":"2026-03-18T05:34:36.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/rodbland2021/agent-screenshot","commit_stats":null,"previous_names":["rodbland2021/agent-screenshot"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/rodbland2021/agent-screenshot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rodbland2021%2Fagent-screenshot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rodbland2021%2Fagent-screenshot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rodbland2021%2Fagent-screenshot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rodbland2021%2Fagent-screenshot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rodbland2021","download_url":"https://codeload.github.com/rodbland2021/agent-screenshot/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rodbland2021%2Fagent-screenshot/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34144679,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-10T02:00:07.152Z","response_time":89,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","claude-code","claude-vision","mcp","playwright","screen-capture","screenshot","vision"],"created_at":"2026-06-10T08:31:50.695Z","updated_at":"2026-06-10T08:31:50.798Z","avatar_url":"https://github.com/rodbland2021.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# agent-screenshot\n\n[![CI](https://github.com/rodbland2021/agent-screenshot/actions/workflows/ci.yml/badge.svg)](https://github.com/rodbland2021/agent-screenshot/actions/workflows/ci.yml)\n[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)\n[![Version](https://img.shields.io/badge/version-1.0.0-blue)](CHANGELOG.md)\n\n**Two tools for giving AI agents eyes.** Automated web screenshots with Vision-optimised tiling, plus desktop screen capture.\n\n**[Blog post: Why these tools exist and how I use them](https://rodbland.com/blog/screenshot-and-grab-tools/)**\n\n```bash\n# Screenshot a web page (tiled for AI Vision models)\npython screenshot.py https://example.com --full-page\n\n# Capture your desktop screen\npython grab.py\n```\n\n## What's in the box\n\n| Tool | What it does |\n|------|-------------|\n| `screenshot.py` | Playwright-based web screenshots. Full-page captures are automatically tiled into 1072x1072 chunks optimised for Claude Vision and other multimodal models. |\n| `grab.py` | Desktop screen capture using Python's `mss` library. 14 region presets (halves, thirds, quadrants) for targeting specific parts of your display. |\n\n## Install\n\n```bash\ngit clone https://github.com/rodbland2021/agent-screenshot.git\ncd agent-screenshot\npip install -r requirements.txt\nplaywright install chromium\n```\n\nVerify it works:\n\n```bash\npython screenshot.py https://example.com\n# Should print a file path like /tmp/screenshots/example-com_1234567890.jpg\n```\n\n## Screenshot tool\n\nTakes automated screenshots of any URL using headless Chromium.\n\n```bash\n# Basic screenshot\npython screenshot.py https://example.com\n\n# Mobile viewport (375x812)\npython screenshot.py https://example.com --mobile\n\n# Full page, tiled for AI Vision models\npython screenshot.py https://example.com --full-page\n\n# Dismiss cookie banners and popups\npython screenshot.py https://example.com --dismiss-popups --wait-until load --wait 3000\n\n# Custom auth header\npython screenshot.py https://my-app.example.com --header \"Authorization=Bearer mytoken\"\n\n# Screenshot a specific element\npython screenshot.py https://example.com --selector \".hero-section\"\n```\n\n### Why 1072x1072 tiles?\n\nClaude Vision and similar multimodal models have a maximum effective resolution. A 1072x15000 pixel full-page screenshot is too large to process in detail — text blurs, UI elements become unreadable. Tiling the page into 1072x1072 chunks lets the model read every label, button, and table cell.\n\nThe tool handles this automatically with `--full-page`. A long page produces 4-8 tiles, each saved as a separate file.\n\n### Options\n\n| Flag | Default | Description |\n|------|---------|-------------|\n| `--full-page` | off | Capture entire page, tile into 1072x1072 chunks |\n| `--mobile` | off | Mobile viewport (375x812) |\n| `--dismiss-popups` | off | Auto-close cookie banners, geo redirects, email popups |\n| `--selector` | — | CSS selector to screenshot a specific element |\n| `--wait N` | 0 | Extra wait in ms after page load (for SPAs) |\n| `--wait-until` | networkidle | `load`, `domcontentloaded`, or `networkidle` |\n| `--width` | 1072 | Viewport width |\n| `--height` | 1072 | Viewport height |\n| `--quality` | 85 | JPEG quality (1-100) |\n| `--png` | off | Output PNG instead of JPEG |\n| `--out` | /tmp/screenshots | Output directory |\n| `--max-height` | 15000 | Truncate pages taller than this |\n| `--header` | — | Custom HTTP header as `KEY=VALUE` (repeatable) |\n\n## Grab tool\n\nCaptures the physical desktop screen. Useful for sharing visual context that isn't a web page — design mockups, spreadsheets, error dialogs, anything on your monitor.\n\n\u003e **Requires a display.** `grab.py` needs X11 or Wayland (Linux), a desktop session (macOS/Windows), or a virtual framebuffer (`Xvfb`). It will not work on headless servers, Docker containers, or CI runners without a display server.\n\n```bash\n# Full screen\npython grab.py\n\n# Left half of screen\npython grab.py left\n\n# Top-right quadrant\npython grab.py top-right\n\n# Custom output path\npython grab.py --out /tmp/my-capture.jpg\n```\n\n### Regions\n\n`full`, `left`, `right`, `top`, `bottom`, `top-left`, `top-right`, `bottom-left`, `bottom-right`, `left-third`, `center-third`, `right-third`, `left-two-thirds`, `right-two-thirds`\n\n### Options\n\n| Flag | Default | Description |\n|------|---------|-------------|\n| `--out` | /tmp/screen.jpg | Output file path |\n| `--quality` | 85 | JPEG quality (1-100) |\n| `--monitor` | 1 | Monitor number for multi-display setups |\n\n## MCP server (native agent integration)\n\nThe MCP server gives AI agents direct tool access — no shell commands needed. The agent calls `screenshot` or `grab` as native tools.\n\n### Claude Code\n\nAdd to `~/.claude.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"agent-screenshot\": {\n      \"command\": \"python3\",\n      \"args\": [\"/path/to/agent-screenshot/mcp_server.py\"]\n    }\n  }\n}\n```\n\nRestart Claude Code. The agent now has `screenshot` and `grab` tools available natively.\n\n### OpenClaw / mcporter\n\nAdd to your MCP config (e.g. `~/.mcporter/mcporter.json`):\n\n```json\n{\n  \"mcpServers\": {\n    \"agent-screenshot\": {\n      \"command\": \"python3\",\n      \"args\": [\"/path/to/agent-screenshot/mcp_server.py\"]\n    }\n  }\n}\n```\n\n### MCP tool reference\n\n| Tool | Parameters | Description |\n|------|-----------|-------------|\n| `screenshot` | `url`, `full_page`, `mobile`, `dismiss_popups`, `selector`, `wait_ms`, `quality`, `headers` | Screenshot a URL, optionally tiled for Vision |\n| `grab` | `region`, `output_path`, `quality`, `monitor` | Capture desktop screen |\n\n### Additional dependency\n\nThe MCP server requires the `mcp` package in addition to the base requirements:\n\n```bash\npip install mcp\n```\n\n## Add to your agent's instructions\n\nFor agents that don't support MCP (or as a fallback), add to your agent's instructions:\n\n**Claude Code** (`CLAUDE.md`):\n```markdown\nAfter any visual/UI change, verify with a screenshot before reporting done:\npython /path/to/screenshot.py \u003curl\u003e --full-page\nRead the resulting screenshots and check for visual regressions.\n```\n\n**Cursor** (`.cursorrules`), **Aider** (`.aider.conf.yml`), **[OpenClaw](https://github.com/openclaw/openclaw)** (agent system prompt) — same instruction, adapted to your config format.\n\n## Requirements\n\n- **Python 3.10+** on Linux, macOS, or Windows (including WSL)\n- **screenshot.py**: Playwright + Chromium. Works on any platform including headless servers, Docker, and CI.\n- **grab.py**: mss + Pillow. Requires a physical or virtual display (X11, Wayland, macOS desktop, or Windows).\n- **mcp_server.py** (optional): `mcp` package. For native tool integration with Claude Code and OpenClaw.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frodbland2021%2Fagent-screenshot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frodbland2021%2Fagent-screenshot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frodbland2021%2Fagent-screenshot/lists"}