{"id":44834167,"url":"https://github.com/n1byn1kt/apitap","last_synced_at":"2026-03-09T02:03:17.970Z","repository":{"id":338481986,"uuid":"1157993273","full_name":"n1byn1kt/apitap","owner":"n1byn1kt","description":"The MCP server that turns any website into an API — no docs, no SDK, no browser. npm: @apitap/core","archived":false,"fork":false,"pushed_at":"2026-03-06T04:31:29.000Z","size":13319,"stargazers_count":71,"open_issues_count":1,"forks_count":5,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-03-06T08:30:21.067Z","etag":null,"topics":["ai-agent","api","browser-automation","mcp","mcp-server","playwright","skill-file","web-scraping"],"latest_commit_sha":null,"homepage":"https://www.apitap.io","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/n1byn1kt.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":".github/SECURITY.md","support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-14T16:20:58.000Z","updated_at":"2026-03-06T05:55:01.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/n1byn1kt/apitap","commit_stats":null,"previous_names":["n1byn1kt/apitap"],"tags_count":30,"template":false,"template_full_name":null,"purl":"pkg:github/n1byn1kt/apitap","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/n1byn1kt%2Fapitap","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/n1byn1kt%2Fapitap/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/n1byn1kt%2Fapitap/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/n1byn1kt%2Fapitap/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/n1byn1kt","download_url":"https://codeload.github.com/n1byn1kt/apitap/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/n1byn1kt%2Fapitap/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30231466,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-07T19:01:10.287Z","status":"ssl_error","status_checked_at":"2026-03-07T18:59:58.103Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agent","api","browser-automation","mcp","mcp-server","playwright","skill-file","web-scraping"],"created_at":"2026-02-17T01:05:08.157Z","updated_at":"2026-03-09T02:03:17.909Z","avatar_url":"https://github.com/n1byn1kt.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ApiTap\n\n[![npm version](https://img.shields.io/npm/v/@apitap/core)](https://www.npmjs.com/package/@apitap/core)\n[![tests](https://img.shields.io/badge/tests-1051%20passing-brightgreen)](https://github.com/n1byn1kt/apitap)\n[![license](https://img.shields.io/badge/license-BSL--1.1-blue)](./LICENSE)\n\n**The MCP server that turns any website into an API — no docs, no SDK, no browser.**\n\nApiTap is an MCP server that lets AI agents browse the web through APIs instead of browsers. When an agent needs data from a website, ApiTap automatically detects the site's framework (WordPress, Next.js, Shopify, etc.), discovers its internal API endpoints, and calls them directly — returning clean JSON instead of forcing the agent to render and parse HTML. For sites that need authentication, it opens a browser window for a human to log in, captures the session tokens, and hands control back to the agent. Every site visited generates a reusable \"skill file\" that maps the site's APIs, so the first visit is a discovery step and every subsequent visit is a direct, instant API call. It works with any MCP-compatible LLM client and reduces token costs by 20-100x compared to browser automation.\n\nThe web was built for human eyes; ApiTap makes it native to machines.\n\n```bash\n# One tool call: discover the API + replay it\napitap browse https://techcrunch.com\n  ✓ Discovery: WordPress detected (medium confidence)\n  ✓ Replay: GET /wp-json/wp/v2/posts → 200 (10 articles)\n\n# Or read content directly — no browser needed\napitap read https://en.wikipedia.org/wiki/Node.js\n  ✓ Wikipedia decoder: ~127 tokens (vs ~4,900 raw HTML)\n\n# Or step by step:\napitap capture https://polymarket.com    # Watch API traffic\napitap show gamma-api.polymarket.com     # See what was captured\napitap replay gamma-api.polymarket.com get-events  # Call the API directly\n```\n\nNo scraping. No browser. Just the API.\n\n![ApiTap demo](https://raw.githubusercontent.com/n1byn1kt/apitap/main/docs/demo.gif)\n\n---\n\n## How It Works\n\n1. **Capture** — Launch a Playwright browser, visit a site, browse normally. ApiTap intercepts all network traffic via CDP.\n2. **Filter** — Scoring engine separates signal from noise. Analytics, tracking pixels, and framework internals are filtered out. Only real API endpoints survive.\n3. **Generate** — Captured endpoints are grouped by domain, URLs are parameterized (`/users/123` → `/users/:id`), and a JSON skill file is written to `~/.apitap/skills/`.\n4. **Replay** — Read the skill file, substitute parameters, call the API with `fetch()`. Zero dependencies in the replay path.\n\n```\nCapture:  Browser → Playwright listener → Filter → Skill Generator → skill.json\nReplay:   Agent → Replay Engine (skill.json) → fetch() → API → JSON response\n```\n\n## Install\n\n```bash\nnpm install -g @apitap/core\n```\n\n**Claude Code** — one command to wire it up:\n\n```bash\nclaude mcp add -s user apitap -- apitap-mcp\n```\n\nThat's it. 12 MCP tools, ready to go. Requires Node.js 20+.\n\n\u003e **Optional:** To use `capture` and `browse` (which open a real browser), also run:\n\u003e ```bash\n\u003e npx playwright install chromium\n\u003e ```\n\u003e The `read`, `peek`, and `discover` tools work without it.\n\n## Quick Start\n\n### Capture API traffic\n\n```bash\n# Capture from a single domain (default)\napitap capture https://polymarket.com\n\n# Capture all domains (CDN, API subdomains, etc.)\napitap capture https://polymarket.com --all-domains\n\n# Include response previews in the skill file\napitap capture https://polymarket.com --preview\n\n# Stop after 30 seconds\napitap capture https://polymarket.com --duration 30\n```\n\nApiTap opens a browser window. Browse the site normally — click around, scroll, search. Every API call is captured. Press Ctrl+C when done.\n\n### List and explore captured APIs\n\n```bash\n# List all skill files\napitap list\n  ✓ gamma-api.polymarket.com       3 endpoints   2m ago\n  ✓ www.reddit.com                 2 endpoints   1h ago\n\n# Show endpoints for a domain\napitap show gamma-api.polymarket.com\n  [green] ✓ GET    /events                        object (3 fields)\n  [green] ✓ GET    /teams                         array (12 fields)\n\n# Search across all skill files\napitap search polymarket\n```\n\n### Replay an endpoint\n\n```bash\n# Replay with captured defaults\napitap replay gamma-api.polymarket.com get-events\n\n# Override parameters\napitap replay gamma-api.polymarket.com get-events limit=5 offset=10\n\n# Machine-readable JSON output\napitap replay gamma-api.polymarket.com get-events --json\n```\n\n## Text-Mode Browsing\n\nApiTap includes a text-mode browsing pipeline — `peek` and `read` — that lets agents consume web content without launching a browser. Seven built-in decoders extract structured content from popular sites at a fraction of the token cost:\n\n| Site | Decoder | Typical Tokens | vs Raw HTML |\n|------|---------|----------------|-------------|\n| Reddit | `reddit` | ~627 | 93% smaller |\n| YouTube | `youtube` | ~36 | 99% smaller |\n| Wikipedia | `wikipedia` | ~127 | 97% smaller |\n| Hacker News | `hackernews` | ~200 | 90% smaller |\n| Grokipedia | `grokipedia` | ~150–5000+ | varies by article length |\n| Twitter/X | `twitter` | ~80 | 95% smaller |\n| Any other site | `generic` | varies | ~74% avg |\n\n**Average token savings: 74% across 83 tested domains.**\n\n```bash\n# Triage first — zero-cost HEAD request\napitap peek https://reddit.com/r/programming\n  ✓ accessible, recommendation: read\n\n# Extract content — no browser needed\napitap read https://reddit.com/r/programming\n  ✓ Reddit decoder: 12 posts, ~627 tokens\n\n# Works for any URL — falls back to generic HTML extraction\napitap read https://example.com/blog/post\n```\n\nFor MCP agents, `apitap_peek` and `apitap_read` are the fastest way to consume web content — use them before reaching for `apitap_browse` or `apitap_capture`.\n\n## Tested Sites\n\nApiTap has been tested against real-world sites:\n\n| Site | Endpoints | Tier | Replay |\n|------|-----------|------|--------|\n| Polymarket | 3 | Green | 200 |\n| Reddit | 2 | Green | 200 |\n| Discord | 4 | Green | 200 |\n| GitHub | 1 | Green | 200 |\n| HN (Algolia) | 1 | Yellow | 200 |\n| dev.to | 2 | Green | 200 |\n| CoinGecko | 6 | Green | 200 |\n\n78% overall replay success rate across 9 tested sites (green tier: 100%).\n\n## Why ApiTap?\n\n**Why not just use the public API?** Most sites don't have one, or it's heavily rate-limited. The internal API that powers the SPA is often richer, faster, and already handles auth.\n\n**Why not just use Playwright/Puppeteer?** Browser automation costs 50-200K tokens per page for an AI agent. ApiTap captures the API once, then your agent calls it directly at 1-5K tokens. No DOM, no selectors, no flaky waits.\n\n**Why not reverse-engineer the API manually?** You could open DevTools and copy headers by hand. ApiTap does it in 30 seconds and gives you a portable file any agent can use.\n\n**Isn't this just a MITM proxy?** No. ApiTap is read-only — it uses Chrome DevTools Protocol to observe responses. No certificate setup, no request modification, no code injection.\n\n## Replayability Tiers\n\nEvery captured endpoint is classified by replay difficulty:\n\n| Tier | Meaning | Replay |\n|------|---------|--------|\n| **Green** | Public, permissive CORS, no signing | Works with `fetch()` |\n| **Yellow** | Needs auth, no signing/anti-bot | Works with stored credentials |\n| **Orange** | CSRF tokens, session binding | Fragile — may need browser refresh |\n| **Red** | Request signing, anti-bot (Cloudflare) | Needs full browser |\n\nGET endpoints are auto-verified during capture by comparing Playwright responses with raw `fetch()` responses.\n\n## MCP Server\n\nApiTap includes an MCP server with 12 tools for Claude Desktop, Cursor, Windsurf, and other MCP-compatible clients.\n\n```bash\n# Start the MCP server\napitap-mcp\n```\n\n**Claude Code** — see [Install](#install) above.\n\n**Claude Desktop / Cursor / Windsurf** — add to your MCP config:\n\n```json\n{\n  \"mcpServers\": {\n    \"apitap\": {\n      \"command\": \"apitap-mcp\"\n    }\n  }\n}\n```\n\n**VS Code (GitHub Copilot)** — add `.vscode/mcp.json`:\n\n```json\n{\n  \"servers\": {\n    \"apitap\": {\n      \"command\": \"apitap-mcp\"\n    }\n  }\n}\n```\n\n### MCP Tools\n\n| Tool | Description |\n|------|-------------|\n| `apitap_browse` | High-level \"just get me the data\" (discover + replay in one call) |\n| `apitap_peek` | Zero-cost URL triage (HEAD only) |\n| `apitap_read` | Extract content without a browser (7 decoders) |\n| `apitap_discover` | Detect a site's APIs without launching a browser |\n| `apitap_search` | Search available skill files |\n| `apitap_replay` | Replay a captured API endpoint |\n| `apitap_replay_batch` | Replay multiple endpoints in parallel across domains |\n| `apitap_capture` | Capture API traffic via instrumented browser |\n| `apitap_capture_start` | Start an interactive capture session |\n| `apitap_capture_interact` | Interact with a live capture session (click, type, scroll) |\n| `apitap_capture_finish` | Finish or abort a capture session |\n| `apitap_auth_request` | Request human authentication for a site |\n\nYou can also serve a single skill file as a dedicated MCP server with `apitap serve \u003cdomain\u003e` — each endpoint becomes its own tool.\n\n## Chrome Extension\n\nApiTap includes a Chrome extension that captures API traffic directly from your already-logged-in browser — no Playwright, no auth dance, no browser popups.\n\n**Why use the extension?**\n- You're already logged into Spotify, Discord, Reddit — the extension captures from your live session\n- No `apitap auth request` needed — real tokens are captured automatically\n- Browse naturally while it records in the background\n\n### Setup\n\n1. Build the extension:\n```bash\ncd extension \u0026\u0026 npm install \u0026\u0026 npm run build\n```\n\n2. Load in Chrome: `chrome://extensions` → Enable Developer mode → Load unpacked → select the `extension/` folder\n\n3. Wire up auto-save (one-time):\n```bash\napitap extension install --extension-id \u003cyour-extension-id\u003e\n```\nFind your extension ID at `chrome://extensions` (enable Developer mode).\n\n### Usage\n\n1. Click the ApiTap icon in Chrome → **Start Capture**\n2. Browse normally — extension records all API traffic\n3. Click **Stop** → skill files auto-save to `~/.apitap/skills/`\n\nThe popup shows CLI connection status and live capture stats. Auth tokens are automatically stored to `~/.apitap/auth.enc` with `[stored]` placeholders in the exported skill files.\n\n\u003e **Note:** Chrome Web Store submission coming soon. For now, load as an unpacked extension in Developer mode.\n\n---\n\n## Auth Management\n\nApiTap automatically detects and stores auth credentials (Bearer tokens, API keys, cookies) during capture. Credentials are encrypted at rest with AES-256-GCM.\n\n```bash\n# View auth status\napitap auth api.example.com\n\n# List all domains with stored auth\napitap auth --list\n\n# Refresh expired tokens via browser\napitap refresh api.example.com\n\n# Force fresh token before replay\napitap replay api.example.com get-data --fresh\n\n# Clear stored auth\napitap auth api.example.com --clear\n```\n\n## Skill Files\n\nSkill files are JSON documents stored at `~/.apitap/skills/\u003cdomain\u003e.json`. They contain everything needed to replay an API — endpoints, headers, query params, request bodies, pagination patterns, and response shapes.\n\n```json\n{\n  \"version\": \"1.1\",\n  \"domain\": \"gamma-api.polymarket.com\",\n  \"baseUrl\": \"https://gamma-api.polymarket.com\",\n  \"endpoints\": [\n    {\n      \"id\": \"get-events\",\n      \"method\": \"GET\",\n      \"path\": \"/events\",\n      \"queryParams\": { \"limit\": { \"type\": \"string\", \"example\": \"10\" } },\n      \"headers\": {},\n      \"responseShape\": { \"type\": \"object\", \"fields\": [\"id\", \"title\", \"slug\"] }\n    }\n  ]\n}\n```\n\nSkill files are portable and shareable. Auth credentials are stored separately in encrypted storage — never in the skill file itself.\n\n### Import / Export\n\n```bash\n# Import a skill file from someone else\napitap import ./reddit-skills.json\n\n# Import validates: signature check → SSRF scan → confirmation\n```\n\nImported files are re-signed with your local key and marked with `imported` provenance.\n\n## Security\n\nApiTap handles untrusted skill files from the internet and replays HTTP requests on your behalf. That's a high-trust position, and we treat it seriously.\n\n### Defense in Depth\n\n- **Auth encryption** — AES-256-GCM with PBKDF2 key derivation, keyed to your machine\n- **PII scrubbing** — Emails, phones, IPs, credit cards, SSNs detected and redacted during capture\n- **SSRF protection** — Multi-layer URL validation blocks access to internal networks (see below)\n- **Header injection protection** — Allowlist prevents skill files from injecting dangerous HTTP headers (`Host`, `X-Forwarded-For`, `Cookie`, `Authorization`)\n- **Redirect validation** — Manual redirect handling with SSRF re-check prevents redirect-to-internal-IP attacks\n- **DNS rebinding prevention** — Resolved IPs are pinned to prevent TOCTOU attacks where DNS returns different IPs on second lookup\n- **Skill signing** — HMAC-SHA256 signatures detect tampering; three-state provenance tracking (self/imported/unsigned)\n- **No phone-home** — Everything runs locally. No external services, no telemetry\n- **Read-only capture** — Playwright intercepts responses only. No request modification or code injection\n\n### Why SSRF Protection Matters\n\nSince skill files can come from anywhere — shared by colleagues, downloaded from GitHub, or imported from untrusted sources — a malicious skill file is the primary threat vector. Here's what ApiTap defends against:\n\n**The attack:** An attacker crafts a skill file with `baseUrl: \"http://169.254.169.254\"` (the AWS/cloud metadata endpoint) or `baseUrl: \"http://localhost:8080\"` (your internal services). When you replay an endpoint, your machine makes the request, potentially leaking cloud credentials or hitting internal APIs.\n\n**The defense:** ApiTap validates every URL at multiple points:\n\n```\nSkill file imported\n  → validateUrl(): block private IPs, internal hostnames, non-HTTP schemes\n  → validateSkillFileUrls(): scan baseUrl + all endpoint example URLs\n\nEndpoint replayed\n  → resolveAndValidateUrl(): DNS lookup + verify resolved IP isn't private\n  → IP pinning: fetch uses resolved IP directly (prevents DNS rebinding)\n  → Header filtering: strip dangerous headers from skill file\n  → Redirect check: if server redirects, validate new target before following\n```\n\n**Blocked ranges:** `127.0.0.0/8`, `10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16`, `169.254.0.0/16` (cloud metadata), `0.0.0.0`, IPv6 equivalents (`::1`, `fe80::/10`, `fc00::/7`, `::ffff:` mapped addresses), `localhost`, `.local`, `.internal`, `file://`, `javascript:` schemes.\n\nThis is especially relevant now that [MCP servers are being used as attack vectors in the wild](https://cloud.google.com/blog/topics/threat-intelligence/distillation-experimentation-integration-ai-adversarial-use) — Google's Threat Intelligence Group recently documented underground toolkits built on compromised MCP servers. ApiTap is designed to be safe even when processing untrusted inputs.\n\n\n\n## CLI Reference\n\nAll commands support `--json` for machine-readable output.\n\n| Command | Description |\n|---------|-------------|\n| `apitap browse \u003curl\u003e` | Discover + replay in one step |\n| `apitap peek \u003curl\u003e` | Zero-cost URL triage (HEAD only) |\n| `apitap read \u003curl\u003e` | Extract content without a browser |\n| `apitap discover \u003curl\u003e` | Detect APIs without launching a browser |\n| `apitap capture \u003curl\u003e` | Capture API traffic from a website |\n| `apitap list` | List available skill files |\n| `apitap show \u003cdomain\u003e` | Show endpoints for a domain |\n| `apitap search \u003cquery\u003e` | Search skill files by domain or endpoint |\n| `apitap replay \u003cdomain\u003e \u003cid\u003e [key=val...]` | Replay an API endpoint |\n| `apitap import \u003cfile\u003e` | Import a skill file with safety validation |\n| `apitap refresh \u003cdomain\u003e` | Refresh auth tokens via browser |\n| `apitap auth [domain]` | View or manage stored auth |\n| `apitap serve \u003cdomain\u003e` | Serve a skill file as an MCP server |\n| `apitap inspect \u003curl\u003e` | Discover APIs without saving |\n| `apitap stats` | Show token savings report |\n| `apitap audit` | Audit stored skill files and credentials |\n| `apitap forget \u003cdomain\u003e` | Remove skill file and credentials for a domain |\n| `apitap --version` | Print version |\n\n### Capture flags\n\n| Flag | Description |\n|------|-------------|\n| `--all-domains` | Capture traffic from all domains (default: target domain only) |\n| `--preview` | Include response data previews |\n| `--duration \u003csec\u003e` | Stop capture after N seconds |\n| `--port \u003cport\u003e` | Connect to specific CDP port |\n| `--launch` | Always launch a new browser |\n| `--attach` | Only attach to existing browser |\n| `--no-scrub` | Disable PII scrubbing |\n| `--no-verify` | Skip auto-verification of GET endpoints |\n\n## Development\n\n```bash\ngit clone https://github.com/n1byn1kt/apitap.git\ncd apitap\nnpm install\nnpm test          # 1051 tests, Node built-in test runner\nnpm run typecheck # Type checking\nnpm run build     # Compile to dist/\nnpx tsx src/cli.ts capture \u003curl\u003e  # Run from source\n```\n\n## Contact\n\nQuestions, feedback, or issues? → **[hello@apitap.io](mailto:hello@apitap.io)**\n\n## License\n\n[Business Source License 1.1](./LICENSE) — **free for all non-competing use** (personal, internal, educational, research, open source). Cannot be rebranded and sold as a competing service. Converts to Apache 2.0 on February 7, 2029.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fn1byn1kt%2Fapitap","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fn1byn1kt%2Fapitap","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fn1byn1kt%2Fapitap/lists"}