{"id":48832433,"url":"https://github.com/scrapeless-ai/webunlocker-skill","last_synced_at":"2026-04-14T21:02:51.119Z","repository":{"id":345191009,"uuid":"1184057826","full_name":"scrapeless-ai/webunlocker-skill","owner":"scrapeless-ai","description":"OpenClaw skill of Scrapeless Web Unlocker for Web Scraping, reCAPTCHA \u0026 Cloudflare Solving, and AI Data Collection.","archived":false,"fork":false,"pushed_at":"2026-03-18T02:16:59.000Z","size":19,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-18T18:39:25.907Z","etag":null,"topics":["ai-agent","ai-agents","ai-search","chatgpt-scraper","cloudflare-solver","llm-scraping","openclaw","openclaw-skills","perplexity","scraping-api","web-scraping","web-unblocker"],"latest_commit_sha":null,"homepage":"https://www.scrapeless.com/en/product/universal-scraping-api","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/scrapeless-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-17T07:57:25.000Z","updated_at":"2026-03-18T02:20:24.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/scrapeless-ai/webunlocker-skill","commit_stats":null,"previous_names":["scrapeless-ai/webunlocker-skill"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/scrapeless-ai/webunlocker-skill","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/repositories/scrapeless-ai%2Fwebunlocker-skill","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapeless-ai%2Fwebunlocker-skill/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapeless-ai%2Fwebunlocker-skill/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapeless-ai%2Fwebunlocker-skill/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/scrapeless-ai","download_url":"https://codeload.github.com/scrapeless-ai/webunlocker-skill/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapeless-ai%2Fwebunlocker-skill/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31815080,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-14T18:05:02.291Z","status":"ssl_error","status_checked_at":"2026-04-14T18:05:01.765Z","response_time":153,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agent","ai-agents","ai-search","chatgpt-scraper","cloudflare-solver","llm-scraping","openclaw","openclaw-skills","perplexity","scraping-api","web-scraping","web-unblocker"],"created_at":"2026-04-14T21:02:41.758Z","updated_at":"2026-04-14T21:02:51.111Z","avatar_url":"https://github.com/scrapeless-ai.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n[\u003cimg 
width=\"1200\" height=\"629\" alt=\"20260318-100141\" src=\"https://github.com/user-attachments/assets/3a0f7070-d6ad-4ebe-ab5e-07de5356a46a\" /\u003e](https://www.scrapeless.com/en/product/universal-scraping-api)\n\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003eOpenClaw skill of Scrapeless Web Unlocker for Web Scraping, Cloudflare Solving, and AI Data Collection.\u003c/strong\u003e\u003cbr/\u003e\n\u003c/p\u003e\n\n  \u003cp align=\"center\"\u003e\n    \u003ca href=\"https://www.youtube.com/@Scrapeless\" target=\"_blank\"\u003e\n      \u003cimg src=\"https://img.shields.io/badge/Follow%20on%20YouTuBe-FF0033?style=for-the-badge\u0026logo=youtube\u0026logoColor=white\" alt=\"Follow on YouTuBe\" /\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://discord.com/invite/xBcTfGPjCQ\" target=\"_blank\"\u003e\n      \u003cimg src=\"https://img.shields.io/badge/Join%20our%20Discord-5865F2?style=for-the-badge\u0026logo=discord\u0026logoColor=white\" alt=\"Join our Discord\" /\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://x.com/Scrapelessteam\" target=\"_blank\"\u003e\n      \u003cimg src=\"https://img.shields.io/badge/Follow%20us%20on%20X-000000?style=for-the-badge\u0026logo=x\u0026logoColor=white\" alt=\"Follow us on X\" /\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://www.reddit.com/r/Scrapeless\" target=\"_blank\"\u003e\n      \u003cimg src=\"https://img.shields.io/badge/Join%20us%20on%20Reddit-FF4500?style=for-the-badge\u0026logo=reddit\u0026logoColor=white\" alt=\"Join us on Reddit\" /\u003e\n    \u003c/a\u003e \n    \u003ca href=\"https://app.scrapeless.com/passport/register?utm_source=official\u0026utm_term=githubopen\" target=\"_blank\"\u003e\n      \u003cimg src=\"https://img.shields.io/badge/Official%20Website-12A594?style=for-the-badge\u0026logo=google-chrome\u0026logoColor=white\" alt=\"Official Website\"/\u003e\n    \u003c/a\u003e\n  \u003c/p\u003e\n\n---\n\n# 🤖 Scrapeless Openclaw WebUnlocker Skill\n\nA skill for the 
Scrapeless platform that enables you to solve website blocks and scrape web content using the Scrapeless Universal Scraping API. It supports JavaScript rendering, CAPTCHA solving, IP rotation, and intelligent request retries.\n\n## Overview\n\nThe **Web Unlocker Skill** allows developers and AI agents to **access and extract data from websites that normally block automated traffic**.\nBuilt on top of the **Scrapeless Universal Scraping API**, this skill automatically handles common bot protections such as **Cloudflare, CAPTCHA challenges, IP blocking, and JavaScript rendering**, making it easy to retrieve clean web data from difficult targets.\nInstead of managing proxy pools, headless browsers, and bypass logic yourself, Web Unlocker provides a **simple API interface to reliably fetch web pages at scale**.\nThis makes it ideal for **web scraping, data pipelines, AI training datasets, market intelligence, and automation workflows**.\n\n## ❓ Why Use Web Unlocker\n\nModern websites deploy increasingly sophisticated bot detection systems such as:\n\n- Cloudflare protection  \n- CAPTCHA challenges  \n- Browser fingerprint detection  \n- IP reputation blocking  \n- JavaScript-rendered content  \n\nTraditional scraping tools or headless browsers often fail against these protections.\n\n**Web Unlocker solves this by combining:**\n\n- Stealth browser infrastructure  \n- Proxy rotation  \n- CAPTCHA solving  \n- Intelligent retry mechanisms  \n\n👉 Developers only need to send a request — the platform handles the rest.\n\n---\n\n## ✨ Key Features\n\n**🤖 Automatic CAPTCHA Solving**\n- Supports reCAPTCHA, Cloudflare Turnstile and Cloudflare challenge pages. 
\n\n**🌐 JavaScript Rendering**\n- Execute full browser rendering for modern frameworks such as **React, Next.js, and Vue**.\n\n**🌍 Global Proxy Infrastructure**\n- Built-in proxy rotation and country selection for higher success rates and geo-targeted scraping.\n\n**📦 Multiple Response Formats**\n- Retrieve data in various formats:\n\n  - HTML  \n  - Plain text  \n  - Markdown  \n  - Screenshots (PNG / JPEG)  \n  - Network requests  \n  - Structured extracted content  \n\n**🔁 Intelligent Retry System**\n- Automatically retries failed requests using optimized routing.\n\n---\n\n## 🧩 Use Cases\n\n**📊 Web Scraping \u0026 Data Extraction**\n- Collect structured data from e-commerce, search engines, directories, and public websites.  \n\n**🤖 AI Training Data Collection**\n- Gather high-quality datasets for LLM training, AI evaluation, or synthetic data generation.  \n\n**📈 Market Intelligence**\n- Monitor competitors, pricing data, product catalogs, and industry signals. \n\n**🔍 SEO \u0026 AI Search Monitoring**\n- Track how websites appear across search engines and AI-powered search platforms. \n\n**⚙️ Automation \u0026 AI Agents**\n- Integrate web data directly into **AI agents, workflows, or automation platforms**.\n\n---\n\n## Installation\n\n1. Clone the repository:\n\n```bash\ngit clone https://github.com/scrapeless-ai/webunlocker-skill.git\n```\n\n2. Install dependencies for WebUnlocker:\n\n```bash\ncd webunlocker-skill\npip install -r requirements.txt\n```\n\n## ⚙️ Environment Configuration\n\n1. **Manual installation**: Place the skill in OpenClaw’s `.openclaw/skills` directory.\n  \n2. Create a `.env` file in the root directory based on the `.env.example` file:\n\n```bash\ncp .env.example .env\n```\n\n3. 
Add your Scrapeless API token to the `.env` file:\n\n```\nX_API_TOKEN=your_api_token_here\n```\n\nYou can obtain an API token from the [Scrapeless website](https://www.scrapeless.com).\n\n## Usage Examples\n\n```bash\n# Scrape HTML content\npython3 scripts/webunlocker.py --url \"https://httpbin.io/get\"\n\n# Scrape as Markdown\npython3 scripts/webunlocker.py --url \"https://example.com\" --response-type markdown\n\n# Take a screenshot\npython3 scripts/webunlocker.py --url \"https://example.com\" --response-type png\n\n# Extract specific content types\npython3 scripts/webunlocker.py --url \"https://example.com\" --response-type content --content-types emails,links,images\n\n# Use a US proxy\npython3 scripts/webunlocker.py --url \"https://example.com\" --country US\n\n# Use POST method\npython3 scripts/webunlocker.py --url \"https://httpbin.org/post\" --method POST --data '{\"key\": \"value\"}'\n\n# Add custom headers\npython3 scripts/webunlocker.py --url \"https://example.com\" --headers '{\"User-Agent\": \"Mozilla/5.0\"}'\n\n# Use custom proxy\npython3 scripts/webunlocker.py --url \"https://example.com\" --proxy-url \"http://your-proxy-url:port\"\n\n# Enable JavaScript rendering\npython3 scripts/webunlocker.py --url \"https://example.com\" --js-render\n\n# Bypass Cloudflare Turnstile challenge\npython3 scripts/webunlocker.py --url \"https://2captcha.com/demo/cloudflare-turnstile-challenge\" --js-render --headless --response-type markdown\n```\n\n## Output Structure\nWeb Unlocker supports multiple response formats depending on your needs.\n\n| Response Type | Description |\n|--------------|------------|\n| HTML | Full rendered page HTML |\n| Plaintext | Clean text without HTML tags |\n| Markdown | Structured Markdown content |\n| PNG / JPEG | Page screenshots |\n| Network | All network requests during page load |\n| Content | Extract specific data types such as emails, links, or images |\n\n## Common Issues\n\n### Rate Limits\nIf you encounter 429 errors, you've 
exceeded the rate limit. Reduce request frequency or upgrade your Scrapeless plan.\n\n### Timeouts\n- Page load timeout: 30 seconds\n- Global execution timeout: 180 seconds\n\n### CAPTCHA Solving\nWebUnlocker automatically handles reCAPTCHA v2, Cloudflare Turnstile, and Cloudflare challenge pages, but complex CAPTCHAs may occasionally fail.\n\n### Billing\n- Charges are applied on a per-request basis\n- Only successful requests will be billed\n\n## 🔗 Related Resources\n\n- [Scrapeless LLM Scraper](https://docs.scrapeless.com/en/llm-chat-scraper/quickstart/introduction/)\n- [Scrapeless Universal Scraping API](https://docs.scrapeless.com/en/universal-scraping-api/)\n\n## 📬 Contact Us\nFor questions, suggestions, or collaboration inquiries, feel free to contact us via:\n- Email/Slack: market@scrapeless.com\n- Official Website: https://www.scrapeless.com\n- Community Forum: [Browser Labs Discord](https://discord.com/invite/xBcTfGPjCQ)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscrapeless-ai%2Fwebunlocker-skill","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fscrapeless-ai%2Fwebunlocker-skill","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscrapeless-ai%2Fwebunlocker-skill/lists"}