{"id":47782810,"url":"https://github.com/ma-pony/deepspider","last_synced_at":"2026-04-03T13:51:48.813Z","repository":{"id":334663039,"uuid":"1142234916","full_name":"ma-pony/deepspider","owner":"ma-pony","description":"智能爬虫工程平台 - 基于 DeepAgents + Patchright 的 AI 爬虫 Agent | Intelligent Web Scraping Platform - AI-powered Crawler Agent built on DeepAgents + Patchright","archived":false,"fork":false,"pushed_at":"2026-03-02T10:44:37.000Z","size":1608,"stargazers_count":1,"open_issues_count":2,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-02T14:29:57.932Z","etag":null,"topics":["ai-agent","anti-detect","automation","captcha","crawler","javascript","reverse-engineering","web-scraping"],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ma-pony.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-01-26T06:09:33.000Z","updated_at":"2026-03-02T10:41:45.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ma-pony/deepspider","commit_stats":null,"previous_names":["ma-pony/jsforge"],"tags_count":15,"template":false,"template_full_name":null,"purl":"pkg:github/ma-pony/deepspider","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ma-pony%2Fdeepspider","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ma-pony%2Fdeepspider/tags","releases_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/ma-pony%2Fdeepspider/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ma-pony%2Fdeepspider/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ma-pony","download_url":"https://codeload.github.com/ma-pony/deepspider/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ma-pony%2Fdeepspider/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31355353,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-03T08:03:20.796Z","status":"ssl_error","status_checked_at":"2026-04-03T08:00:37.834Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agent","anti-detect","automation","captcha","crawler","javascript","reverse-engineering","web-scraping"],"created_at":"2026-04-03T13:51:48.177Z","updated_at":"2026-04-03T13:51:48.806Z","avatar_url":"https://github.com/ma-pony.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DeepSpider\n\n[![npm version](https://img.shields.io/npm/v/deepspider.svg)](https://www.npmjs.com/package/deepspider)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n\u003e An AI-native platform for countering anti-scraping defenses - compresses 3 days of reverse-engineering work into 10 
minutes\n\n[English](README_EN.md)\n\n## Core Features\n\n**AI-First Architecture** - AI at the core, tools in a supporting role\n- Understands obfuscated code directly (no deobfuscation preprocessing required)\n- Identifies encryption algorithms, with regex hints assisting the LLM analysis\n- Generates runnable code (Python/JS)\n- Unified model configuration; users choose a local or cloud LLM\n\n**Full Anti-Scraping Countermeasures**\n- Reverse engineering: AI reads the JS source and generates a Python implementation\n- CAPTCHA handling: OCR, slider, click-to-select\n- Anti-detection: fingerprint spoofing, proxy rotation\n- Crawler orchestration: AI generates a complete project\n\n**Real Browser + CDP**\n- Patchright anti-detection browser\n- Deep CDP integration (hooks, breakpoints, interception)\n- Analysis panel built into the browser\n- Real-time data capture (zero API cost)\n\n## Quick Start\n\n### Installation\n\n```bash\nnpm install -g deepspider\n```\n\n### Configuration\n\n```bash\ndeepspider config set apiKey sk-ant-api03-xxx\ndeepspider config set baseUrl https://api.anthropic.com\ndeepspider config set model claude-opus-4-6\n```\n\n### Usage\n\n```bash\n# Analyze the target site\ndeepspider https://example.com\n\n# Quick HTTP request (lightweight)\ndeepspider fetch https://api.example.com\n```\n\n## Workflow\n\n1. **Launch**: `deepspider https://target-site.com`\n2. **Wait**: the browser opens and records data automatically\n3. **Interact**: log in, page through results, trigger the target request\n4. **Select**: click ⦿ in the panel to select the target data\n5. **Analyze**: choose an action (trace source / analyze encryption / generate crawler)\n6. **Chat**: keep asking questions to dig deeper\n\n## Architecture\n\n```\nAI-Native Architecture (v2.0)\n\nMain Agent (AI-driven)\n├── AI understanding layer (core, 80%)\n│   ├── Understand obfuscated code directly\n│   ├── Identify encryption algorithms\n│   └── Generate Python code\n├── Tool verification layer (supporting, 15%)\n│   ├── Data capture (browser + CDP)\n│   ├── Dynamic verification (hooks + debugging)\n│   └── Code execution (sandbox verification)\n└── Capability extension layer (optional, 5%)\n    ├── CAPTCHA handling\n    ├── Anti-detection\n    └── Crawler orchestration\n```\n\n## Encryption Analysis\n\n**Hints + LLM architecture**:\n- 34 regex patterns (MD5/SHA/AES/RSA/SM2/SM3/SM4, etc.) automatically extract encryption-type hints\n- Hints are injected into the LLM prompt as auxiliary information to improve analysis accuracy\n- All analysis is performed by the user-configured LLM (local or cloud, unified configuration)\n- No intermediate cache layer, which avoids misjudgments caused by cache poisoning\n\n## Documentation\n\n- [Development and Usage Guide](docs/GUIDE.md)\n- [Debugging Guide](docs/DEBUG.md)\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fma-pony%2Fdeepspider","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fma-pony%2Fdeepspider","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fma-pony%2Fdeepspider/lists"}