{"id":13455531,"url":"https://github.com/ssssssss-team/spider-flow","last_synced_at":"2025-05-14T00:08:21.322Z","repository":{"id":37130587,"uuid":"250512612","full_name":"ssssssss-team/spider-flow","owner":"ssssssss-team","description":"新一代爬虫平台，以图形化方式定义爬虫流程，不写代码即可完成爬虫。","archived":false,"fork":false,"pushed_at":"2023-06-14T22:27:23.000Z","size":3382,"stargazers_count":9902,"open_issues_count":20,"forks_count":1908,"subscribers_count":93,"default_branch":"master","last_synced_at":"2025-04-10T12:41:15.475Z","etag":null,"topics":["crawler","jsoup","spider","spider-flow","web-crawler","web-spider","webcrawler","webspider","xpath"],"latest_commit_sha":null,"homepage":"https://www.spiderflow.org","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ssssssss-team.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2020-03-27T11:07:15.000Z","updated_at":"2025-04-10T01:06:37.000Z","dependencies_parsed_at":"2024-01-18T18:25:36.759Z","dependency_job_id":"c59a5ec1-e104-4f7a-9710-953e43025ebc","html_url":"https://github.com/ssssssss-team/spider-flow","commit_stats":null,"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ssssssss-team%2Fspider-flow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ssssssss-team%2Fspider-flow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ssssssss-team%2Fspider-flow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ssssssss-team%2Fspider-flow/manifests",
"owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ssssssss-team","download_url":"https://codeload.github.com/ssssssss-team/spider-flow/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254043995,"owners_count":22005053,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","jsoup","spider","spider-flow","web-crawler","web-spider","webcrawler","webspider","xpath"],"created_at":"2024-07-31T08:01:06.806Z","updated_at":"2025-05-14T00:08:21.281Z","avatar_url":"https://github.com/ssssssss-team.png","language":"Java","funding_links":[],"categories":["Java","All","置顶"],"sub_categories":["1、AI应用生态"],"readme":"\u003cp align=\"center\"\u003e\r\n    \u003cimg src=\"https://www.spiderflow.org/images/logo.svg\" width=\"600\"\u003e\r\n\u003c/p\u003e\r\n\u003cp align=\"center\"\u003e\r\n    \u003ca target=\"_blank\" href=\"https://www.oracle.com/technetwork/java/javase/downloads/index.html\"\u003e\u003cimg src=\"https://img.shields.io/badge/JDK-1.8+-green.svg\" /\u003e\u003c/a\u003e\r\n    \u003ca target=\"_blank\" href=\"https://www.spiderflow.org\"\u003e\u003cimg src=\"https://img.shields.io/badge/Docs-latest-blue.svg\"/\u003e\u003c/a\u003e\r\n    \u003ca target=\"_blank\" href=\"https://github.com/ssssssss-team/spider-flow/releases\"\u003e\u003cimg src=\"https://img.shields.io/github/v/release/ssssssss-team/spider-flow?logo=github\"\u003e\u003c/a\u003e\r\n    \u003ca target=\"_blank\" href='https://gitee.com/ssssssss-team/spider-flow'\u003e\u003cimg 
src=\"https://gitee.com/ssssssss-team/spider-flow/badge/star.svg?theme=white\" /\u003e\u003c/a\u003e\r\n    \u003ca target=\"_blank\" href='https://github.com/ssssssss-team/spider-flow'\u003e\u003cimg src=\"https://img.shields.io/github/stars/ssssssss-team/spider-flow.svg?style=social\"/\u003e\u003c/a\u003e\r\n    \u003ca target=\"_blank\" href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/:license-MIT-blue.svg\"\u003e\u003c/a\u003e\r\n    \u003ca target=\"_blank\" href=\"https://shang.qq.com/wpa/qunwpa?idkey=10faa4cf9743e0aa379a72f2ad12a9e576c81462742143c8f3391b52e8c3ed8d\"\u003e\u003cimg src=\"https://img.shields.io/badge/Join-QQGroup-blue\"\u003e\u003c/a\u003e\r\n\u003c/p\u003e\r\n\r\n[Introduction](#introduction) | [Features](#features) | [Plugins](#plugins) | \u003ca target=\"_blank\" href=\"http://demo.spiderflow.org\"\u003eDemo site\u003c/a\u003e | \u003ca target=\"_blank\" href=\"https://www.spiderflow.org\"\u003eDocumentation\u003c/a\u003e | \u003ca target=\"_blank\" href=\"https://www.spiderflow.org/changelog.html\"\u003eChangelog\u003c/a\u003e | [Screenshots](#project-screenshots) | [Other open source](#other-open-source-projects) | [Disclaimer](#disclaimer)\r\n\r\n## Introduction\r\nspider-flow is a highly flexible, configurable crawler platform in which crawlers are defined as flowcharts.\r\n\r\n## Features\r\n- [x] Supports XPath, JsonPath, CSS selectors, regular-expression extraction, and combinations of these\r\n- [x] Supports JSON, XML, and binary formats\r\n- [x] Supports multiple data sources and SQL select/selectInt/selectOne/insert/update/delete\r\n- [x] Supports crawling pages rendered dynamically by JavaScript (or loaded via Ajax)\r\n- [x] Supports proxies\r\n- [x] Supports automatic saving to a database or file\r\n- [x] Common string, date, file, and encryption/decryption functions\r\n- [x] Supports plugin extensions (custom executors, custom methods)\r\n- [x] Task monitoring and task logs\r\n- [x] Supports HTTP interfaces\r\n- [x] Supports automatic cookie management\r\n- [x] Supports custom functions\r\n\r\n## Plugins\r\n- [x] [Selenium plugin](https://gitee.com/ssssssss-team/spider-flow-selenium)\r\n- [x] [Redis plugin](https://gitee.com/ssssssss-team/spider-flow-redis)\r\n- [x] [OSS plugin](https://gitee.com/ssssssss-team/spider-flow-oss)\r\n- [x] [MongoDB plugin](https://gitee.com/ssssssss-team/spider-flow-mongodb)\r\n- [x] [IP proxy pool plugin](https://gitee.com/ssssssss-team/spider-flow-proxypool)\r\n- [x] [OCR plugin](https://gitee.com/ssssssss-team/spider-flow-ocr)\r\n- [x] [Email plugin](https://gitee.com/ssssssss-team/spider-flow-mailbox)\r\n\r\n## 
Project screenshots\r\n### Crawler list\r\n![Crawler list](https://images.gitee.com/uploads/images/2020/0412/104521_e1eb3fbb_297689.png \"list.png\")\r\n### Crawler test\r\n![Crawler test](https://images.gitee.com/uploads/images/2020/0412/104659_b06dfbf0_297689.gif \"test.gif\")\r\n### Debug\r\n![Debug](https://images.gitee.com/uploads/images/2020/0412/104741_f9e1190e_297689.png \"debug.png\")\r\n### Logs\r\n![Logs](https://images.gitee.com/uploads/images/2020/0412/104800_a757f569_297689.png \"logo.png\")\r\n\r\n## Other open-source projects\r\n- [spider-flow-vue, the front end for spider-flow](https://gitee.com/ssssssss-team/spider-flow-vue)\r\n- [magic-api, a framework that automatically maps XML-based definitions to HTTP interfaces](https://gitee.com/ssssssss-team/magic-api)\r\n- [magic-api-spring-boot-starter](https://gitee.com/ssssssss-team/magic-api-spring-boot-starter)\r\n\r\n\r\n## Disclaimer\r\nDo not use `spider-flow` for any work that may violate the law or ethical norms. Please use `spider-flow` responsibly, respect the robots exclusion protocol, and do not use `spider-flow` for any illegal purpose. By choosing to use `spider-flow`, you agree to these terms. The authors accept no liability for any legal risk or loss arising from your violation of these terms; you bear all consequences.\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fssssssss-team%2Fspider-flow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fssssssss-team%2Fspider-flow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fssssssss-team%2Fspider-flow/lists"}