{"id":17311985,"url":"https://github.com/boris-code/feaplat","last_synced_at":"2025-04-13T09:52:03.412Z","repository":{"id":60324386,"uuid":"383193539","full_name":"Boris-code/feaplat","owner":"Boris-code","description":"Crawler management system with cluster support and elastic scaling. Runs feapder, scrapy, selenium, playwright, and other frameworks and scripts","archived":false,"fork":false,"pushed_at":"2024-04-10T08:13:28.000Z","size":53,"stargazers_count":84,"open_issues_count":5,"forks_count":21,"subscribers_count":6,"default_branch":"master","last_synced_at":"2024-04-10T09:51:20.863Z","etag":null,"topics":["crawler","feapder","feaplat","spider"],"latest_commit_sha":null,"homepage":"https://feapder.com/#/feapder_platform/feaplat","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Boris-code.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2021-07-05T15:54:19.000Z","updated_at":"2024-04-15T11:19:26.049Z","dependencies_parsed_at":"2024-02-28T07:27:48.671Z","dependency_job_id":"e1415834-85d4-4254-9d58-d9e693f62087","html_url":"https://github.com/Boris-code/feaplat","commit_stats":null,"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Boris-code%2Ffeaplat","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Boris-code%2Ffeaplat/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Boris-code%2Ffeaplat/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Boris-code%2Ffeaplat/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Boris
-code","download_url":"https://codeload.github.com/Boris-code/feaplat/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248695301,"owners_count":21146952,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","feapder","feaplat","spider"],"created_at":"2024-10-15T12:42:02.016Z","updated_at":"2025-04-13T09:52:03.390Z","avatar_url":"https://github.com/Boris-code.png","language":null,"readme":"# Crawler Management System - FEAPLAT\n\n\u003e Born a crawler, but more than a crawler\n\nThe name **feaplat** is a blend of feapder and platform.\n\nPronunciation: `[ˈfiːplæt] `\n\n![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/09/14/16316112326191.jpg)\n\n## Features\n\n1. Supports any Python script, including but not limited to `feapder` and `scrapy`\n2. Supports browser rendering, including headed mode. Supported browser drivers: `playwright`, `selenium`\n3. Supports deploying services, with automatic load balancing\n4. Supports server cluster management\n5. Supports monitoring, with customizable monitoring content\n6. Supports launching multiple instances, e.g. for distributed crawlers\n7. Supports elastic scaling\n8. Supports 4 scheduled-start modes\n9. Supports custom worker images, such as a custom Java runtime or a machine-learning environment, so you can tailor the image to your own needs (feaplat consists of `master`, the scheduler, and `worker`, which runs the tasks)\n10. One-command deployment with docker, built on a docker swarm cluster\n\n\n## Why use the feaplat crawler management system\n\n**Crawler management systems on the market**\n\n![feapderd](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/07/23/feapderd.png)\n\nWorker nodes are resident and each runs multiple tasks. They cannot scale elastically, tasks interfere with one another, and stability cannot be guaranteed.\n\n**The feaplat crawler management system**\n\n![pic](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/07/23/pic.gif)\n\nWorker nodes are created dynamically per task. Each worker runs exactly one task instance and is destroyed when the task finishes, which keeps the system stable. Work is balanced automatically across servers, with elastic scaling.\n\n\n## Feature overview\n\n### 1. Project management\n\nAdd/edit a project\n![-w1785](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/07/06/16254968151490.jpg)\n\n### 2. Task management\n\n![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2022/03/03/16463109796998.jpg)\n\n\n### 3. 
Task instances\n\nLogs\n![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2022/03/03/16463117042527.jpg)\n\n\n### 4. Crawler monitoring\n\nfeaplat can monitor how feapder crawlers are running. In addition to data monitoring and request monitoring, users can define their own monitoring content. For details, see [custom monitoring](http://feapder.com/#/source_code/%E7%9B%91%E6%8E%A7%E6%89%93%E7%82%B9?id=%e8%87%aa%e5%ae%9a%e4%b9%89%e7%9b%91%e6%8e%a7)\n\nscrapy crawlers and other Python scripts can also use monitoring through the custom monitoring feature. For details, see [custom monitoring](http://feapder.com/#/source_code/%E7%9B%91%E6%8E%A7%E6%89%93%E7%82%B9?id=%e8%87%aa%e5%ae%9a%e4%b9%89%e7%9b%91%e6%8e%a7)\n\nNote: requires feapder\u003e=1.6.6\n\n![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/09/14/16316112326191.jpg)\n\n\n\n## Deployment\n\n\u003e The steps below use CentOS as the example. For docker installation on other platforms, see the official docker documentation: https://docs.docker.com/compose/install/\n\n### 1. Install docker\n\nRemove old versions (optional, only needed when reinstalling or upgrading)\n\n```shell\nyum remove docker  docker-common docker-selinux docker-engine\n```\n\nInstall:\n```shell\nyum install -y yum-utils device-mapper-persistent-data lvm2 \u0026\u0026 python2 /usr/bin/yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo \u0026\u0026 yum install docker-ce -y\n```\nRecommended for users in mainland China\n```shell\nyum install -y yum-utils device-mapper-persistent-data lvm2 \u0026\u0026 python2 /usr/bin/yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo \u0026\u0026 yum install docker-ce -y\n```\nStart\n```shell\nsystemctl enable docker\nsystemctl start docker\n```\n\n### 2. Set up docker swarm\n    \n    docker swarm init\n    \n    # If your Docker host has multiple network interfaces and therefore multiple IPs, you must specify the IP with --advertise-addr\n    docker swarm init --advertise-addr 192.168.99.100\n\n### 3. 
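Install docker-compose\n\n```shell\nsudo curl -L \"https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)\" -o /usr/local/bin/docker-compose\nsudo chmod +x /usr/local/bin/docker-compose\n```\nRecommended for users in mainland China\n```shell\nsudo curl -L \"https://get.daocloud.io/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)\" -o /usr/local/bin/docker-compose\nsudo chmod +x /usr/local/bin/docker-compose\n```\n\n### 4. Deploy the feaplat crawler management system\n#### Prerequisites\nInstall git (version 1.8.3 is sufficient)\n```shell\nyum -y install git\n```\n#### 1. Download the project\n\n\u003e For now, clone the develop branch with the commands below and run that.\n\u003e The master branch does not support urllib3\u003e=2.0 and currently fails to run, although existing users are not affected. Once compatibility has been tested and confirmed not to affect existing users, the develop branch will be merged into master.\n\nGitHub\n```shell\ngit clone -b develop https://github.com/Boris-code/feaplat.git\n```\nGitee\n```shell\ngit clone -b develop https://gitee.com/Boris-code/feaplat.git\n```\n\n\n#### 2. Run \n\nThe first run has to pull the images, which takes a while and may fail with an error. If it does, run the command again.\n\n```shell\ncd feaplat\ndocker-compose up -d\n```\n\n- If there is a port conflict, edit the .env file; see the [FAQ](http://feapder.com/#/feapder_platform/question?id=%e4%bf%ae%e6%94%b9%e7%ab%af%e5%8f%a3)\n\n#### 3. Open the crawler management system\n\nDefault address: `http://localhost`\nDefault credentials: admin / admin\n\n- If it does not work, see the [FAQ](http://feapder.com/#/feapder_platform/question)\n- For usage instructions, see the [usage guide](http://feapder.com/#/feapder_platform/usage)\n\n#### 4. Stop (optional)\n\n```shell\ndocker-compose stop\n```\n\n### 5. Add servers (optional)\n\n\u003e Used to build a cluster by adding crawler (worker) node servers\n\n#### 1. Install docker\n\nSee deployment step 1\n\n#### 2. Deploy\n\nOn the master server (the server hosting the feaplat crawler management system), run the following command to view the token\n\n```shell\ndocker swarm join-token worker\n```\n\n**Run on the server being added**\n\n```shell\ndocker swarm join --token [token] [ip]\n```\n\nThis command joins that server to the cluster as a node\n\n#### 3. Verify that it worked\n\nOn the master server (the server hosting the feaplat crawler management system), run the following command\n\n```shell\ndocker node ls\n```\n\nIf the output lists the server you just joined, the server was added successfully\n\n#### 4. 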
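Remove a server (optional)\n\nRun on the server to be removed\n\n```shell\ndocker swarm leave\n```\n\n## Pulling private projects\n\nTo pull a private project, add the following public key to your git repository\n\n```\nssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCd/k/tjbcMislEunjtYQNXxz5tgEDc/fSvuLHBNUX4PtfmMQ07TuUX2XJIIzLRPaqv3nsMn3+QZrV0xQd545FG1Cq83JJB98ATTW7k5Q0eaWXkvThdFeG5+n85KeVV2W4BpdHHNZ5h9RxBUmVZPpAZacdC6OUSBYTyCblPfX9DvjOk+KfwAZVwpJSkv4YduwoR3DNfXrmK5P+wrYW9z/VHUf0hcfWEnsrrHktCKgohZn9Fe8uS3B5wTNd9GgVrLGRk85ag+CChoqg80DjgFt/IhzMCArqwLyMn7rGG4Iu2Ie0TcdMc0TlRxoBhqrfKkN83cfQ3gDf41tZwp67uM9ZN feapder@qq.com\n```\n\nAlternatively, configure your own SSH private key on the system settings page and add the matching public key to your git repository, for example:\n![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/10/19/16346353514967.jpg)\n\nNote: the key pair must use RSA; other key types may cause problems\n\nGenerate an RSA key pair as follows:\n```shell\nssh-keygen -t rsa -C \"comment\" -f output_path/file_name\n```\nFor example:\n`ssh-keygen -t rsa -C \"feaplat\" -f id_rsa`\nThen press Enter at every prompt and do not set a passphrase\n![](http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/11/17/16371210640228.jpg)\nThis produces the files `id_rsa` and `id_rsa.pub`. Copy the contents of `id_rsa.pub` to your git repository, and copy the contents of `id_rsa` into the feaplat crawler management system\n\n## Custom crawler images\n\nThe default crawler image bundles only the `feapder` and `scrapy` frameworks. If you need a different environment, build your own image on top of the `SPIDER_IMAGE` image named in the `.env` file\n\nFor example, to bundle commonly used Python libraries into the image\n```\nFROM registry.cn-hangzhou.aliyuncs.com/feapderd/feapder:[latest version number]\n\n# Install dependencies\nRUN pip3 install feapder \\\n    \u0026\u0026 pip3 install scrapy\n\n```\n\nCustomize the image however you like, then update the value of SPIDER_IMAGE in the `.env` file\n\n\n## Pricing\n\n| Edition | Price | Notes                         |\n|------|-----|-------------------------------|\n| Trial | 0 CNY | Up to 5 tasks can be deployed; deleting a task does not restore the quota |\n| Full | 288 CNY | Valid for one year; the license can be rebound to another server |\n\n**A fresh deployment defaults to the trial edition. After purchasing an authorization code and configuring it in the system, it becomes the full edition**\n\nHow to buy: add `boris_tm` on WeChat\n\nThe price will be adjusted gradually as features are added\n\n## Community\n\n\u003ctable border=\"0\"\u003e \n    \u003ctr\u003e \n     \u003ctd\u003e Zhishi Xingqiu (Knowledge Planet): 17321694 \u003c/td\u003e \n     \u003ctd\u003e Author's WeChat: boris_tm \u003c/td\u003e \n     \u003ctd\u003e QQ group: 750614606 \u003c/td\u003e \n    \u003c/tr\u003e \n    \u003ctr\u003e \n    \u003ctd\u003e \u003cimg src=\"http://markdown-media.oss-cn-beijing.aliyuncs.com/2020/02/16/zhi-shi-xing-qiu.jpeg\" width=250px\u003e\n \u003c/td\u003e \n     \u003ctd\u003e \u003cimg 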
src=\"http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/07/12/er-wei-ma.jpeg\" width=\"250px\" /\u003e \u003c/td\u003e \n     \u003ctd\u003e \u003cimg src=\"http://markdown-media.oss-cn-beijing.aliyuncs.com/2021/07/12/16260897330897.jpg\" width=\"250px\" /\u003e \u003c/td\u003e \n    \u003c/tr\u003e \n  \u003c/table\u003e \n  \n  When sending a friend request, include the note: feaplat\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fboris-code%2Ffeaplat","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fboris-code%2Ffeaplat","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fboris-code%2Ffeaplat/lists"}