{"id":20560279,"url":"https://github.com/shaoxiongdu/skyeye","last_synced_at":"2025-07-31T13:12:56.597Z","repository":{"id":176853626,"uuid":"656722318","full_name":"shaoxiongdu/SkyEye","owner":"shaoxiongdu","description":"一个基于SpringBoot的全网热点爬虫项目，原始热搜数据会入库，分词统计会存入Redis。方便之后的数据分析。","archived":false,"fork":false,"pushed_at":"2023-07-06T03:39:55.000Z","size":1073,"stargazers_count":17,"open_issues_count":0,"forks_count":5,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-07T18:44:10.565Z","etag":null,"topics":["crawler","crawlers","mysql","redis","spring","spring-boot"],"latest_commit_sha":null,"homepage":"http://web.shaoxiongdu.cn","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shaoxiongdu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-21T13:59:17.000Z","updated_at":"2025-02-26T07:33:35.000Z","dependencies_parsed_at":null,"dependency_job_id":"0bbbea01-ace5-488e-bd59-025300856682","html_url":"https://github.com/shaoxiongdu/SkyEye","commit_stats":null,"previous_names":["shaoxiongdu/skyeye"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/shaoxiongdu/SkyEye","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shaoxiongdu%2FSkyEye","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shaoxiongdu%2FSkyEye/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shaoxiongdu%2FSkyEye/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shaoxiongdu%2FSkyEye/manife
sts","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shaoxiongdu","download_url":"https://codeload.github.com/shaoxiongdu/SkyEye/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shaoxiongdu%2FSkyEye/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":268045219,"owners_count":24186753,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-31T02:00:08.723Z","response_time":66,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","crawlers","mysql","redis","spring","spring-boot"],"created_at":"2024-11-16T03:54:06.557Z","updated_at":"2025-07-31T13:12:56.588Z","avatar_url":"https://github.com/shaoxiongdu.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch3 align=\"center\"\u003eSkyEyeSystem\u003c/h3\u003e\n\n  \u003cp align=\"center\"\u003e\n     A SpringBoot-based crawler project for trending topics across the web\n    \u003cbr /\u003e\n    \u003ca href=\"./README.md\"\u003e中文\u003c/a\u003e\n    ·\n    \u003ca href=\"./README_en.md\"\u003eEnglish\u003c/a\u003e\n  \u003c/p\u003e\n\u003cdetails open=\"open\"\u003e\n  \u003csummary\u003eTable of Contents\u003c/summary\u003e\n  \u003col\u003e\n    \u003cli\u003e\n      \u003ca href=\"#about-the-project\"\u003eAbout the Project\u003c/a\u003e\n    \u003c/li\u003e\n    \u003cli\u003e\n      \u003ca href=\"#quick-start\"\u003eQuick Start\u003c/a\u003e\n      \u003cul\u003e\n        \u003cli\u003e\u003ca 
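href=\"#prerequisites\"\u003ePrerequisites\u003c/a\u003e\u003c/li\u003e\n        \u003cli\u003e\u003ca href=\"#installation\"\u003eInstallation\u003c/a\u003e\u003c/li\u003e\n      \u003c/ul\u003e\n    \u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#usage\"\u003eUsage\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#contributing\"\u003eContributing\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#license\"\u003eLicense\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#contact\"\u003eContact\u003c/a\u003e\u003c/li\u003e\n  \u003c/ol\u003e\n\u003c/details\u003e\n\n## About the Project\n\n![image-20230705153710250](https://images-1301128659.cos.ap-beijing.myqcloud.com/shaoxiongdu/202307051537338.png)\n\nThe project crawls trending-search data from across the web on a schedule, every day at 3 p.m. The sources include:\n\n- Weibo hot search\n- Bilibili hot search\n- CSDN hot search\n- Zhihu hot search\n- Toutiao\n- Baidu hot search\n- Juejin\n- 36Kr\n- Tencent News\n- SSPAI\n\nAfter each crawl, the project:\n\n1. Stores the raw data in MySQL.\n2. Computes word-frequency statistics and stores them in Redis.\n\n## Quick Start\n\nThis section explains how to get the project running quickly.\n\n### Prerequisites\n\nMake sure Maven is available as your build tool.\n\n### Installation\n\n1. Sync the Maven dependencies.\n2. Initialize the database with the [SQL script](src/main/resources/db/ddl.sql).\n3. Set your database address in the application configuration.\n4. Set the Redis address in [config](src/main/resources/config/redis.setting).\n5. Start the application.\n\n## Usage\n\n#### 1. Run the crawler manually\n\nSend a GET request to /api/hotspot/crawler.\n\n#### 2. Configure the crawler schedule\n\nChange the annotation value in the [crawler task](src/main/java/cn/shoxiongdu/SkyEyeSystem/task/hotspot/crawl/CrawlerTask.java).\nIt accepts a standard cron expression, which you can generate online with the [Cron expression generator](http://cron.ciding.cc/).\n\n```java\n\npublic class CrawlerTask {\n\n   @Scheduled(cron = \"0 */10 9-23 * * *\") // Runs every ten minutes during hours 9 through 23, every day.\n   public void crawl() {\n      // ...\n   }\n\n}\n```\n\n#### 3. Add a crawler implementation for a new platform\n\n1. 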
First, add a record for the platform to the platform table hot_platform. For example:\n\n   ``` sql\n   INSERT INTO sky_eye_system.hot_platform \n   VALUES (2, \n           '微博',\n           'https://ts3.cn.mm.bing.net/th?id=ODLS.05d45f55-2151-4d66-83e5-d10018607094\u0026w=32\u0026h=32\u0026qlt=90\u0026pcl=fffffa\u0026o=6\u0026pid=1.2',\n           '随时随地发现新鲜事！微博带你欣赏世界上每一个精彩瞬间，了解每一个幕后故事。分享你想表达的，让全世界都能听到你的心声！',\n           'https://weibo.com', \n           '随时随地发现新鲜事！', \n           '王志东', \n           null, \n           null, \n           0);\n   ```\n\n2. Under [src/main/java/cn/shoxiongdu/SkyEyeSystem/task/hotspot/crawl/impl],\n   add a class for the platform that implements the [HotDataCrawler](src/main/java/cn/shoxiongdu/SkyEyeSystem/task/hotspot/crawl/HotDataCrawler.java) interface.\n\n   ``` java\n   \n   public class XXXCrawler implements HotDataCrawler {\n       \n       // The id of this platform in the platform table\n       private static final Long PLATFORM_ID = ${platformId};\n       \n       private PlatformMapper platformMapper;\n       \n       @Override\n       public List\u003cHotSpot\u003e crawlHotSpotData() {\n           // Run your custom crawling logic and return the resulting HotSpot list.\n           return hotSpotList;\n       }\n       \n       @Override\n       public Platform getPlatform() {\n           return platformMapper.selectById(PLATFORM_ID);\n       }\n   }\n   \n   ```\n\n3. Implement the crawlHotSpotData method: run your custom crawling logic, wrap the crawled data in a List of HotSpot, and return it.\n\n4. Change the value of the PLATFORM_ID constant to the id of your platform in the platform table.\n\n5. Register the implementation class in the Spring container ( @Component/@Service ).\n\n6. Done. The scheduled task will now run your crawling logic and store the results in the database, and the home page will display the corresponding data.\n\n## Contributing\n\nContributions are what make the open-source community such a great place to learn, inspire, and create. Any contribution you make is greatly appreciated.\n\n1. Fork the project\n\n2. Create a feature branch\n\n3. Commit your changes\n\n4. Push to the branch\n\n5. Open a pull request\n\n## License\n\nDistributed under the MIT License; please follow its terms when redistributing: [MIT License](LICENSE)\n\n## Contact\n\n- 杜少雄 email@shaoxiongdu.cn\n- WeChat: 15603430511\n- Blog: https://shaoxiongdu.cn\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshaoxiongdu%2Fskyeye","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshaoxiongdu%2Fskyeye","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshaoxiongdu%2Fskyeye/lists"}