{"id":24773184,"url":"https://github.com/seaung/suspider","last_synced_at":"2026-05-16T17:04:51.846Z","repository":{"id":274663677,"uuid":"595076449","full_name":"seaung/suspider","owner":"seaung","description":"A PyQt5-based website crawler that supports multi-level page crawling and custom configuration.","archived":false,"fork":false,"pushed_at":"2025-01-28T17:01:16.000Z","size":24,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-23T18:38:26.004Z","etag":null,"topics":["crawler-python","qt5","qt5-browswer-spider","url-finder"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/seaung.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-01-30T10:48:03.000Z","updated_at":"2025-10-15T02:57:17.000Z","dependencies_parsed_at":"2025-01-28T17:32:10.835Z","dependency_job_id":"c1e82c12-014a-4eb6-bac3-30abf34dde77","html_url":"https://github.com/seaung/suspider","commit_stats":null,"previous_names":["seaung/suspider"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/seaung/suspider","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seaung%2Fsuspider","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seaung%2Fsuspider/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seaung%2Fsuspider/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seaung%2Fsuspider/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/seau
ng","download_url":"https://codeload.github.com/seaung/suspider/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seaung%2Fsuspider/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33111497,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-16T04:41:52.686Z","status":"ssl_error","status_checked_at":"2026-05-16T04:41:52.009Z","response_time":115,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler-python","qt5","qt5-browswer-spider","url-finder"],"created_at":"2025-01-29T04:39:30.413Z","updated_at":"2026-05-16T17:04:51.813Z","avatar_url":"https://github.com/seaung.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Suspider Website Crawler\n\nA PyQt5-based website crawler that supports multi-level page crawling and custom configuration.\n\n## Features\n\n- Multi-level page crawling\n- Configurable crawl depth\n- Custom request delay\n- Configurable log level\n- Graceful shutdown\n- Automatic filtering of invalid links\n- SQLite data storage\n\n## Requirements\n\n- Python 3.6+\n- PyQt5\n- See requirements.txt for other dependencies\n\n## Installation\n\n1. Clone the repository:\n\n```bash\ngit clone https://github.com/seaung/suspider.git\ncd suspider\n```\n\n2. 
Install the dependencies:\n\n```bash\npip install -r requirements.txt\n```\n\n## Usage\n\n### Basic usage\n\n```bash\npython main.py \u003curl\u003e\n```\n\n### Command-line arguments\n\n- `url`: URL of the website to crawl (required)\n- `-d, --depth`: crawl depth, default 3\n- `-t, --delay`: request delay in seconds, default 1.0\n- `--log-level`: log level, one of DEBUG, INFO, WARNING, ERROR, CRITICAL; default INFO\n\n### Examples\n\n```bash\n# Crawl a website with the default settings\npython main.py https://example.com\n\n# Set the crawl depth to 5 with a 2-second delay\npython main.py https://example.com -d 5 -t 2.0\n\n# Set the log level to DEBUG\npython main.py https://example.com --log-level DEBUG\n```\n\n## Notes\n\n1. Follow the target website's robots.txt rules\n2. Set a reasonable request delay to avoid putting load on the target website\n3. Greater crawl depth means longer run time; choose a depth that fits your needs","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseaung%2Fsuspider","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fseaung%2Fsuspider","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseaung%2Fsuspider/lists"}