{"id":18296453,"url":"https://github.com/benderpan/searchonline","last_synced_at":"2025-04-09T08:43:21.130Z","repository":{"id":116268237,"uuid":"115857036","full_name":"BenDerPan/SearchOnline","owner":"BenDerPan","description":"Returns search-engine results for a query and fetches the content of a specified URL; suitable for use as a remote plugin.","archived":false,"fork":false,"pushed_at":"2018-01-02T08:44:01.000Z","size":20,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-15T02:46:10.466Z","etag":null,"topics":["baidu","bing","engine","google","search"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BenDerPan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-12-31T10:36:31.000Z","updated_at":"2017-12-31T11:34:51.000Z","dependencies_parsed_at":null,"dependency_job_id":"d0b1e996-e026-441f-bc54-674d52340866","html_url":"https://github.com/BenDerPan/SearchOnline","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BenDerPan%2FSearchOnline","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BenDerPan%2FSearchOnline/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BenDerPan%2FSearchOnline/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BenDerPan%2FSearchOnline/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BenDerPan","download_url":
"https://codeload.github.com/BenDerPan/SearchOnline/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248008555,"owners_count":21032553,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["baidu","bing","engine","google","search"],"created_at":"2024-11-05T14:41:03.974Z","updated_at":"2025-04-09T08:43:21.109Z","avatar_url":"https://github.com/BenDerPan.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SearchOnline [Built on Python 3.5; other Python versions are untested]\nReturns search-engine results for a query and fetches the content of a specified URL, making it suitable for use as a remote plugin.\n\n## Usage\n* Install the Python dependencies: `pip install -r requirements.txt`\n* Search:\n    ```python\n    import json\n    # WebPageOnlineEngine: see web_page_online.py\n    searchResults=WebPageOnlineEngine.search(\"习大大\",search_engines=[\"baidu\"],num_pages_for_keyword=1)\n    print(json.dumps(searchResults,indent=4,ensure_ascii=False))\n    ```\n    Output:\n    \n    ```json\n    {\n        \"baidu\": {\n            \"query\": \"习大大\",\n            \"pages\": {\n                \"1\": {\n                    \"scrape_method\": \"http\",\n                    \"requested_at\": 1514689866,\n                    \"status\": \"successful\",\n                    \"num_results_for_query\": \"搜索工具百度为您找到相关结果约11,300,000个\",\n                    \"links\": [\n                        {\n                            \"rank\": 1,\n                            \"domain\": \"www.baidu.com\",\n                            \"title\": \"习大大的“亲民范儿”——十三张图告诉你有多“暖暖哒”_央广网\",\n                            \"snippet\": \"2015年8月5日 - 
“亲吻芦山地震灾区男孩”“大雨中挽裤腿自己撑伞”“吃‘红军饭’时给战士夹菜”等一幕幕场景更是全面地让大家领略到了“习大大”朴实亲民的领导风格...\",\n                            \"visible_link\": null,\n                            \"link\": \"http://www.baidu.com/link?url=kkX4cfyK-tCNlwfcdH1T8UHm3lOukNdK55DIpyZTo3O_I1hbAFxzct2cW7B3hw06UiMhkG7_gZMG-1dfF3MyBOdyIOmaSbsBKUsMdtwjMpC\"\n                        },\n                        {\n                            \"rank\": 2,\n                            \"domain\": \"www.baidu.com\",\n                            \"title\": \"被叫“习大大” 总书记笑了_网易财经\",\n                            \"snippet\": \"2014年9月10日 - 潘聿航提到,当时牌子的内容有“习总书记辛苦了”和“习大大辛苦了”两个备选。“曾经犹豫了一番,担心用 习大大 这三个字欠妥。”但他们想,总书...\",\n                            \"visible_link\": null,\n                            \"link\": \"http://www.baidu.com/link?url=5tBlgVr6Sj0HZWOBdiSNZx6ls8G6I5ZKOKdfVzQXAQ3Bxn7DQwahn7mTe2CPCRKpRafI5a-ujrYRkxh-Tgyo-K\"\n                        }\n                    ]\n                }\n              //...remaining entries omitted\n            },\n            \"num_results\": 10\n        }\n    }\n    ```\n* Load page content:\n    ```python\n    urls = [\n            \"https://arxiv.org/pdf/1710.00811.pdf\",\n            'http://blog.csdn.net/nero_g/article/details/52912305',\n            'https://gss1.bdstatic.com/9vo3dSag_xI4khGkpoWK1HF6hhy/baike/c0%3Dbaike150%2C5%2C5%2C150%2C50/sign=c05506e79482d158af8f51e3e16372bd/c2fdfc039245d688c56332adacc27d1ed21b2451.jpg'\n        ]\n    for url in urls:\n        urlData = WebPageOnlineEngine.get_url_content(url)\n        print(json.dumps(urlData,indent=4,ensure_ascii=False))\n    ```\n    \n    Output:\n    \n    ```json\n    {\n        \"file_extension\": \".htm\",   # file type corresponding to the URL\n        \"error\": 0,       # error code: 0 means success, any other value indicates an error\n        \"error_msg\": \"\",   # error message, if an error occurred\n        \"url\": \"http://blog.csdn.net/nero_g/article/details/52912305\",   # original URL the content was loaded from\n        \"b64_data\": \"Cgo8IURPQ1RZU....==\",     # content at the URL as a base64-encoded string; decode before use\n        \"content_type_origin\": \"text/html; charset=utf-8\",  
  # raw Content-Type value from the response header\n        \"time\": 1514717424,    # timestamp at which the response was processed\n        \"content_type\": \"text/html\",     # Content-Type from the response header, media type only, without additional parameters\n        \"status\": 200    # HTTP status code returned for the URL\n    }\n    ```\n* For the complete code, see `web_page_online.py`\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenderpan%2Fsearchonline","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbenderpan%2Fsearchonline","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenderpan%2Fsearchonline/lists"}