{"id":13670275,"url":"https://github.com/walker02/yunshare","last_synced_at":"2025-04-27T09:32:15.421Z","repository":{"id":41561271,"uuid":"74126712","full_name":"walker02/yunshare","owner":"walker02","description":"百度云分享爬虫项目","archived":false,"fork":false,"pushed_at":"2016-11-18T03:19:41.000Z","size":2462,"stargazers_count":32,"open_issues_count":0,"forks_count":143,"subscribers_count":1,"default_branch":"master","last_synced_at":"2024-08-03T09:07:19.299Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"HTML","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/walker02.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-11-18T12:10:41.000Z","updated_at":"2023-09-21T04:55:45.000Z","dependencies_parsed_at":"2022-07-07T18:28:27.998Z","dependency_job_id":null,"html_url":"https://github.com/walker02/yunshare","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/walker02%2Fyunshare","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/walker02%2Fyunshare/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/walker02%2Fyunshare/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/walker02%2Fyunshare/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/walker02","download_url":"https://codeload.github.com/walker02/yunshare/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224067069,"owners_count":17250106,"icon_url":"https://git
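hub.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T09:00:37.469Z","updated_at":"2024-11-11T07:31:20.557Z","avatar_url":"https://github.com/walker02.png","language":"HTML","readme":"# Baidu Cloud Share Crawler Project\n\nThere are several open-source projects like this on GitHub, but they only provide the crawler part. On top of the crawler, this project adds modules for saving the data and building an elasticsearch index, so it can be used in a real production environment. You still need to develop the web module yourself.\n\n## Installation\n\nInstall node.js and pm2. node runs the crawler and indexing programs, and pm2 manages the node tasks.\n\nInstall mysql and mongodb. mysql stores the crawler data, and mongodb stores the final Baidu Cloud share data. That data is in JSON format, which is more convenient to store in mongodb.\n\n```\ngit clone https://github.com/callmelanmao/yunshare\ncnpm i\n```\n\nUsing the cnpm command to install the npm dependencies is recommended. The simplest way to install cnpm:\n\n```\n$ npm install -g cnpm --registry=https://registry.npm.taobao.org\n```\n\nMore commands for installing cnpm can be found at [npm.taobao.org](http://npm.taobao.org/).\n\n\n## Initialization\n\nThe crawler data (mainly the URL list) is stored in a mysql database. yunshare uses sequelizejs for ORM mapping, and the source file is `src/models/index.js`. The default mysql username and password are both root, and the database is yun. You need to create the yun database manually:\n\n```\ncreate database yun default charset utf8;\n```\n\nChange the password as needed. Once the mysql configuration is done, run the following commands:\n\n```\ngulp babel\nnode dist/init.js\n```\n\nNote that you must run `gulp babel` first to compile the ES6 code to ES5, and then run the initialization script to import the initial data. The data file is `data/hot.json`, which was saved from the page http://yun.baidu.com/pcloud/friend/gethotuserlist?type=1\u0026from=feed\u0026start=0\u0026limit=24\u0026bdstoken=ac95ef31d3979f6ee707ef75cee9f5c5\u0026clienttype=0\u0026web=1\n\n## Starting the project\n\nyunshare uses pm2 to manage the nodejs processes. Run `pm2 start process.json` to start all background tasks. To check whether the tasks are running properly, use `pm2 list`; there should be 4 tasks running.\n\n## Starting the elasticsearch index\n\nThe elasticsearch indexing program is already written, and the mapping file is `data/mapping.json`. Make sure you have installed elasticsearch 5.0 before running the indexing program with `pm2 start dist/elastic.js`.\n\nThe default elasticsearch address is http://localhost:9200. If you need to change it, edit `src/ElasticWorker.js`. After changing any js source, remember to run `gulp 
babel` and then restart the pm2 tasks, otherwise the changes will not take effect.\n\nAfter finishing the elasticsearch configuration, you can also add an elastic task to process.json so that the indexing program does not need to be started separately.\n\n## DEMO\n\n[Netdisk Search](https://biliworld.com)\n","funding_links":[],"categories":["HTML"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwalker02%2Fyunshare","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwalker02%2Fyunshare","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwalker02%2Fyunshare/lists"}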