{"id":13671708,"url":"https://github.com/BingKui/Crawler-Douban-Book","last_synced_at":"2025-04-27T18:31:35.714Z","repository":{"id":98502743,"uuid":"84285849","full_name":"BingKui/Crawler-Douban-Book","owner":"BingKui","description":"Crawls Douban book data with Node.js and saves it to a MongoDB database.","archived":false,"fork":false,"pushed_at":"2017-11-16T05:52:30.000Z","size":25,"stargazers_count":20,"open_issues_count":0,"forks_count":5,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-06T01:32:26.579Z","etag":null,"topics":["crawler-douban-book","express","log4j","mongolass","node","nodejs-mongodb"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BingKui.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-03-08T06:20:01.000Z","updated_at":"2023-05-04T06:27:25.000Z","dependencies_parsed_at":"2023-07-24T23:16:23.427Z","dependency_job_id":null,"html_url":"https://github.com/BingKui/Crawler-Douban-Book","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BingKui%2FCrawler-Douban-Book","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BingKui%2FCrawler-Douban-Book/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BingKui%2FCrawler-Douban-Book/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BingKui%2FCrawler-Douban-Book/manifests","owner_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/owners/BingKui","download_url":"https://codeload.github.com/BingKui/Crawler-Douban-Book/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251187252,"owners_count":21549609,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler-douban-book","express","log4j","mongolass","node","nodejs-mongodb"],"created_at":"2024-08-02T09:01:16.854Z","updated_at":"2025-04-27T18:31:35.696Z","avatar_url":"https://github.com/BingKui.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"# Crawler-Douban-Book\nA simple crawler built with Node.js, express, and MongoDB that scrapes book data from Douban Books and saves it to a MongoDB database.\n\n## What can you learn from this project?\n\nThis project covers the following topics.\n\n+ [What to watch out for before crawling.](./docs/robot.md)\n+ [How to implement a basic crawler.](./docs/splider.md)\n+ [How to process a page after fetching it with request.](./docs/request.md)\n+ [Basic express usage and routing.](./docs/express.md)\n+ [How to record logs with log4js.](./docs/log4js.md)\n+ [An introduction to MongoDB and how to use it.](./docs/mongodb.md)\n+ [Basic project structure design.](./docs/mvc.md)\n+ [Using mongolass to create, read, update, and delete database records.](./docs/mongolass.md)\n\n\n## Environment\n\nNode.js: `6.10.0`\n\nexpress: `4.15.2`\n\nMongoDB: `3.4.2`\n\n## Dependencies\n\ncheerio: `0.22.0`\n\nBuilds a DOM tree from the fetched HTML so the page is easy to query and manipulate.\n\nlog4js: `1.1.1`\n\nReplaces express's default logging and records log output.\n\nmongolass: `2.4.2`\n\nThe driver used to connect to MongoDB.\n\nrequest: `2.80.0`\n\nSends HTTP requests and fetches page content.\n\n## Other dependencies\nThe following package is used for development and testing; installing it is recommended.\n\nnode-dev: `3.1.3`\n\nAutomatically restarts the server when the code changes, so edits can be tested right away. supervisor, nodemon, or a similar tool can be used instead.\n## Dev\n**Clone the project**\n\n\u003e git clone https://github.com/BingKui/Crawler-Douban-Book.git\n\n**Enter the directory and install dependencies**\n\n\u003e cd Crawler-Douban-Book \u0026\u0026 npm 
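install\n\n**Edit the configuration**\n\nOpen `config.js` in the `config` directory and update the database connection settings and the port.\n\nChange the database address below to your own:\n\u003e mongodb://localhost/splider\n\n**Run the project**\n\n\u003e npm run start\n\nNote: if `node-dev` is not installed, you need to edit `scripts` in `package.json`.\n\n**Try it out**\n\nOpen `http://localhost:3000/tag` in a browser or an HTTP client (such as [Postman](https://www.getpostman.com/)) to fetch the tag data and save it to the MongoDB database.\n\n## Directory structure\n```\nproject root/\n    ├── common // shared helper functions; no longer used after code changes\n    ├── config // configuration files; all configuration lives here\n    ├── controller // controller layer; all basic database operations are defined here\n    ├── docs // documentation\n    ├── lib // library files\n    ├── logs // log output\n    ├── models // data model definitions\n    └── router // routes; handles all routing\n```\n\n## Routes\n\n`http://localhost:3000/tag`: fetches the tags and saves them to the database.\n\n`http://localhost:3000/tag/update`: fetches the total page count for each tag and updates the records in the database.\n\n`http://localhost:3000/tagList`: fetches all books listed under each tag and saves them to the database.\n\n`http://localhost:3000/books`: fetches detailed book information and saves it to the `book` collection.\n\n\n## Changelog\n[Changelog](./docs/update.md)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBingKui%2FCrawler-Douban-Book","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FBingKui%2FCrawler-Douban-Book","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBingKui%2FCrawler-Douban-Book/lists"}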