{"id":23111840,"url":"https://github.com/dext7r/juejinbooksspider","last_synced_at":"2025-10-08T04:54:28.608Z","repository":{"id":186784391,"uuid":"675646161","full_name":"dext7r/juejinBooksSpider","owner":"dext7r","description":"掘金小册爬虫","archived":false,"fork":false,"pushed_at":"2025-03-28T02:21:33.000Z","size":51561,"stargazers_count":6,"open_issues_count":1,"forks_count":2,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-08-16T19:34:31.146Z","etag":null,"topics":["juejin","juejinbooks","juejinbooksspider","juejinspider","puppeteer"],"latest_commit_sha":null,"homepage":"https://dext7r.github.io/juejinBooksSpider/","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dext7r.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-07T11:57:24.000Z","updated_at":"2025-06-11T10:53:56.000Z","dependencies_parsed_at":"2023-08-07T18:32:45.035Z","dependency_job_id":"4c7324e8-7946-4bc1-b55d-0266b89e3c47","html_url":"https://github.com/dext7r/juejinBooksSpider","commit_stats":null,"previous_names":["h7ml/juejinbooksspider","wiederhoeft/juejinbooksspider","dext7r/juejinbooksspider"],"tags_count":0,"template":true,"template_full_name":null,"purl":"pkg:github/dext7r/juejinBooksSpider","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dext7r%2FjuejinBooksSpider","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dext7r%2FjuejinBooksSpider/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dext7r%2FjuejinBooksSpider/releases","manifests
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dext7r%2FjuejinBooksSpider/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dext7r","download_url":"https://codeload.github.com/dext7r/juejinBooksSpider/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dext7r%2FjuejinBooksSpider/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278891746,"owners_count":26063855,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-08T02:00:06.501Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["juejin","juejinbooks","juejinbooksspider","juejinspider","puppeteer"],"created_at":"2024-12-17T02:11:16.757Z","updated_at":"2025-10-08T04:54:28.593Z","avatar_url":"https://github.com/dext7r.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003e📚 掘金小册爬虫 👋\u003c/h1\u003e\n\u003cp\u003e\n  \u003cimg alt=\"Version\" src=\"https://img.shields.io/badge/version-1.0.0-blue.svg?docsSeconds=2592000\" /\u003e\n  \u003ca href=\"https://github.com/h7ml/juejinBooksSpider#readme\" target=\"_blank\"\u003e\n    \u003cimg alt=\"Documentation\" src=\"https://img.shields.io/badge/documentation-yes-brightgreen.svg\" /\u003e\n  \u003c/a\u003e\n  \u003ca 
href=\"https://github.com/h7ml/juejinBooksSpider/graphs/commit-activity\" target=\"_blank\"\u003e\n    \u003cimg alt=\"Maintenance\" src=\"https://img.shields.io/badge/Maintained%3F-yes-green.svg\" /\u003e\n  \u003c/a\u003e\n  \u003ca href=\"./docs/\" target=\"_blank\"\u003e\n    \u003cimg alt=\"License: Apache--2.0\" src=\"https://nakoruru.h7ml.cn/proxy/img.shields.io/badge/小册阅读-4ABF8A?logo=Blovin\u0026logoColor=fff\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\u003e 🕷️ A crawler script for Juejin booklets. It saves booklets in Markdown, PDF, or HTML format.\n\n## 📜 Notes\n\nThe [examples in this project](https://h7ml.github.io/juejinBooksSpider/docs/) were crawled from publicly available Juejin booklets, which can be viewed at [Juejin Booklets / Read](https://juejin.cn/course/article). This project is for learning and exchange only; please do not publish booklets you have personally paid for. ⚠️ This project bears no responsibility for any consequences of publishing them.\n\n## 🛠 Usage\n\n### 👥 Clone the project\n\n```bash\ngit clone https://github.com/h7ml/juejinBooksSpider.git\ncd juejinBooksSpider\n```\n\n### 📦 Install dependencies\n\n```bash\npnpm install\n\n# or\n# npm install\n\n# or\n# yarn install\n```\n\n### 🎲 Run\n\n```bash\n# Crawl a single booklet\n# pnpm dev \u003cbooklet URL\u003e\npnpm dev https://juejin.cn/book/6844723704639782920\n\n# To crawl multiple booklets, set cookie and spiderAll=true in the .env file, then run pnpm start\n\n```\n\n### 📁 Configuration\n\n#### 📋 Type definitions\n\n```ts\n// \\src\\types.d.ts\nexport type FileFormat = 'pdf' | 'md' | 'html' | ''\n\nexport interface EvConfig {\n  log: string | boolean\n  storeDirs: string\n  cookie: string\n  course: string\n  spiderAll: string | boolean\n  headless: string | boolean\n  filetype: FileFormat\n  puppeteerOptions: PuppeteerLaunchOptions\n}\n```\n\n#### ⚙️ .env\n\n- `cookie`: The Cookie for the Juejin site, used to crawl booklets that require authorized access.\n- `isLog`: Whether to write logs. Defaults to `true`. When enabled, a `log` file is created in the `dist` directory.\n- `storeDir`: Directory where booklets are saved. Defaults to `docs`, meaning the `docs` directory under the current directory.\n- `course`: Booklet URL. Defaults to `https://juejin.cn/book/6844723704639782920`. If a booklet URL is passed on the command line, the command-line URL takes precedence.\n- `spiderAll`: Whether to crawl all booklets. Defaults to `false`. If `true`, all booklets are crawled; otherwise only the booklet specified by `course` is crawled.\n- `filetype`: File type to save. Defaults to `md`. Allowed values are `md`, `pdf`, and `html`.\n- `headless`: 
Whether to run the browser in headless mode. Defaults to `true`. If `false`, a headed (visible) browser is used, which makes debugging easier. See the [puppeteer](https://pptr.dev/troubleshooting/#chrome-headless-disables-gpu-compositing) documentation.\n\n#### ⚙️ `puppeteerOptions`\n\n`puppeteerOptions` holds the launch options for `puppeteer` and is optional. See the [puppeteer](https://pptr.dev/browsers-api/browsers.launchoptions/) documentation. To change it, edit [config](src/config/index.ts).\n\n- If you run this under `wsl`, install `google-chrome` and then set `puppeteerOptions` to `{executablePath: 'google-chrome'}`. See [install-google-chrome-wsl](https://www.tiredsg.dev/blog/install-google-chrome-wsl/) [@croatialu](https://github.com/croatialu)\n\n- Thanks to [@croatialu](https://github.com/croatialu) [@maomao1996](https://github.com/maomao1996) [@Dnzzk2](https://github.com/Dnzzk2) for their inspiration and suggestions\n\n### 🏠 [Homepage](https://h7ml.github.io/juejinBooksSpider?t=1)\n\n## 👤 Author\n\n👤 **h7ml**\n\n- Github: [@h7ml](https://github.com/h7ml)\n\n## 🤝 Contributors\n\nContributions, issues, and feature requests are welcome!\u003cbr /\u003eFeel free to [open an issue or suggestion](https://github.com/h7ml/juejinBooksSpider/issues/new). You can also check the [contributing guide](https://github.com/h7ml/juejinBooksSpider/blob/master/CONTRIBUTING.md).\n\n\u003c!-- CONTRIBUTION GROUP --\u003e\n\n\u003e 📊 Total: \u003ckbd\u003e**17**\u003c/kbd\u003e\n\n\u003ca href=\"https://github.com/dextr7\" title=\"dextr7\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/167136498?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/actions-user\" title=\"actions-user\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/65916846?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/Binbiubiubiu\" title=\"Binbiubiubiu\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/26505011?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/croatialu\" title=\"croatialu\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/22277972?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/yyx990803\" title=\"yyx990803\"\u003e\n  \u003cimg 
src=\"https://avatars.githubusercontent.com/u/499550?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/KelseyShi\" title=\"KelseyShi\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/56479000?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/Dnzzk2\" title=\"Dnzzk2\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/83647184?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/Michael-py001\" title=\"Michael-py001\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/60598432?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/sdras\" title=\"sdras\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/2281088?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/antfu\" title=\"antfu\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/11247099?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/gaearon\" title=\"gaearon\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/810438?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/apps/dependabot\" title=\"dependabot[bot]\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/in/29110?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/donghuzi1\" title=\"donghuzi1\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/50367089?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/h7ml\" title=\"h7ml\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/55233292?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/tiezhu111\" title=\"tiezhu111\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/92362753?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/ReAbout\" title=\"ReAbout\"\u003e\n  \u003cimg 
src=\"https://avatars.githubusercontent.com/u/9333053?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/whatqiu\" title=\"whatqiu\"\u003e\n  \u003cimg src=\"https://avatars.githubusercontent.com/u/14936273?v=4\" width=\"50\" /\u003e\n\u003c/a\u003e\n\n\u003c!-- CONTRIBUTION END --\u003e\n\n## 📝 License\n\nCopyright © 2023 [h7ml](https://github.com/h7ml).\u003cbr /\u003e\nThis project is licensed under the [Apache-2.0](https://github.com/h7ml/juejinBooksSpider/blob/master/LICENSE) license.\n\n---\n\n_This README was generated with ❤️ by [readme-md-generator](https://github.com/kefranabg/readme-md-generator)_\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdext7r%2Fjuejinbooksspider","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdext7r%2Fjuejinbooksspider","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdext7r%2Fjuejinbooksspider/lists"}