{"id":23237702,"url":"https://github.com/gxr404/doc-dl","last_synced_at":"2025-08-20T16:04:03.051Z","repository":{"id":143944227,"uuid":"414798452","full_name":"gxr404/doc-dl","owner":"gxr404","description":"Save articles locally in Markdown format","archived":false,"fork":false,"pushed_at":"2025-04-19T09:28:35.000Z","size":1996,"stargazers_count":9,"open_issues_count":0,"forks_count":4,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-19T15:26:26.570Z","etag":null,"topics":["article","cli","donwload","markdown","spider","spire"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gxr404.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-10-08T00:39:40.000Z","updated_at":"2025-04-19T09:28:38.000Z","dependencies_parsed_at":null,"dependency_job_id":"cb9896ef-6d2d-4631-a01b-752538ba1f5a","html_url":"https://github.com/gxr404/doc-dl","commit_stats":null,"previous_names":["gxr404/doc-dl","gxr404/article-pull"],"tags_count":52,"template":false,"template_full_name":null,"purl":"pkg:github/gxr404/doc-dl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gxr404%2Fdoc-dl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gxr404%2Fdoc-dl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gxr404%2Fdoc-dl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gxr404%2Fdoc-dl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gxr404","download_url
":"https://codeload.github.com/gxr404/doc-dl/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gxr404%2Fdoc-dl/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263909001,"owners_count":23528563,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["article","cli","donwload","markdown","spider","spire"],"created_at":"2024-12-19T04:14:34.325Z","updated_at":"2025-07-06T13:37:48.490Z","avatar_url":"https://github.com/gxr404.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# doc-dl\n\n根据输入的文章url 抓取页面内容,并转成markdown，连同文章中的图片也给保存到本地\n\n![example](https://github.com/gxr404/doc-dl/assets/17134256/936d09f7-1212-421c-962f-0580492b7261)\n\n## 安装\n\n```shell\nnpm install -g doc-dl\n```\n\n```shell\nUsage: index [options]\n\nOptions:\n  -V, --version             output the version number\n  -u, --url \u003curl\u003e           文章url\n  -t, --title \u003ctitle\u003e       自定义文章标题\n  -d, --dist \u003cpath\u003e         生成的目录(eg: -d res)\n  -i, --img-dir \u003cpath\u003e      生成目录内图片目录(eg: -i ./img/20)\n  -H, --header \u003cheader...\u003e  与curl的-H参数一致, 用于自定义请求头\n  -l, --lax                 puppeteer的waitUntil, 宽松的请求[domcontentloaded, networkidle2], 默认严格的请求[load, networkidle0]\n  --timeout \u003ctimeout\u003e       图片下载超时时间, 默认0不设置超时时间\n  -h, --help                display help for command\n\nExamples:\n  $ custom-help --help\n  $ custom-help -h\n```\n\n## Usage\n\nurl文章链接支持大部分网站，如掘金/知乎文章/微信公众号文章...\n\n```shell\ndoc-dl -u \u003curl\u003e\n```\n\n## 该项目分以下三个包\n\n- 
[doc-dl](./packages/doc-dl/README.md) core package\n- [pull-md-img](./packages/pull-md-img/README.md) downloads the images referenced in a Markdown file and updates the image paths in the Markdown\n- [turndown](./packages/turndown/README.md) turndown plugin for converting to Markdown\n\n## Notes\n\nFor most websites, running `doc-dl -u \u003curl\u003e` directly is enough.\n\nSome pickier websites, such as Zhihu 😂, need custom header arguments to improve the success rate.\n\nFor example:\n\n```bash\ndoc-dl -u \"https://zhuanlan.zhihu.com/p/10673225170\"\n# The generated markdown file contains \"{\"error\":{\"message\":\"您当前请求存在异常，暂时限制本次访问。如有疑问，您可以通过手机摇一摇或登录后私信知乎小管家反馈。\",\"code\":40362}}\"\n# (In English: your current request is abnormal, so this access is temporarily restricted; if you have questions, shake your phone or log in and send a private message to 知乎小管家 to give feedback.)\n```\n\nIn this case, the download needs custom request headers:\n\n1. Open the Network panel in the browser's DevTools and find the first request (i.e. the request for \"https://zhuanlan.zhihu.com/p/10673225170\")\n2. Right-click it and choose Copy -\u003e \"Copy as cURL\"\n3. Paste it into a text editor\n\n```txt\ncurl 'https://zhuanlan.zhihu.com/p/10673225170' \\\n  -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7' \\\n  -H 'accept-language: zh-CN,zh;q=0.9' \\\n  -H 'cache-control: no-cache' \\\n  -H 'cookie: xxx' \\\n  -H 'pragma: no-cache' \\\n  -H 'priority: u=0, i' \\\n  -H 'referer: https://zhuanlan.zhihu.com/p/10673225170' \\\n  -H 'sec-ch-ua: \"Google Chrome\";v=\"125\", \"Chromium\";v=\"125\", \"Not.A/Brand\";v=\"24\"' \\\n  -H 'sec-ch-ua-mobile: ?0' \\\n  -H 'sec-ch-ua-platform: \"macOS\"' \\\n  -H 'sec-fetch-dest: document' \\\n  -H 'sec-fetch-mode: navigate' \\\n  -H 'sec-fetch-site: same-origin' \\\n  -H 'sec-fetch-user: ?1' \\\n  -H 'upgrade-insecure-requests: 1' \\\n  -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36'\n```\n\n4. 
Edit the command, replacing `curl` with `doc-dl -u`\n\n```txt\ndoc-dl -u 'https://zhuanlan.zhihu.com/p/10673225170' \\\n  -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7' \\\n  -H 'accept-language: zh-CN,zh;q=0.9' \\\n  -H 'cache-control: no-cache' \\\n  -H 'cookie: xxx' \\\n  -H 'pragma: no-cache' \\\n  -H 'priority: u=0, i' \\\n  -H 'referer: https://zhuanlan.zhihu.com/p/10673225170' \\\n  -H 'sec-ch-ua: \"Google Chrome\";v=\"125\", \"Chromium\";v=\"125\", \"Not.A/Brand\";v=\"24\"' \\\n  -H 'sec-ch-ua-mobile: ?0' \\\n  -H 'sec-ch-ua-platform: \"macOS\"' \\\n  -H 'sec-fetch-dest: document' \\\n  -H 'sec-fetch-mode: navigate' \\\n  -H 'sec-fetch-site: same-origin' \\\n  -H 'sec-fetch-user: ?1' \\\n  -H 'upgrade-insecure-requests: 1' \\\n  -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36'\n```\n\n5. Finally, paste the edited command into a terminal and run it","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgxr404%2Fdoc-dl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgxr404%2Fdoc-dl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgxr404%2Fdoc-dl/lists"}