{"id":19989917,"url":"https://github.com/ma6254/FictionDown","last_synced_at":"2025-05-04T09:33:49.569Z","repository":{"id":46450857,"uuid":"173538919","full_name":"ma6254/FictionDown","owner":"ma6254","description":"小说下载|小说爬取|起点|笔趣阁|导出Markdown|导出txt|转换epub|广告过滤|自动校对","archived":false,"fork":false,"pushed_at":"2024-03-06T11:49:37.000Z","size":470,"stargazers_count":734,"open_issues_count":4,"forks_count":145,"subscribers_count":12,"default_branch":"master","last_synced_at":"2024-11-13T04:52:01.447Z","etag":null,"topics":["biquge","crawler","fiction","golang","novels","qidian","spider"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ma6254.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-03-03T06:22:01.000Z","updated_at":"2024-11-10T18:50:04.000Z","dependencies_parsed_at":"2023-12-24T06:28:35.446Z","dependency_job_id":"78d427b9-c0e2-4a08-96db-f5af7a1c5f64","html_url":"https://github.com/ma6254/FictionDown","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ma6254%2FFictionDown","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ma6254%2FFictionDown/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ma6254%2FFictionDown/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ma6254%2FFictionDown/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners
/ma6254","download_url":"https://codeload.github.com/ma6254/FictionDown/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252316695,"owners_count":21728521,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["biquge","crawler","fiction","golang","novels","qidian","spider"],"created_at":"2024-11-13T04:50:47.340Z","updated_at":"2025-05-04T09:33:49.197Z","avatar_url":"https://github.com/ma6254.png","language":"Go","funding_links":[],"categories":["Go"],"sub_categories":[],"readme":"# FictionDown\n\nFictionDown is a command-line tool for crawling novels.\n\n**It batch-downloads pirated web novels. This software is intended only for collecting samples for data analysis; do not use it for any other purpose.**\n\n**Do not distribute the documents this software produces, and do not use them for anything other than data evaluation.**\n\n[![License](https://img.shields.io/github/license/ma6254/FictionDown.svg)](https://raw.githubusercontent.com/ma6254/FictionDown/master/LICENSE)\n[![release_version](https://img.shields.io/github/release/ma6254/FictionDown.svg)](https://github.com/ma6254/FictionDown/releases)\n[![last-commit](https://img.shields.io/github/last-commit/ma6254/FictionDown.svg)](https://github.com/ma6254/FictionDown/commits)\n[![Download Count](https://img.shields.io/github/downloads/ma6254/FictionDown/total.svg)](https://github.com/ma6254/FictionDown/releases)\n[![goproxy.cn](https://goproxy.cn/stats/github.com/ma6254/FictionDown/badges/download-count.svg)](https://goproxy.cn)\n\n[![godoc](https://img.shields.io/badge/godoc-reference-blue.svg)](https://pkg.go.dev/github.com/ma6254/FictionDown/)\n[![QQ 
group](https://img.shields.io/badge/QQ%E7%BE%A4-495477288-orange.svg)](http://qm.qq.com/cgi-bin/qm/qr?_wv=1027\u0026k=DkzYlCZ9VSQEq6CqUtqGiqYBZh1V5CKK\u0026authKey=btu30mBqaqx6GSVS3futp%2BhYitMfhtAltmp%2B84Kob9xS%2F6J5yQkd0dSeozzxbclT\u0026noverify=0\u0026group_code=495477288)\n\n[![Go](https://github.com/ma6254/FictionDown/workflows/Go/badge.svg)](https://github.com/ma6254/FictionDown/actions/runs/39839114)\n[![travis-ci](https://www.travis-ci.org/ma6254/FictionDown.svg?branch=master)](https://travis-ci.org/ma6254/FictionDown)\n[![Go Report Card](https://goreportcard.com/badge/github.com/ma6254/FictionDown)](https://goreportcard.com/report/github.com/ma6254/FictionDown)\n\n## Documentation\n\nThe Guide section of the documentation is complete; you can read it [here](https://ma6254.github.io/FictionDown/).\n\n## Features\n\n- Uses Qidian as the reference sample, then crawls and proofreads across multiple sites with multiple threads\n- Exports txt for compatibility with most readers\n- Exports epub (still has some problems; some readers cannot open the files)\n- Exports markdown, which pandoc can convert to epub; the output includes epub `metadata` and preserves book information, volume structure, and author information\n- Built-in simple ad filtering (not yet complete)\n- Written in Golang, so installation and deployment are easy; optional external dependency: Chromedp\n- Supports resumable crawling: after a forced stop, the next run continues from where the last one ended\n\n## Supported sites\n\n- Official: ✅ official (licensed) site ❌ pirate site\n- Volumes: ✅ chapters are grouped into volumes ❌ all chapters are placed in a single volume\n- Site search: ✅ fully supported ❌ not supported ❔ supported by the site but not yet adapted in the software ⚠️ supported by the site but unavailable or under maintenance ⛔ the site supports search, but there is no good way to adapt it (for example, it uses Google for site search)\n\n| Site         | URL               | Official | Volumes  | Site search  | Source file                    |\n| ------------ | ----------------- | -------- | -------- | ------------ | ------------------------------ |\n| 起点中文网   | www.qidian.com    | ✅       | ✅       | ✅           | sites\\com_qidian\\main.go           |\n| 笔趣阁       | www.b520.cc | ❌       | ❌       | ✅           | sites\\cc_b520\\main.go    |\n| 顶点小说     | www.ddyueshu.com   | ❌       | ❌       | ✅           | sites\\com_ddyueshu\\main.go      |\n| 全本小说网     | www.qb5.la   | ❌       | ❌       | ✅           | sites\\la_qb5\\main.go      |\n| 新八一中文网 | www.81new.net     | ❌       | ❌       | ✅           | sites\\net_new81\\main.go            |\n| 书迷楼       | www.shumil.co     | ❌       | ❌       | ✅           | sites\\co_shumil\\main.go        |\n| 完本神站     | www.wanben.org | ❌       | ❌    
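   | ✅           | site\\org_wanben\\main.go          |\n| 38 看书      | www.mijiashe.com  | ❌       | ❌       | ⚠️           | sites\\com_mijiashe\\main.go |\n\n## Usage notes\n\n- Pages on Qidian and the pirate sites can change at any time, which may break the scraping rules; if that happens, please open an issue\n- The generated EPUB file may be very large, in which case most readers on the market lag badly or crash outright\n- Some very old books, or books their authors revise frequently, are not carried by any pirate site and so cannot be crawled; if you find a pirate site that has such a book, please open an issue with the book title, the official-site link, and the pirate-site link\n\n## Workflow\n\n1. Enter a Qidian link\n2. The tool fetches the book information and starts crawling each chapter; VIP chapters are stored in `Example` as proofreading samples\n3. Manually set the matching links on Biquge or other pirate sites in the `tamp` field\n4. Run the tool again to start crawling; only the VIP chapters are crawled, and they are proofread against `Example`\n5. Manually edit the cache file to remove ads and stray random characters (some of them are keywords that can make pandoc run out of memory or produce broken styling)\n6. Run `conv -f md` to generate markdown\n7. Convert to epub with pandoc: `pandoc -o xxxx.epub xxxx.md`\n\n### Example\n\n```bash\n\u003e ./FictionDown --url https://book.qidian.com/info/3249362 d # fetch the official book information\n\n# If a `not match volumes` error occurs, enable Chromedp or PhantomJS\n# Use Chromedp\n\u003e ./FictionDown --url https://book.qidian.com/info/3249362 -d chromedp d\n# Use PhantomJS\n\u003e ./FictionDown --url https://book.qidian.com/info/3249362 -d phantomjs d\n\n\u003e vim 一世之尊.FictionDown # add the pirate-site links\n\u003e ./FictionDown -i 一世之尊.FictionDown d # fetch the pirated content\n# Once crawling finishes, you can export a readable document\n\u003e ./FictionDown -i 一世之尊.FictionDown conv -f txt\n# There are two ways to convert to epub\n# 1. Export markdown, then convert it to epub with pandoc\n\u003e ./FictionDown -i 一世之尊.FictionDown conv -f md\n\u003e pandoc -o 一世之尊.epub 一世之尊.md\n# Some readers need chapter positioning; add --epub-chapter-level=2\n\u003e pandoc -o 一世之尊.epub --epub-chapter-level=2 一世之尊.md\n# 2. Export epub directly (calls Pandoc)\n\u003e ./FictionDown -i 一世之尊.FictionDown conv -f epub\n```\n\n#### Download directly from search results (available when at least one official source exists)\n\n```bash\n\u003e ./FictionDown s -d -k \"诡秘之主\"\n```\n\n#### Search the sites, then fill in the links\n\n```bash\n\u003e ./FictionDown --url https://book.qidian.com/info/3249362 d # fetch the official book information\n\n# If a `not match volumes` error occurs, enable Chromedp or PhantomJS\n# Use Chromedp\n\u003e ./FictionDown --url https://book.qidian.com/info/3249362 --driver chromedp d\n# Use PhantomJS\n\u003e ./FictionDown --url https://book.qidian.com/info/3249362 --driver phantomjs d\n\n\u003e ./FictionDown -i 一世之尊.FictionDown s -k 一世之尊 -p # search and add the results\n\u003e 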
./FictionDown -i 一世之尊.FictionDown d # fetch the pirated content\n# Once crawling finishes, you can export a readable document\n\u003e ./FictionDown -i 一世之尊.FictionDown conv -f txt\n# There are two ways to convert to epub\n# 1. Export markdown, then convert it to epub with pandoc\n\u003e ./FictionDown -i 一世之尊.FictionDown conv -f md\n\u003e pandoc -o 一世之尊.epub 一世之尊.md\n# 2. Export epub directly (some readers will report errors)\n\u003e ./FictionDown -i 一世之尊.FictionDown conv -f epub\n```\n\n## Not yet implemented\n\n- Send a `Cookie` when crawling the official site, to fetch purchased chapters\n- Support 晋江文学城 (Jinjiang Literature City)\n- Support 纵横中文网 (Zongheng)\n- Support 有毒小说网\n- Support 刺猬猫 (also known as 欢乐书客)\n- Clean up the spaghetti logic in the main package\n- Make the command-line argument style consistent\n- Improve ad filtering\n- Simplify the usage steps\n- Improve log output\n- For special chapters, allow manually specifying a pirate-site link or skipping them\n- Load matching rules from external files so users can add their own official/pirate sources\n- Support chapter updates\n- Optimize the chapter-matching process\n\n## Usage\n\n```bash\nNAME:\n   FictionDown - https://github.com/ma6254/FictionDown\n\nUSAGE:\n    [global options] command [command options] [arguments...]\n\nAUTHOR:\n   ma6254 \u003c9a6c5609806a@gmail.com\u003e\n\nCOMMANDS:\n     download, d, down  download the cache file\n     check, c, chk      check the cache file\n     edit, e            manually edit the cache file\n     convert, conv      convert and export to another format\n     pirate, p          search pirate sites\n     search, s          search pirate sites\n     help, h            Shows a list of commands or help for one command\n\nGLOBAL OPTIONS:\n   -u value, --url value     book link\n   --tu value, --turl value  resource-site link\n   -i value, --input value   input cache file\n   --log value               log file path\n   --driver value, -d value  request method, support: none,phantomjs,chromedp\n   --help, -h                show help\n   --version, -v             print the version\n```\n\n## Installation and building\n\nThe program is a single executable with a command-line (CLI) interface.\n\nDependencies are managed with Go modules (gomod).\n\n```bash\ngo install github.com/ma6254/FictionDown@latest\n```\n\nCross-compile executables for these platforms: `linux/arm` `linux/amd64` `darwin/amd64` `windows/amd64`\n\n```bash\nmake multiple_build\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fma6254%2FFictionDown","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fma6254%2FFictionDown","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fma6254%2FFictionDown/lists"}