{"id":46613044,"url":"https://github.com/cmy2008/doc88_extractor","last_synced_at":"2026-03-07T19:02:46.670Z","repository":{"id":247274405,"uuid":"825420457","full_name":"cmy2008/doc88_extractor","owner":"cmy2008","description":"道客巴巴文档无损提取工具（非截图） A tool to extract and convert doc88 documents (non-screenshot).","archived":false,"fork":false,"pushed_at":"2026-01-03T07:42:40.000Z","size":137,"stargazers_count":112,"open_issues_count":4,"forks_count":28,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-09T03:19:13.924Z","etag":null,"topics":["doc88","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cmy2008.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-07-07T18:08:21.000Z","updated_at":"2026-01-08T14:04:58.000Z","dependencies_parsed_at":"2024-11-30T13:18:13.600Z","dependency_job_id":"63006ee9-e995-4d7c-90fa-c508101dffd0","html_url":"https://github.com/cmy2008/doc88_extractor","commit_stats":null,"previous_names":["cmy2008/doc88_extractor"],"tags_count":12,"template":false,"template_full_name":null,"purl":"pkg:github/cmy2008/doc88_extractor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmy2008%2Fdoc88_extractor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmy2008%2Fdoc88_extractor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmy2008%2Fdoc88_extractor/release
s","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmy2008%2Fdoc88_extractor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cmy2008","download_url":"https://codeload.github.com/cmy2008/doc88_extractor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmy2008%2Fdoc88_extractor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30226780,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-07T19:01:10.287Z","status":"ssl_error","status_checked_at":"2026-03-07T18:59:58.103Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["doc88","python"],"created_at":"2026-03-07T19:02:46.237Z","updated_at":"2026-03-07T19:02:46.656Z","avatar_url":"https://github.com/cmy2008.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## 简介 / Introduction\n\n一个可以完整提取道客巴巴预览文档（非截图）的工具。  \nA tool to extract and convert doc88 documents (non-screenshot).\n\n\n## 特点 / Features\n\n- 利用 [JPEXS Free Flash Decompiler](https://github.com/jindrapetrik/jpexs-decompiler) (以下简称 ffdec) 工具，几乎完美转换文档，保留原始文本、形状与图片。  \n    Powered by [JPEXS Free Flash Decompiler](https://github.com/jindrapetrik/jpexs-decompiler), this tool preserves original text, shapes, and images—almost identical to the source.\n- 适用文档范围：几乎所有  \n    It's available for 
almost all documents.\n    \n## 安装 / Installation\n\n### Python\n\n- 需要 Python 3.10 或更高版本。  \n    Requires Python 3.10 or newer.\n\n安装依赖 / Install the dependencies:\n\n```bash\npip3 install retrying pypdf requests\n```\n\n### Java\n\n- 需要安装 Java 才能进行文档转换（推荐 Java 17）:\n    \u003cbr\u003eJava is required for document conversion (Java 17 recommended):\n    \u003cbr\u003e[Microsoft Build of OpenJDK 17 for Windows x64](https://aka.ms/download-jdk/microsoft-jdk-17.0.14-windows-x64.msi)\n\n### SVG 转换 / SVG Conversion\n- 若启用 `swf2svg`，程序将自动下载 svg2pdf 以实现 SVG 到 PDF 的转换。若安装失败，可尝试从 [typst/svg2pdf](https://github.com/typst/svg2pdf) 编译。  \n    If `swf2svg` is enabled, the tool automatically downloads svg2pdf to perform the SVG-to-PDF conversion. If the installation fails, try building it from [typst/svg2pdf](https://github.com/typst/svg2pdf).\n\n- 支持平台 / Supported platforms:  \n    Windows (x86_64) / Linux (x86_64/arm64) / macOS (x86_64/arm64) / Android (arm64)\n\n## 如何使用 / How to Use\n\n在程序目录下运行 / Run from the program directory:\n\n```bash\npython3 main.py\n```\n\n- 控制台输入网址并回车。  \n    Enter the document URL in the console and press Enter.\n- 首次运行会生成配置文件，检测更新并下载 ffdec。  \n    On first run, the tool generates the configuration file `config.json`, checks for updates, and downloads ffdec.\n\n\n## 配置 / Configuration\n### 说明 / Description\n默认情况下配置在 `config.json` 文件中，主要说明如下：  \nBy default, the configuration is stored in `config.json`. The main keys are:\n\n| 键名 / Key         | 说明                                                              | Description                                                                   |\n| ------------------ | ----------------------------------------------------------------- | ----------------------------------------------------------------------------- |\n| `proxy_url`        | GitHub 代理服务的 URL                                             | The URL of GitHub's proxy service.                                            |\n| `check_update`     | 是否在启动时检查更新                                              | Whether to check for updates on startup.                                      
|\n| `swf2svg`          | 是否先转换到 SVG 再转到 PDF                                       | Whether to convert SWF to SVG first, then to PDF.                             |\n| `svgfontface`      | （仅 swf2pdf 为 false 时有效）在 SVG 转换中是否转换字体来呈现文本 | Only when swf2pdf is false: whether to render text with fonts in the SVG.     |\n| `fix_displayrect`  | 是否修正 SWF 的画布大小                                           | Whether to fix the canvas (display rect) size of SWF files.                   |\n| `clean`            | 是否保留中间文件                                                  | Whether to keep intermediate files.                                           |\n| `get_more`         | 是否始终通过扫描获取页面                                          | Whether to always get pages by scanning.                                      |\n| `path_replace`     | 是否在 Windows 下替换过长路径                                     | Whether to replace overly long paths on Windows.                              |\n| `download_workers` | 下载文件的线程数                                                  | Number of threads for downloading files.                                      |\n| `convert_workers`  | 转换文件的线程数                                                  | Number of threads for converting files.                                       |\n| `pdf_scale`        | 转换为 PDF 的缩放大小                                             | Scale factor used when converting to PDF.                                     
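|\n\n### 注意事项 / Attention\n- 使用 `fix_displayrect` 选项，可以修复某些少数文档的长宽不一致导致的转换问题  \n    Enabling `fix_displayrect` can fix conversion problems in the few documents whose pages have inconsistent width and height.\n- 使用 `swf2svg` 选项，也许会解决部分文档的字体形状问题（不能解决字体不全的问题，原始文件为了压缩大小，减去了未使用的字）  \n    Enabling `swf2svg` may fix glyph-shape problems in some documents. It cannot fix missing glyphs, because the original files drop unused characters to reduce their size.\n- 使用 `swf2svg` 选项，而不使用 `svgfontface` 选项，由于省去了文本转换过程，可以大大加快转换速度  \n    Enabling `swf2svg` without `svgfontface` skips the text conversion step, which makes conversion much faster.\n- 若启用 `svgfontface` 选项，由于 [typst/svg2pdf](https://github.com/typst/svg2pdf) 的缺陷，将无法转换字体，会自动替换为默认字体  \n    If `svgfontface` is enabled, fonts cannot be converted because of a limitation in [typst/svg2pdf](https://github.com/typst/svg2pdf), so text automatically falls back to the default font.\n- 若启用 `svgfontface` 选项，由于 [ffdec](https://github.com/jindrapetrik/jpexs-decompiler) 的缺陷，某些形状或文本会出现转换错误  \n    If `svgfontface` is enabled, some shapes or text may be converted incorrectly because of a defect in [ffdec](https://github.com/jindrapetrik/jpexs-decompiler).\n- 为防止转换出的部分字体过粗，`pdf_scale` 被默认设置为 `2.0`，这会稍微减缓转换速度，如需加快转换速度，可将此项设置为 `1.0` 或更小的值（对于粗体或本身较粗的字体影响大，对细体几乎无影响），若仍然出现部分字体变粗，可尝试修改为更大的值（`2.0`-`5.0`）  \n    To keep some converted fonts from looking too bold, `pdf_scale` defaults to `2.0`, which slows conversion slightly. To speed conversion up, set it to `1.0` or lower (this strongly affects bold or naturally heavy fonts and barely affects light ones). If some fonts still look too bold, try a larger value (`2.0`-`5.0`).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcmy2008%2Fdoc88_extractor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcmy2008%2Fdoc88_extractor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcmy2008%2Fdoc88_extractor/lists"}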