{"id":18049376,"url":"https://github.com/Byaidu/PDFMathTranslate","last_synced_at":"2025-03-27T20:30:35.651Z","repository":{"id":255601813,"uuid":"853189791","full_name":"Byaidu/PDFMathTranslate","owner":"Byaidu","description":"PDF scientific paper translation and bilingual comparison - 完整保留排版的 PDF 文档全文双语翻译","archived":false,"fork":false,"pushed_at":"2024-10-28T06:58:34.000Z","size":42524,"stargazers_count":78,"open_issues_count":0,"forks_count":12,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-10-29T22:56:24.488Z","etag":null,"topics":["chinese","english","japanese","korean","latex","pdf","translation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Byaidu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-06T06:56:03.000Z","updated_at":"2024-10-29T08:44:30.000Z","dependencies_parsed_at":"2024-10-22T19:21:51.100Z","dependency_job_id":null,"html_url":"https://github.com/Byaidu/PDFMathTranslate","commit_stats":{"total_commits":72,"total_committers":1,"mean_commits":72.0,"dds":0.0,"last_synced_commit":"eb7d93c63ecb48928dd5a02d11542fec556aa170"},"previous_names":["byaidu/pdfmathtranslate"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Byaidu%2FPDFMathTranslate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Byaidu%2FPDFMathTranslate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Byaidu%2FPDFMathTrans
late/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Byaidu%2FPDFMathTranslate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Byaidu","download_url":"https://codeload.github.com/Byaidu/PDFMathTranslate/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":222308099,"owners_count":16964309,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chinese","english","japanese","korean","latex","pdf","translation"],"created_at":"2024-10-30T21:01:39.281Z","updated_at":"2025-03-27T20:30:35.640Z","avatar_url":"https://github.com/Byaidu.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n\nEnglish | [简体中文](docs/README_zh-CN.md) | [繁體中文](docs/README_zh-TW.md) | [日本語](docs/README_ja-JP.md) | [한국어](docs/README_ko-KR.md)\n\n\u003cimg src=\"./docs/images/banner.png\" width=\"320px\"  alt=\"PDF2ZH\"/\u003e\n\n\u003ch2 id=\"title\"\u003ePDFMathTranslate\u003c/h2\u003e\n\n\u003cp\u003e\n  \u003c!-- PyPI --\u003e\n  \u003ca href=\"https://pypi.org/project/pdf2zh/\"\u003e\n    \u003cimg src=\"https://img.shields.io/pypi/v/pdf2zh\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://pepy.tech/projects/pdf2zh\"\u003e\n    \u003cimg src=\"https://static.pepy.tech/badge/pdf2zh\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://hub.docker.com/repository/docker/byaidu/pdf2zh\"\u003e\n    \u003cimg src=\"https://img.shields.io/docker/pulls/byaidu/pdf2zh\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://gitcode.com/Byaidu/PDFMathTranslate/overview\"\u003e\n    \u003cimg 
src=\"https://gitcode.com/Byaidu/PDFMathTranslate/star/badge.svg\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-FF9E0D\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/ModelScope-Demo-blue\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/Byaidu/PDFMathTranslate/pulls\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/contributions-welcome-green\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://t.me/+Z9_SgnxmsmA5NzBl\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Telegram-2CA5E0?style=flat-squeare\u0026logo=telegram\u0026logoColor=white\"\u003e\u003c/a\u003e\n  \u003c!-- License --\u003e\n  \u003ca href=\"./LICENSE\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/license/Byaidu/PDFMathTranslate\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003ca href=\"https://trendshift.io/repositories/12424\" target=\"_blank\"\u003e\u003cimg src=\"https://trendshift.io/api/badge/repositories/12424\" alt=\"Byaidu%2FPDFMathTranslate | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"/\u003e\u003c/a\u003e\n\n\u003c/div\u003e\n\nPDF scientific paper translation and bilingual comparison.\n\n- 📊 Preserve formulas, charts, table of contents, and annotations _([preview](#preview))_.\n- 🌐 Support [multiple languages](#language), and diverse [translation services](#services).\n- 🤖 Provides [commandline tool](#usage), [interactive user interface](#gui), and [Docker](#docker)\n\nFeel free to provide feedback in [GitHub Issues](https://github.com/Byaidu/PDFMathTranslate/issues) or [Telegram Group](https://t.me/+Z9_SgnxmsmA5NzBl).\n\nFor details on how to contribute, please consult the [Contribution 
Guide](https://github.com/Byaidu/PDFMathTranslate/wiki/Contribution-Guide---%E8%B4%A1%E7%8C%AE%E6%8C%87%E5%8D%97).\n\n\u003ch2 id=\"updates\"\u003eUpdates\u003c/h2\u003e\n\n- [Mar. 3, 2025] Experimental WebUI support for the new backend [BabelDOC](https://github.com/funstory-ai/BabelDOC) _(by [@awwaawwa](https://github.com/awwaawwa))_\n- [Feb. 22, 2025] Better release CI and a well-packaged windows-amd64 exe _(by [@awwaawwa](https://github.com/awwaawwa))_\n- [Dec. 24, 2024] The translator now supports local models on [Xinference](https://github.com/xorbitsai/inference) _(by [@imClumsyPanda](https://github.com/imClumsyPanda))_\n- [Dec. 19, 2024] Non-PDF/A documents are now supported using `-cp` _(by [@reycn](https://github.com/reycn))_\n- [Dec. 13, 2024] Additional backend support _(by [@YadominJinta](https://github.com/YadominJinta))_\n- [Dec. 10, 2024] The translator now supports OpenAI models on Azure _(by [@yidasanqian](https://github.com/yidasanqian))_\n\n\u003ch2 id=\"preview\"\u003ePreview\u003c/h2\u003e\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"./docs/images/preview.gif\" width=\"80%\"/\u003e\n\u003c/div\u003e\n\n\u003ch2 id=\"demo\"\u003eOnline Service 🌟\u003c/h2\u003e\n\nYou can try our application using any of the following demos:\n\n- [Public free service](https://pdf2zh.com/) online without installation _(recommended)_.\n- [Immersive Translate - BabelDOC](https://app.immersivetranslate.com/babel-doc/) 1000 free pages per month. 
_(recommended)_\n- [Demo hosted on HuggingFace](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker)\n- [Demo hosted on ModelScope](https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate) without installation.\n\nNote that the computing resources of the demo are limited, so please avoid abusing them.\n\n\u003ch2 id=\"install\"\u003eInstallation and Usage\u003c/h2\u003e\n\n### Methods\n\nFor different use cases, we provide distinct methods to use our program:\n\n\u003cdetails open\u003e\n  \u003csummary\u003e1. UV install\u003c/summary\u003e\n\n1. Python installed (3.10 \u003c= version \u003c= 3.12)\n2. Install our package:\n\n   ```bash\n   pip install uv\n   uv tool install --python 3.12 pdf2zh\n   ```\n\n3. Execute translation, files generated in [current working directory](https://chatgpt.com/share/6745ed36-9acc-800e-8a90-59204bd13444):\n\n   ```bash\n   pdf2zh document.pdf\n   ```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e2. Windows exe\u003c/summary\u003e\n\n1. Download pdf2zh-version-win64.zip from [release page](https://github.com/Byaidu/PDFMathTranslate/releases)\n\n2. Unzip and double-click `pdf2zh.exe` to run.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e3. Graphical user interface\u003c/summary\u003e\n1. Python installed (3.10 \u003c= version \u003c= 3.12)\n2. Install our package:\n\n```bash\npip install pdf2zh\n```\n\n3. Start using in browser:\n\n   ```bash\n   pdf2zh -i\n   ```\n\n4. If your browser does not open automatically, go to\n\n   ```bash\n   http://localhost:7860/\n   ```\n\n   \u003cimg src=\"./docs/images/gui.gif\" width=\"500\"/\u003e\n\nSee [documentation for GUI](./docs/README_GUI.md) for more details.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e4. Docker\u003c/summary\u003e\n\n1. Pull and run:\n\n   ```bash\n   docker pull byaidu/pdf2zh\n   docker run -d -p 7860:7860 byaidu/pdf2zh\n   ```\n\n2. 
Open in browser:\n\n   ```\n   http://localhost:7860/\n   ```\n\nFor Docker deployment on cloud services:\n\n\u003cdiv\u003e\n\u003ca href=\"https://www.heroku.com/deploy?template=https://github.com/Byaidu/PDFMathTranslate\"\u003e\n  \u003cimg src=\"https://www.herokucdn.com/deploy/button.svg\" alt=\"Deploy\" height=\"26\"\u003e\u003c/a\u003e\n\u003ca href=\"https://render.com/deploy\"\u003e\n  \u003cimg src=\"https://render.com/images/deploy-to-render-button.svg\" alt=\"Deploy to Render\" height=\"26\"\u003e\u003c/a\u003e\n\u003ca href=\"https://zeabur.com/templates/5FQIGX?referralCode=reycn\"\u003e\n  \u003cimg src=\"https://zeabur.com/button.svg\" alt=\"Deploy on Zeabur\" height=\"26\"\u003e\u003c/a\u003e\n\u003ca href=\"https://app.koyeb.com/deploy?type=git\u0026builder=buildpack\u0026repository=github.com/Byaidu/PDFMathTranslate\u0026branch=main\u0026name=pdf-math-translate\"\u003e\n  \u003cimg src=\"https://www.koyeb.com/static/images/deploy/button.svg\" alt=\"Deploy to Koyeb\" height=\"26\"\u003e\u003c/a\u003e\n\u003c/div\u003e\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e5. Zotero Plugin\u003c/summary\u003e\n\n\nSee [Zotero PDF2zh](https://github.com/guaguastandup/zotero-pdf2zh) for more details.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003e6. Commandline\u003c/summary\u003e\n\n1. Python installed (3.10 \u003c= version \u003c= 3.12)\n2. Install our package:\n\n   ```bash\n   pip install pdf2zh\n   ```\n\n3. 
Execute translation, files generated in [current working directory](https://chatgpt.com/share/6745ed36-9acc-800e-8a90-59204bd13444):\n\n   ```bash\n   pdf2zh document.pdf\n   ```\n\n\u003c/details\u003e\n\n\u003e [!TIP]\n\u003e\n\u003e - If you're using Windows and cannot open the file after downloading, please install [vc_redist.x64.exe](https://aka.ms/vs/17/release/vc_redist.x64.exe) and try again.\n\u003e\n\u003e - If you cannot access Docker Hub, please try the image on [GitHub Container Registry](https://github.com/Byaidu/PDFMathTranslate/pkgs/container/pdfmathtranslate).\n\u003e ```bash\n\u003e docker pull ghcr.io/byaidu/pdfmathtranslate\n\u003e docker run -d -p 7860:7860 ghcr.io/byaidu/pdfmathtranslate\n\u003e ```\n\n### Unable to install?\n\nThe program needs an AI model (`wybxc/DocLayout-YOLO-DocStructBench-onnx`) before it can run, and some users cannot download it because of network issues. If you have a problem downloading this model, we provide a workaround using the following environment variable:\n\n```shell\nset HF_ENDPOINT=https://hf-mirror.com\n```\n\nFor PowerShell users:\n\n```shell\n$env:HF_ENDPOINT = \"https://hf-mirror.com\"\n```\n\nIf this solution does not work for you, or you encounter other issues, please refer to the [frequently asked questions](https://github.com/Byaidu/PDFMathTranslate/wiki#-faq--%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98).\n\n\u003ch2 id=\"usage\"\u003eAdvanced Options\u003c/h2\u003e\n\nExecute the translation command in the command line to generate the translated document `example-mono.pdf` and the bilingual document `example-dual.pdf` in the current working directory. Google is used as the default translation service. 
More supported translation services are listed [HERE](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#services).\n\n\u003cimg src=\"./docs/images/cmd.explained.png\" width=\"580px\"  alt=\"cmd\"/\u003e\n\nIn the following table, we list all advanced options for reference:\n\n| Option         | Function                                                                                                      | Example                                        |\n| -------------- | ------------------------------------------------------------------------------------------------------------- | ---------------------------------------------- |\n| files          | Local files                                                                                                   | `pdf2zh ~/local.pdf`                           |\n| links          | Online files                                                                                                  | `pdf2zh http://arxiv.org/paper.pdf`            |\n| `-i`           | [Enter GUI](#gui)                                                                                             | `pdf2zh -i`                                    |\n| `-p`           | [Partial document translation](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#partial) | `pdf2zh example.pdf -p 1`                      |\n| `-li`          | [Source language](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#languages)            | `pdf2zh example.pdf -li en`                    |\n| `-lo`          | [Target language](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#languages)            | `pdf2zh example.pdf -lo zh`                    |\n| `-s`           | [Translation service](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#services)         | `pdf2zh example.pdf -s deepl`                  |\n| `-t`           | 
[Multi-threads](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#threads)                | `pdf2zh example.pdf -t 1`                      |\n| `-o`           | Output dir                                                                                                    | `pdf2zh example.pdf -o output`                 |\n| `-f`, `-c`     | [Exceptions](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#exceptions)                | `pdf2zh example.pdf -f \"(MS.*)\"`               |\n| `-cp`          | Compatibility Mode                                                                                            | `pdf2zh example.pdf --compatible`              |\n| `--skip-subset-fonts` | [Skip font subset](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#font-subset)  | `pdf2zh example.pdf --skip-subset-fonts`       |\n| `--ignore-cache` | [Ignore translate cache](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#cache)       | `pdf2zh example.pdf --ignore-cache`            |\n| `--share`      | Public link                                                                                                   | `pdf2zh -i --share`                            |\n| `--authorized` | [Authorization](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#auth)                   | `pdf2zh -i --authorized users.txt [auth.html]` |\n| `--prompt`     | [Custom Prompt](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#prompt)                 | `pdf2zh --prompt [prompt.txt]`                 |\n| `--onnx`       | [Use Custom DocLayout-YOLO ONNX model]                                                                        | `pdf2zh --onnx [onnx/model/path]`              |\n| `--serverport` | [Use Custom WebUI port]                                                                                       | `pdf2zh --serverport 7860`                     |\n| `--dir`        | [batch 
translate]                                                                                             | `pdf2zh --dir /path/to/translate/`             |\n| `--config`     | [configuration file](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#cofig)             | `pdf2zh --config /path/to/config/config.json`  |\n| `--serverport` | [custom gradio server port]                                                                                   | `pdf2zh --serverport 7860`                     |\n| `--babeldoc`   | Use the experimental backend [BabelDOC](https://funstory-ai.github.io/BabelDOC/) to translate                 | `pdf2zh --babeldoc -s openai example.pdf`      |\n\nFor detailed explanations of every option, please refer to our [Advanced Usage](./docs/ADVANCED.md) document.\n\n\u003ch2 id=\"downstream\"\u003eSecondary Development (APIs)\u003c/h2\u003e\n\nThe current pdf2zh API is temporarily deprecated. The API will be provided again after [pdf2zh 2.0](https://github.com/Byaidu/PDFMathTranslate/issues/586) is released. 
For users who need programmatic access, please use the `babeldoc.high_level.async_translate` function of [BabelDOC](https://github.com/funstory-ai/BabelDOC).\n\nThis API being temporarily deprecated means: the relevant code will not be removed for now, but no technical support will be provided, and no bug fixes will be made.\n\u003c!-- For downstream applications, please refer to our document about [API Details](./docs/APIS.md) for futher information about:\n\n- [Python API](./docs/APIS.md#api-python), how to use the program in other Python programs\n- [HTTP API](./docs/APIS.md#api-http), how to communicate with a server with the program installed --\u003e\n\n\u003ch2 id=\"todo\"\u003eTODOs\u003c/h2\u003e\n\n- [ ] Parse layout with DocLayNet based models, [PaddleX](https://github.com/PaddlePaddle/PaddleX/blob/17cc27ac3842e7880ca4aad92358d3ef8555429a/paddlex/repo_apis/PaddleDetection_api/object_det/official_categories.py#L81), [PaperMage](https://github.com/allenai/papermage/blob/9cd4bb48cbedab45d0f7a455711438f1632abebe/README.md?plain=1#L102), [SAM2](https://github.com/facebookresearch/sam2)\n\n- [ ] Fix page rotation, table of contents, format of lists\n\n- [ ] Fix pixel formula in old papers\n\n- [ ] Async retry except KeyboardInterrupt\n\n- [ ] Knuth–Plass algorithm for western languages\n\n- [ ] Support non-PDF/A files\n\n- [ ] Plugins of [Zotero](https://github.com/zotero/zotero) and [Obsidian](https://github.com/obsidianmd/obsidian-releases)\n\n\u003ch2 id=\"acknowledgement\"\u003eAcknowledgements\u003c/h2\u003e\n\n- [Immersive Translation](https://immersivetranslate.com) sponsors monthly Pro membership redemption codes for active contributors to this project, see details at: [CONTRIBUTOR_REWARD.md](https://github.com/funstory-ai/BabelDOC/blob/main/docs/CONTRIBUTOR_REWARD.md)\n\n- New backend: [BabelDOC](https://github.com/funstory-ai/BabelDOC)\n\n- Document merging: [PyMuPDF](https://github.com/pymupdf/PyMuPDF)\n\n- Document parsing: 
[Pdfminer.six](https://github.com/pdfminer/pdfminer.six)\n\n- Document extraction: [MinerU](https://github.com/opendatalab/MinerU)\n\n- Document Preview: [Gradio PDF](https://github.com/freddyaboulton/gradio-pdf)\n\n- Multi-threaded translation: [MathTranslate](https://github.com/SUSYUSTC/MathTranslate)\n\n- Layout parsing: [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)\n\n- Document standard: [PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)\n\n- Multilingual Font: [Go Noto Universal](https://github.com/satbyy/go-noto-universal)\n\n\u003ch2 id=\"contrib\"\u003eContributors\u003c/h2\u003e\n\n\u003ca href=\"https://github.com/Byaidu/PDFMathTranslate/graphs/contributors\"\u003e\n  \u003cimg src=\"https://opencollective.com/PDFMathTranslate/contributors.svg?width=890\u0026button=false\" /\u003e\n\u003c/a\u003e\n\n![Alt](https://repobeats.axiom.co/api/embed/dfa7583da5332a11468d686fbd29b92320a6a869.svg \"Repobeats analytics image\")\n\n\u003ch2 id=\"star_hist\"\u003eStar History\u003c/h2\u003e\n\n\u003ca href=\"https://star-history.com/#Byaidu/PDFMathTranslate\u0026Date\"\u003e\n \u003cpicture\u003e\n   \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate\u0026type=Date\u0026theme=dark\" /\u003e\n   \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate\u0026type=Date\" /\u003e\n   \u003cimg alt=\"Star History Chart\" src=\"https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate\u0026type=Date\"/\u003e\n \u003c/picture\u003e\n\u003c/a\u003e\n","funding_links":[],"categories":["置顶","Python","Repos","A01_文本生成_文本对话","🛠️ 一、工具类项目","References"],"sub_categories":["9、效率工具集合","其他_文本生成_文本对话","📄 1.2 PDF 
处理工具"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FByaidu%2FPDFMathTranslate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FByaidu%2FPDFMathTranslate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FByaidu%2FPDFMathTranslate/lists"}