{"id":50457865,"url":"https://github.com/withmargin/pdf-translate","last_synced_at":"2026-06-01T03:08:05.443Z","repository":{"id":358723283,"uuid":"1242798365","full_name":"withmargin/pdf-translate","owner":"withmargin","description":"Translate PDF documents using LLMs. Preserves original layout.","archived":false,"fork":false,"pushed_at":"2026-05-18T21:35:02.000Z","size":5952,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-18T21:42:44.352Z","etag":null,"topics":["claude","cli","gemini","i18n","llm","localization","openai","pdf","pdf-translate","rust","translation","typescript"],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/withmargin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-18T19:08:59.000Z","updated_at":"2026-05-18T21:35:04.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/withmargin/pdf-translate","commit_stats":null,"previous_names":["withmargin/pdf-translate"],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/withmargin/pdf-translate","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/withmargin%2Fpdf-translate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/withmargin%2Fpdf-translate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/withmargin%2Fpdf-translate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/withmargin%2Fpdf-translate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/withmargin","download_url":"https://codeload.github.com/withmargin/pdf-translate/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/withmargin%2Fpdf-translate/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33757796,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-01T02:00:06.963Z","response_time":115,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["claude","cli","gemini","i18n","llm","localization","openai","pdf","pdf-translate","rust","translation","typescript"],"created_at":"2026-06-01T03:08:03.892Z","updated_at":"2026-06-01T03:08:05.428Z","avatar_url":"https://github.com/withmargin.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pdf-translate\n\n[![npm version](https://img.shields.io/npm/v/pdf-translate.svg)](https://www.npmjs.com/package/pdf-translate)\n[![license](https://img.shields.io/github/license/withmargin/pdf-translate)](https://github.com/withmargin/pdf-translate/blob/main/LICENSE)\n[![GitHub 
stars](https://img.shields.io/github/stars/withmargin/pdf-translate)](https://github.com/withmargin/pdf-translate)\n\nTranslate PDF documents using LLMs. Supports OpenAI, Claude, Gemini, and any OpenAI-compatible API.\n\nPreserves original PDF layout — backgrounds, images, shapes, and logos stay intact. Only text is replaced with translations.\n\n## Install\n\n```bash\nnpm install -g pdf-translate\n```\n\nOr run directly without installing:\n\n```bash\nnpx pdf-translate input.pdf -l zh-TW\n```\n\n## Usage\n\n```bash\n# Translate to Traditional Chinese (default)\npdf-translate input.pdf -l zh-TW\n\n# Translate to Japanese using Claude\npdf-translate input.pdf -l ja --provider claude\n\n# Translate using Gemini\npdf-translate input.pdf -l ko --provider gemini --model gemini-2.5-flash\n\n# Use a local model (Ollama, vLLM, etc.)\npdf-translate input.pdf -l zh-TW --base-url http://localhost:11434/v1 --model qwen3\n\n# Translate specific pages\npdf-translate input.pdf -l zh-TW --pages 0-4\n\n# Specify output path\npdf-translate input.pdf -l de --output translated.pdf\n\n# List available models from a provider\npdf-translate models --provider claude\n```\n\n## Providers\n\n| Provider | Flag | Environment Variable |\n| -------- | ---- | ------------------- |\n| OpenAI | `--provider openai` (default) | `OPENAI_API_KEY` |\n| Claude | `--provider claude` | `ANTHROPIC_API_KEY` |\n| Gemini | `--provider gemini` | `GEMINI_API_KEY` |\n| Custom | `--base-url \u003curl\u003e` | `OPENAI_API_KEY` (optional) |\n\n## How it works\n\n1. **Extract** — pdf_oxide reads text spans with coordinates, fonts, and colors\n2. **Translate** — LLM translates each page independently (page-by-page to prevent context bleeding)\n3. **Replace** — Content stream manipulation: strip original text operators, inject translated text with CJK font embedding\n4. 
**Output** — Modified PDF with original layout preserved\n\n```\n┌──────────────────────────────────────────────┐\n│  Node.js CLI (TypeScript)                    │\n│  - Command-line interface                    │\n│  - LLM API calls (openai SDK)                │\n│  - Per-page translation with batching        │\n├──────────────────────────────────────────────┤\n│  Rust Core (pdf_oxide + lopdf)               │\n│  - PDF text extraction with coordinates      │\n│  - Content stream manipulation               │\n│  - CJK font embedding (Noto Sans CJK TC)     │\n│  - Mixed CJK/Latin font rendering            │\n└──────────────────────────────────────────────┘\n```\n\n## Known limitations (v0.1.0)\n\n- **Multi-span lines**: Text split across multiple PDF spans on the same line may show gaps after translation. Paragraph grouping is planned for v0.2.0.\n- **No text reflow**: Translated text longer than the original may overflow its bounding box.\n- **CJK font size**: The embedded Noto Sans CJK TC font adds ~16MB to the output file. Font subsetting is planned.\n- **First CJK run**: Downloads the CJK font (~16MB) on first use. Cached after that.\n\n## Development\n\n### Prerequisites\n\n- [Rust](https://rustup.rs/) (stable)\n- [Node.js](https://nodejs.org/) \u003e= 22\n- [pnpm](https://pnpm.io/) \u003e= 10\n\n### Setup\n\n```bash\n# Build Rust core\ncargo build --release --manifest-path crates/core/Cargo.toml\n\n# Install Node.js dependencies\npnpm install\n\n# Build CLI\npnpm --filter pdf-translate build\n\n# Run in development\npnpm --filter pdf-translate dev -- input.pdf -l zh-TW\n\n# Run tests\ncargo test --manifest-path crates/core/Cargo.toml --lib\n```\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwithmargin%2Fpdf-translate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwithmargin%2Fpdf-translate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwithmargin%2Fpdf-translate/lists"}