{"id":41175103,"url":"https://github.com/ungerik/pdfzig","last_synced_at":"2026-01-22T19:52:13.149Z","repository":{"id":331640761,"uuid":"1127629092","full_name":"ungerik/pdfzig","owner":"ungerik","description":"A fast, cross-platform PDF utility tool written in Zig, powered by PDFium","archived":false,"fork":false,"pushed_at":"2026-01-10T20:08:29.000Z","size":1267,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-11T04:19:36.728Z","etag":null,"topics":["pdf","pdf-processing","pdf-rendering","pdfium","zig"],"latest_commit_sha":null,"homepage":"","language":"Zig","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ungerik.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-04T09:26:53.000Z","updated_at":"2026-01-10T20:08:33.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ungerik/pdfzig","commit_stats":null,"previous_names":["ungerik/pdfzig"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/ungerik/pdfzig","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ungerik%2Fpdfzig","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ungerik%2Fpdfzig/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ungerik%2Fpdfzig/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ungerik%2Fpdfzig/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ungerik","download_url":"https://codeload.github.com/ungerik/pdfzig/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ungerik%2Fpdfzig/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28669779,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-22T19:36:09.361Z","status":"ssl_error","status_checked_at":"2026-01-22T19:36:05.567Z","response_time":144,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pdf","pdf-processing","pdf-rendering","pdfium","zig"],"created_at":"2026-01-22T19:52:12.900Z","updated_at":"2026-01-22T19:52:13.131Z","avatar_url":"https://github.com/ungerik.png","language":"Zig","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pdfzig\n\nA fast, cross-platform PDF utility tool written in Zig, powered by PDFium.\n\n## Features\n\n- **Render** PDF pages to PNG or JPEG images at any DPI\n- **Extract text** content from PDFs\n- **Extract images** embedded in PDF pages\n- **Extract attachments** embedded in PDFs with glob pattern filtering\n- **Visual diff** - compare two PDFs visually, pixel by pixel\n- **Display info** including metadata, page count, encryption status, PDF/A conformance, attachments, and PDF version\n- **Rotate** pages by 90, 180, or 270 degrees\n- **Mirror** pages horizontally or vertically\n- **Delete** pages from PDFs\n- **Add** new pages with optional image or text content\n- **Create** new PDFs from multiple sources (PDFs, images, text files)\n- **Attach** files as PDF attachments\n- **Detach** (remove) attachments from PDFs\n- Support for **password-protected** PDFs\n- **Multi-resolution output** - generate multiple image sizes in one pass\n- **Runtime PDFium linking** - download or link PDFium libraries dynamically\n\n## Installation\n\n### Requirements\n\n- Zig 0.15.1 or later\n\n### Build\n\n```bash\ngit clone https://github.com/ungerik/pdfzig.git\ncd pdfzig\nzig build\n```\n\n### Download PDFium\n\nAfter building, download the PDFium library for your platform:\n\n```bash\n# Download latest PDFium build\n./zig-out/bin/pdfzig download_pdfium\n\n# Or download a specific Chromium build version\n./zig-out/bin/pdfzig download_pdfium 7606\n```\n\nPDFium is automatically downloaded on first use if not already present. The library is installed next to the pdfzig executable with the naming pattern `libpdfium_v{BUILD}.dylib` (macOS), `libpdfium_v{BUILD}.so` (Linux), or `pdfium_v{BUILD}.dll` (Windows).\n\nDownloads are verified using SHA256 checksums from the GitHub release API to ensure authenticity and integrity.\n\nThe executable will be in `zig-out/bin/`.\n\n## Usage\n\n### Render PDF to Images\n\n```bash\n# Render all pages to PNG at 150 DPI (default)\npdfzig render document.pdf\n\n# Render to a specific directory\npdfzig render document.pdf ./output\n\n# Render at 300 DPI\npdfzig render -O 300:png:0:page_{num}.png document.pdf\n\n# Render specific pages\npdfzig render -p 1-5,10 document.pdf\n\n# Multi-resolution: full-size PNG + JPEG thumbnail\npdfzig render -O 300:png:0:{basename}_{num0}.png -O 72:jpeg:85:thumb_{num}.jpg document.pdf\n```\n\nOutput specification format: `DPI:FORMAT:QUALITY:TEMPLATE`\n- **DPI**: Resolution (e.g., 72, 150, 300)\n- **FORMAT**: `png` or `jpeg`/`jpg`\n- **QUALITY**: JPEG quality 1-100 (ignored for PNG, use 0)\n- **TEMPLATE**: Filename with variables `{num}`, `{num0}`, `{basename}`, `{ext}`\n\n### Extract Text\n\nExtract text content from PDF pages in plain text or structured JSON format. By default, pages are separated by a single newline. Add custom separators with template variables and escape sequences using `--page-separator`.\n\n```bash\n# Print text to stdout\npdfzig extract_text document.pdf\n\n# Save to file\npdfzig extract_text -o output.txt document.pdf\n\n# Extract from specific pages\npdfzig extract_text -p 1-10 document.pdf\n\n# Extract as JSON with formatting information\npdfzig extract_text --format json document.pdf\n\n# Save JSON to file\npdfzig extract_text -f json -o output.json document.pdf\n\n# Add page separator with page number\npdfzig extract_text --page-separator \"--- Page {{PAGE_NO}} ---\" document.pdf\n\n# Separator with escape sequences (multiple newlines)\npdfzig extract_text --page-separator '\\n\\n=== Page {{PAGE_NO}} ===\\n\\n' document.pdf\n```\n\n#### JSON Output Format\n\nWhen using `--format json`, the output contains structured text blocks with formatting information:\n\n```json\n{\n  \"pages\": [\n    {\n      \"page\": 1,\n      \"width\": 595.28,\n      \"height\": 841.89,\n      \"blocks\": [\n        {\n          \"text\": \"Hello World\",\n          \"bbox\": {\"left\": 72.0, \"top\": 750.0, \"right\": 150.0, \"bottom\": 738.0},\n          \"font\": \"Helvetica\",\n          \"size\": 12.0,\n          \"weight\": 400,\n          \"italic\": false,\n          \"color\": \"#000000\"\n        }\n      ]\n    }\n  ]\n}\n```\n\nEach text block contains:\n- **text**: The text content\n- **bbox**: Bounding box with `left`, `top`, `right`, `bottom` coordinates (in PDF points)\n- **font**: Font name (or `null` if unavailable)\n- **size**: Font size in points\n- **weight**: Font weight (400 = normal, 700 = bold, -1 = unavailable)\n- **italic**: Whether the text is italic\n- **color**: CSS-compatible hex color (`#rrggbb` or `#rrggbbaa` with alpha)\n\n### Extract Embedded Images\n\n```bash\n# Extract all images as PNG\npdfzig extract_images document.pdf\n\n# Extract to specific directory as JPEG\npdfzig extract_images -f jpeg -Q 90 document.pdf ./images\n\n# Extract from specific pages\npdfzig extract_images -p 1-5 document.pdf\n```\n\n### Extract Attachments\n\n```bash\n# Extract all attachments\npdfzig extract_attachments document.pdf\n\n# Extract only XML files using glob pattern\npdfzig extract_attachments document.pdf \"*.xml\"\n\n# Extract to specific directory\npdfzig extract_attachments document.pdf \"*.xml\" ./xml-output\n\n# List all attachments without extracting\npdfzig extract_attachments -l document.pdf\n\n# List only JSON files\npdfzig extract_attachments -l document.pdf \"*.json\"\n```\n\nPattern syntax: `*` matches any characters, `?` matches a single character\n\n### Visual Diff\n\nCompare two PDFs visually by rendering and comparing pixels:\n\n```bash\n# Compare two PDFs (exit code 0 = identical, 1 = different)\npdfzig visual_diff original.pdf modified.pdf\n\n# Compare at higher resolution (default: 150 DPI)\npdfzig visual_diff -r 300 doc1.pdf doc2.pdf\n\n# Generate diff images showing differences\npdfzig visual_diff -o ./diffs doc1.pdf doc2.pdf\n\n# Use RGB mode for per-channel color diffs\npdfzig visual_diff -o ./diffs --colors rgb doc1.pdf doc2.pdf\n\n# Use gray mode (average diff) instead of contrast mode\npdfzig visual_diff -o ./diffs --colors gray doc1.pdf doc2.pdf\n\n# Invert colors (white = identical, black = max difference)\npdfzig visual_diff -o ./diffs --invert doc1.pdf doc2.pdf\n\n# Compare encrypted PDFs\npdfzig visual_diff -P secret1 -P secret2 enc1.pdf enc2.pdf\n```\n\nWhen `-o` is specified, diff images are created showing pixel differences.\nOutput files are named `diff_page1.png`, `diff_page2.png`, etc.\n\nColor modes (`--colors`):\n- **contrast** (default): Grayscale where values are scaled so the maximum difference appears as white\n- **gray**: Grayscale where each pixel's brightness is the average RGB difference\n- **rgb**: Per-channel RGB differences shown as color values\n\nUse `--invert` to flip colors so identical pixels are white and differences are black.\n\n### Display PDF Information\n\n```bash\npdfzig info document.pdf\n```\n\nOutput example:\n```\nFile: document.pdf\nPages: 10\nPDF Version: 1.7\nEncrypted: No\n\nMetadata:\n  Title: My Document\n  Author: John Doe\n  Creator: LaTeX\n  Producer: pdfTeX-1.40.23\n  Creation Date: D:20240101120000+00'00'\n  PDF/A: PDF/A-1b\n\nAttachments: 2\n  invoice.xml [XML]\n  data.json\n\nXML files: 1 (use 'extract_attachments \"*.xml\"' to extract)\n```\n\nThe `PDF/A` field is automatically detected from XMP metadata and displays the conformance level (e.g., PDF/A-1b, PDF/A-2u, PDF/A-3a). This field is only shown for PDF/A-compliant documents.\n\nUse `--json` for machine-readable output with per-page dimensions:\n```bash\npdfzig info --json document.pdf\n```\n\n### Rotate Pages\n\n```bash\n# Rotate all pages 90 degrees clockwise\npdfzig rotate document.pdf 90\n\n# Rotate specific pages 180 degrees\npdfzig rotate -p 1,3,5 document.pdf 180\n\n# Rotate and save to a different file\npdfzig rotate -o rotated.pdf document.pdf 270\n\n# Use aliases: right (90°) and left (-90°)\npdfzig rotate document.pdf right\npdfzig rotate document.pdf left\n```\n\nSupported angles: `90`, `180`, `270` (clockwise), or `left`/`right`\n\n### Mirror Pages\n\n```bash\n# Mirror all pages horizontally (left-right flip)\npdfzig mirror document.pdf\n\n# Mirror vertically (up-down flip)\npdfzig mirror --updown document.pdf\n\n# Mirror specific pages\npdfzig mirror -p 1,3,5 document.pdf\n\n# Mirror both horizontally and vertically\npdfzig mirror --leftright --updown document.pdf\n\n# Save to a different file\npdfzig mirror -o mirrored.pdf document.pdf\n```\n\n### Delete Pages\n\n```bash\n# Delete page 5\npdfzig delete document.pdf 5\n\n# Delete pages 1, 3, and 5-10\npdfzig delete document.pdf 1,3,5-10\n\n# Delete and save to a different file\npdfzig delete -o trimmed.pdf document.pdf 1-3\n\n# Delete all pages (replaces with one empty page of same size as first page)\npdfzig delete document.pdf\n```\n\n### Add Pages\n\n```bash\n# Add an empty page at the end\npdfzig add document.pdf\n\n# Add an empty page at position 3\npdfzig add -p 3 document.pdf\n\n# Add a page with an image (scaled to fit)\npdfzig add document.pdf image.png\n\n# Add a page with text content\npdfzig add document.pdf notes.txt\n\n# Add a page with formatted text from JSON (same format as extract_text --format json)\npdfzig add document.pdf content.json\n\n# Specify page size using standard names\npdfzig add -s A4 document.pdf\npdfzig add -s Letter document.pdf\n\n# Use landscape orientation (append 'L')\npdfzig add -s A4L document.pdf\n\n# Specify size with units\npdfzig add -s 210x297mm document.pdf\npdfzig add -s 8.5x11in document.pdf\n```\n\nSupported page sizes: A0-A8, B0-B6, C4-C6, Letter, Legal, Tabloid, Ledger, Executive, Folio, Quarto, Statement\n\nSupported units: `mm`, `cm`, `in`/`inch`, `pt` (points, default)\n\n### Create PDF\n\nCreate a new PDF from multiple sources (PDFs, images, text files):\n\n```bash\n# Merge two PDFs\npdfzig create -o combined.pdf doc1.pdf doc2.pdf\n\n# Import only specific pages from a PDF\npdfzig create -o excerpt.pdf -p 1-5,10 document.pdf\n\n# Create from mixed sources: image cover + PDF content\npdfzig create -o book.pdf cover.png content.pdf\n\n# Insert blank pages using :blank\npdfzig create -o padded.pdf :blank document.pdf :blank\n\n# Combine everything: blank + image + PDF pages + text\npdfzig create -o report.pdf :blank logo.png -p 1-3 intro.pdf notes.txt\n\n# Specify page size for images and text (default: A4)\npdfzig create -s Letter -o output.pdf image.png notes.txt\n```\n\nSource types:\n- **PDF files**: Import pages (all pages or use `-p` for specific pages)\n- **Images**: PNG, JPEG, BMP - creates a page with the image scaled to fit\n- **Text files**: Creates a page with the text content\n- **JSON files**: Creates pages with formatted text blocks (same format as `extract_text --format json`)\n- **`:blank`**: Inserts a blank page\n\n#### Standard PDF Fonts\n\nWhen using JSON files for text input, fonts are mapped to the PDF Standard Base 14 Fonts which are guaranteed to be available in all PDF viewers:\n\n| Font Family  | Variants                               |\n|--------------|----------------------------------------|\n| Helvetica    | Regular, Bold, Oblique, Bold-Oblique   |\n| Courier      | Regular, Bold, Oblique, Bold-Oblique   |\n| Times        | Roman, Bold, Italic, Bold-Italic       |\n| Symbol       | Regular                                |\n| ZapfDingbats | Regular                                |\n\nFont names in JSON are matched using these rules:\n- Names containing \"Courier\" or \"Mono\" → Courier\n- Names containing \"Times\" or \"Serif\" → Times\n- All other names → Helvetica (default sans-serif)\n- Bold/Italic variants are selected based on `weight` (≥600) and `italic` fields\n\n### Attach Files\n\n```bash\n# Attach a file to the PDF\npdfzig attach document.pdf invoice.xml\n\n# Attach multiple files\npdfzig attach document.pdf file1.json file2.xml\n\n# Attach files matching a glob pattern\npdfzig attach -g \"*.xml\" document.pdf\n\n# Save to a different file\npdfzig attach -o with_attachments.pdf document.pdf data.json\n```\n\n### Detach (Remove) Attachments\n\n```bash\n# Remove attachment by index\npdfzig detach -i 0 document.pdf\n\n# Remove attachments matching a glob pattern\npdfzig detach -g \"*.xml\" document.pdf\n\n# Save to a different file\npdfzig detach -o clean.pdf -g \"*.tmp\" document.pdf\n```\n\n### Use a Specific PDFium Library\n\nThe `--link` global option loads PDFium from a specific path instead of the default location:\n\n```bash\n# Use a specific PDFium library for a command\npdfzig --link /path/to/libpdfium.dylib info document.pdf\n\n# Works with any command\npdfzig --link /usr/local/lib/libpdfium.dylib render document.pdf\n```\n\nThe version number is automatically parsed from filenames matching the pattern `libpdfium_v{VERSION}.dylib` (or `.so`/`.dll` on other platforms).\n\n### Password-Protected PDFs\n\nAll commands support the `-P` flag for encrypted PDFs:\n\n```bash\npdfzig info -P mypassword encrypted.pdf\npdfzig render -P mypassword encrypted.pdf\npdfzig extract_text -P mypassword encrypted.pdf\n```\n\n### Page Selection\n\nMany commands support the `-p` option to select specific pages. If not specified, the command operates on all pages.\n\nPage selection syntax:\n- Single page: `-p 5`\n- Multiple pages: `-p 1,3,5`\n- Page range: `-p 1-10`\n- Combined: `-p 1-5,8,10-12`\n\nCommands supporting `-p`: `render`, `extract_text`, `extract_images`, `rotate`, `mirror`, `create`\n\n## Supported Platforms\n\n| Platform | Architecture       |\n|----------|--------------------|\n| macOS    | arm64, x86_64      |\n| Linux    | x86_64, arm64, arm |\n| Windows  | x86_64, x86, arm64 |\n\n## Dependencies\n\n- [PDFium](https://pdfium.googlesource.com/pdfium/) - PDF rendering engine (dynamically loaded at runtime from [pdfium-binaries](https://github.com/bblanchon/pdfium-binaries))\n- [zigimg](https://github.com/zigimg/zigimg) - PNG encoding\n- [zstbi](https://github.com/zig-gamedev/zstbi) - JPEG encoding\n\n## Development\n\n```bash\n# Build\nzig build\n\n# Run tests\nzig build test --summary all\n\n# Run directly\nzig build run -- info document.pdf\n\n# Check source code formatting\nzig build fmt\n\n# Fix source code formatting\nzig build fmt-fix\n\n# Remove build artifacts and caches (zig-out/, .zig-cache/, test-cache/)\nzig build clean\n\n# Build for all supported platforms (outputs to zig-out/\u003ctarget-triple\u003e/)\nzig build all\n```\n\n### Cross-Compilation Targets\n\nThe `zig build all` command builds for all supported platforms:\n\n| Platform | Architecture | Output Directory               |\n|----------|--------------|--------------------------------|\n| macOS    | x86_64       | `zig-out/x86_64-macos-none/`   |\n| macOS    | arm64        | `zig-out/aarch64-macos-none/`  |\n| Linux    | x86_64       | `zig-out/x86_64-linux-gnu/`    |\n| Linux    | arm64        | `zig-out/aarch64-linux-gnu/`   |\n| Linux    | arm          | `zig-out/arm-linux-gnueabihf/` |\n| Windows  | x86_64       | `zig-out/x86_64-windows-gnu/`  |\n| Windows  | x86          | `zig-out/x86-windows-gnu/`     |\n| Windows  | arm64        | `zig-out/aarch64-windows-gnu/` |\n\n### Build Options\n\n| Option                    | Description                                                    |\n|---------------------------|----------------------------------------------------------------|\n| `-Ddownload-pdfium`       | Download PDFium library for target platform(s)                 |\n| `-Doptimize=ReleaseFast`  | Build with optimizations for speed                             |\n| `-Doptimize=ReleaseSmall` | Build with optimizations for size                              |\n| `-Doptimize=ReleaseSafe`  | Build with optimizations and runtime safety checks             |\n| `-Dtarget=\u003ctriple\u003e`       | Cross-compile for a specific target (e.g., `x86_64-linux-gnu`) |\n\nExamples:\n\n```bash\n# Build with PDFium library included\nzig build -Ddownload-pdfium\n\n# Build optimized release with PDFium\nzig build -Doptimize=ReleaseFast -Ddownload-pdfium\n\n# Build for all platforms with PDFium libraries included\nzig build all -Ddownload-pdfium\n\n# Cross-compile for Linux with PDFium\nzig build -Dtarget=x86_64-linux-gnu -Ddownload-pdfium\n```\n\n### Testing\n\nRun the test suite:\n\n```bash\n# Run all tests except network-dependent tests\nzig build test --summary all\n\n# Run all tests including network-dependent tests (downloads test PDFs)\nRUN_NETWORK_TESTS=1 zig build test --summary all\n```\n\n#### Test Types\n\n**Unit Tests** (always run):\n- CLI argument parsing\n- Color parsing\n- Font mapping\n- PDF/A conformance parsing\n- Page range parsing\n\n**PDFium Tests** (always run):\n- Golden file tests (render, rotate, mirror operations)\n- PDF roundtrip tests (create PDF → extract text)\n- Requires: PDFium library (auto-downloads if missing) + local test files in `test-files/input/`\n\n**Network Tests** (require `RUN_NETWORK_TESTS=1`):\n- Integration tests using sample PDFs from [py-pdf/sample-files](https://github.com/py-pdf/sample-files)\n- Integration tests using ZUGFeRD invoices from [ZUGFeRD/corpus](https://github.com/ZUGFeRD/corpus)\n- Requires: PDFium library + network access\n- Test PDFs are cached in `test-cache/` directory (gitignored)\n\n#### Golden File Testing\n\nVisual regression tests compare PDF operations against reference \"golden files\":\n\n```bash\n# Generate/regenerate golden files\nzig build generate-golden-files\n\n# Delete and regenerate all golden files\nzig build generate-golden-files -Dclean\n\n# Run tests (golden file tests always run)\nzig build test\n```\n\nGolden files are stored in `test-files/expected/` and checked into git. Tests use pixel-level PNG comparison (not byte-level) to handle minor rendering variations across platforms.\n\n## License\n\nMIT License - see [LICENSE](LICENSE)\n\n### Third-Party Licenses\n\nThis software uses PDFium (BSD-3-Clause/Apache-2.0), zigimg (MIT), and stb libraries (MIT/Public Domain). See [THIRD_PARTY_NOTICES.md](THIRD_PARTY_NOTICES.md) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fungerik%2Fpdfzig","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fungerik%2Fpdfzig","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fungerik%2Fpdfzig/lists"}