{"id":37030053,"url":"https://github.com/red6/pdfcompare","last_synced_at":"2026-01-14T03:38:46.878Z","repository":{"id":38206470,"uuid":"74676206","full_name":"red6/pdfcompare","owner":"red6","description":"A simple Java library to compare two PDF files","archived":false,"fork":false,"pushed_at":"2025-11-08T13:30:12.000Z","size":976,"stargazers_count":253,"open_issues_count":8,"forks_count":70,"subscribers_count":21,"default_branch":"master","last_synced_at":"2025-11-08T15:15:44.681Z","etag":null,"topics":["compare","pdf","pdf-files","pdfbox"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/red6.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2016-11-24T13:34:39.000Z","updated_at":"2025-11-08T13:30:16.000Z","dependencies_parsed_at":"2023-11-30T00:26:46.413Z","dependency_job_id":"d7fa51e6-c99c-4f6e-ba5d-bb990e88276a","html_url":"https://github.com/red6/pdfcompare","commit_stats":null,"previous_names":[],"tags_count":74,"template":false,"template_full_name":null,"purl":"pkg:github/red6/pdfcompare","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/red6%2Fpdfcompare","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/red6%2Fpdfcompare/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/red6%2Fpdfcompare/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/red6%2Fpdfcompare/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/red6","download_url":"https://codeload.github.com/red6/pdfcompare/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/red6%2Fpdfcompare/sbom","scorecard":{"id":767399,"data":{"date":"2025-08-11","repo":{"name":"github.com/red6/pdfcompare","commit":"63e56d5f4f88118eb1db9757e3500da3970d134b"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.6,"checks":[{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Code-Review","score":1,"reason":"Found 4/26 approved changesets -- score normalized to 1","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the 
repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a 
license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 8 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed 
vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}}]},"last_synced_at":"2025-08-23T01:13:13.391Z","repository_id":38206470,"created_at":"2025-08-23T01:13:13.391Z","updated_at":"2025-08-23T01:13:13.391Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28408850,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T01:52:23.358Z","status":"online","status_checked_at":"2026-01-14T02:00:06.678Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compare","pdf","pdf-files","pdfbox"],"created_at":"2026-01-14T03:38:46.148Z","updated_at":"2026-01-14T03:38:46.868Z","avatar_url":"https://github.com/red6.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PdfCompare  [![Build Status](https://travis-ci.org/red6/pdfcompare.svg?branch=master)](https://travis-ci.org/red6/pdfcompare) [![Maven Central Version](https://img.shields.io/maven-central/v/de.redsix/pdfcompare.svg)](http://search.maven.org/#search|gav|1|g:\"de.redsix\"%20AND%20a:\"pdfcompare\")\nA simple Java library to compare two PDF files.\nFiles are rendered and compared pixel by pixel. There is no text comparison.\n\n### Usage with Maven\n\nJust include it as a dependency. 
Please check for the most current version available:\n\n```xml\n\u003cdependencies\u003e\n  \u003cdependency\u003e\n    \u003cgroupId\u003ede.redsix\u003c/groupId\u003e\n    \u003cartifactId\u003epdfcompare\u003c/artifactId\u003e\n    \u003cversion\u003e...\u003c/version\u003e \u003c!-- see current version in the maven central tag above --\u003e\n  \u003c/dependency\u003e\n\u003c/dependencies\u003e\n```\n\n### Simple Usage via UI or Commandline\n\nA simple interactive UI opens when you start the jar file \nwithout any additional arguments (which starts the class de.redsix.pdfcompare.Main).\nIt allows you to choose files to compare and also to mark areas to ignore and write those to an ignore-file.\n\nBesides the UI, you can provide an expected and an actual \nfile and additional parameters via a CLI. To get help for the CLI, use the -h or --help option.\n```\nusage: java -jar pdfcompare-x.x.x-full.jar [EXPECTED] [ACTUAL]\n -h,--help              Displays this text and exit\n ...\n```\n\n### Usage as a library\n\nThe main focus of PdfCompare, however, is embedded usage as a library.\n\n```java\nnew PdfComparator(\"expected.pdf\", \"actual.pdf\").compare().writeTo(\"diffOutput\");\n```\nThis will produce an output PDF which may include markings for differences found.\nPdfCompare renders a page from the expected.pdf and the same page from the actual.pdf\nto a bitmap image and compares these two images pixel by pixel.\nPixels that are equal are faded a bit. 
Pixels that differ are marked in red and green.\nGreen for pixels that were in the expected.pdf, but are not present in the actual.pdf.\nRed for pixels that are present in the actual.pdf, but were not in the expected.pdf.\nMarkings in magenta at the edge of the page help you find differing areas quickly.\nIgnored areas are marked with a yellow background.\nPages that were expected but are missing are marked with a red border.\nPages that appear but were not expected are marked with a green border.\n\nThe compare method returns a CompareResult, which can be queried:\n\n```java\nfinal CompareResult result = new PdfComparator(\"expected.pdf\", \"actual.pdf\").compare();\nif (result.isNotEqual()) {\n    System.out.println(\"Differences found!\");\n}\nif (result.isEqual()) {\n    System.out.println(\"No Differences found!\");\n}\nif (result.hasDifferenceInExclusion()) {\n    System.out.println(\"Differences in excluded areas found!\");\n}\nresult.getDifferences(); // returns page areas, where differences were found\n```\nFor convenience, writeTo also returns the equals status:\n```java\nboolean isEquals = new PdfComparator(\"expected.pdf\", \"actual.pdf\").compare().writeTo(\"diffOutput\");\nif (!isEquals) {\n    System.out.println(\"Differences found!\");\n}\n```\nThe compare method can be called with filenames as Strings, Files, Paths or InputStreams.\n\n### Exclusions\n\nIt is also possible to define rectangular areas that are ignored during comparison. To do so, create a file that defines the areas to ignore.\nThe file format is JSON (or actually a superset called [HOCON](https://github.com/typesafehub/config/blob/master/HOCON.md)) and has the following form:\n```javascript\nexclusions: [\n    {\n        page: 2\n        x1: 300 // entries without a unit are in pixels. Pdfs are rendered by default at 300DPI\n        y1: 1000\n        x2: 550\n        y2: 1300\n    },\n    {\n        // page is optional. 
When not given, the exclusion applies to all pages.\n        x1: 130.5mm // entries can also be given in units of cm, mm or pt (DTP-Point defined as 1/72 Inches)\n        y1: 3.3cm\n        x2: 190mm\n        y2: 3.7cm\n    },\n    {\n        page: 7\n        // coordinates are optional. When not given, the whole page is excluded.\n    }\n]\n```\n\nWhen the provided exclusion file is not found, it is ignored and the compare is done without the exclusions.\n\nExclusions are provided in the code as follows:\n\n```java\nnew PdfComparator(\"expected.pdf\", \"actual.pdf\").withIgnore(\"ignore.conf\").compare();\n```\n\nAlternatively an Exclusion can be added via the API as follows:\n\n```java\nnew PdfComparator(\"expected.pdf\", \"actual.pdf\")\n\t.withIgnore(new PageArea(1, 230, 350, 450, 420))\n\t.withIgnore(new PageArea(2))\n\t.compare();\n```\n### Encrypted PDF files\n\nWhen you want to compare password protected PDF files, you can give the password to the Comparator through the withExpectedPassword(String password) or withActualPassword(String password) methods respectively.\n\n```java\nnew PdfComparator(\"expected.pdf\", \"actual.pdf\")\n    .withExpectedPassword(\"somePwd\")\n    .withActualPassword(\"anotherPwd\")\n    .compare();\n```\n\n### Configuring PdfCompare\n\nPdfCompare can be configured with a config file. The default config file is called \"application.conf\" and it\nmust be located in the root of the classpath.\n\nPdfCompare uses Lightbend Config (previously called TypeSafe Config) to read its configuration\nfiles. If you want to specify another configuration file, you can find out more about that here:\nhttps://github.com/lightbend/config#standard-behavior. In particular you can specify a\nreplacement config file with the -Dconfig.file=path/to/file command line argument.  
\n\nAlternatively, you can specify parameters either through system environment variables or as a\nJVM parameter with -DvariableName=\u003cvalue\u003e  \n\nAnother way to specify a different config location programmatically is to create a\nnew ConfigFileEnvironment(...) and pass it to PdfComparator.withEnvironment(...).\n \n### Configuring PdfCompare through an API\n\nAll the settings that can be changed through the application.conf file can also be changed programmatically through the API.\nTo do so, you can use the following code:\n```java\nnew PdfComparator(\"expected.pdf\", \"actual.pdf\")\n\t.withEnvironment(new SimpleEnvironment()\n        .setActualColor(Color.green)\n        .setExpectedColor(Color.blue))\n\t.compare();\n```\nThe SimpleEnvironment delegates all settings that were not assigned to the default Environment.\n\n#### Configuration options\n\nThrough the environment you can configure the memory settings (see below) and the following settings:\n\n- DPI=300\n\n    Sets the DPI that Pdf pages are rendered with. Default is 300.\n    \n- expectedColor=00B400 (GREEN)\n\n    The expected color is the color that is used for pixels that were expected, but are not there.\n    The colors are specified in HTML-style format (without a leading '#'):\n    The first two characters define the red-portion of the color in hexadecimal. The next two characters define the green-portion\n    of the color. The last two characters define the blue-portion of the color to use.\n    \n- actualColor=D20000 (RED)\n\n    The actual color is the color that is used for pixels that are there, but were not expected.\n    The colors are specified in HTML-style format (without a leading '#'):\n    The first two characters define the red-portion of the color in hexadecimal. The next two characters define the green-portion\n    of the color. 
The last two characters define the blue-portion of the color to use.\n\n- tempDir=System.property(\"java.io.tmpdir\")\n\n    Sets the directory where temporary files are written. Defaults to the Java default for java.io.tmpdir, which usually determines a\n    system specific default, like /tmp on most Unix systems.\n\n- allowedDifferenceInPercentPerPage=0.2\n\n    Percent of pixels that may differ per page. Default is 0.\n    If for some reason your rendering is a little off or you allow for some error margin,\n    you can configure a percentage of pixels that are ignored during comparison.\n    That way a difference is only reported when more than the given percentage\n    of pixels differ. The percentage is calculated per page. Note that the differences\n    are still marked in the output file when you addEqualPagesToResult.\n\n- parallelProcessing=true\n\n    When set to false, disables all parallel processing and processes everything in a single thread.\n\n- addEqualPagesToResult=true\n\n    When set to false, only pages with differences are added to the result and thus to the resulting difference PDF document.\n    \n- failOnMissingIgnoreFile=false\n\n    When set to true, a missing ignore file leads to an exception. 
Otherwise it is ignored and only an info-level log message is written.\n\n### Different CompareResult Implementations\n\nThere are a few different implementations of CompareResult with different characteristics.\nThey can be used to control certain aspects of the system behaviour, in particular memory consumption.\n\n#### Internals about memory consumption\n\nIt is good to know a few internals when using PdfCompare.\nHere is, in a nutshell, what PdfCompare does when it compares two PDFs.\n\nPdfCompare uses the Apache PdfBox Library to read and write Pdfs.\n\n- The two Pdfs to compare are opened with PdfBox.\n- A page from each Pdf is read and rendered into a BufferedImage by default at 300dpi.\n- A new empty BufferedImage is created to take the result of the comparison. It has the maximum size of the expected and the actual image.\n- When the comparison is finished, the new BufferedImage, which holds the result of the comparison, is kept in memory in a CompareResult object. Holding on to the CompareResult means that the images are also kept in memory. If memory consumption is a problem, a CompareResultWithPageOverflow or a CompareResultWithMemoryOverflow can be used. Those classes store images to a temporary folder on disk when certain thresholds are reached.\n- After all pages are compared, a new Pdf is created and the images are written page by page into the new Pdf.\n\nSo comparing large Pdfs can use up a lot of memory.\nI haven't yet found a way to write the difference Pdf page by page incrementally with PdfBox, but there are some workarounds.\n\n#### CompareResults with Overflow\n\nThere are currently two different CompareResults that have different strategies for swapping pages to disk and thereby limiting memory consumption.\n- CompareResultWithPageOverflow - stores a bunch of pages into a partial Pdf and merges the resulting Pdfs in the end. 
The default is to swap every 10 pages, which is a good balance between memory usage and performance.\n- CompareResultWithMemoryOverflow - tries to keep as many images in memory as possible and swaps when a critical amount of memory is consumed by the JVM. By default, pages are swapped when 70% of the maximum available heap is filled.\n\nA different CompareResult implementation can be used as follows:\n\n```java\nnew PdfComparator(\"expected.pdf\", \"actual.pdf\", new CompareResultWithPageOverflow()).compare();\n```\n\nThere are also some internal settings for memory limits that can be changed.\nJust add a file called \"application.conf\" to the root of the classpath. This file can have some or all of the following settings to override the defaults given here:\n\n- imageCacheSizeCount=30\n\n    How many images are cached by PdfBox.\n- maxImageSizeInCache=100000\n\n    A rough maximum size of images that are cached, to prevent very big images from being cached.\n- mergeCacheSizeMB=100\n\n    When Pdfs are partially written and later merged, this is the memory cache that is configured for the PdfBox instance that does the merge.\n- swapCacheSizeMB=100\n\n    When Pdfs are partially written, this is the memory cache that is configured for the PdfBox instance that does the partial writes.\n- documentCacheSizeMB=200\n\n    This is the cache size configured for the PdfBox instance that loads the documents that are compared.\n- parallelProcessing=true\n\n    When set to false, disables all parallel processing and processes everything in a single thread.\n- overallTimeoutInMinutes=15\n\n    Sets the overall timeout. This is a safety measure to detect possible deadlocks. Complex comparisons might take longer, so this value might have to be increased.\n- executorTimeoutInSeconds=60\n\n  Sets the timeout to wait for the executors to finish after the overallTimeout was reached. 
It's unlikely that you will ever need to change this.\n\nSo in this default configuration, PdfBox should use up to 400MB of RAM for its caches before swapping to disk.\nI have had good experience with granting a 2GB heap space to the JVM.\n\n### Acknowledgements\n\nBig thanks to Chethan Rao \u003cmeetchethan@gmail.com\u003e for helping me diagnose out-of-memory problems and providing\nthe idea of partial writes and merging of the generated PDFs.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fred6%2Fpdfcompare","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fred6%2Fpdfcompare","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fred6%2Fpdfcompare/lists"}