https://github.com/JoshData/pdf-diff
A PDF comparison utility in Python.
- Host: GitHub
- URL: https://github.com/JoshData/pdf-diff
- Owner: JoshData
- License: cc0-1.0
- Created: 2014-06-09T23:36:52.000Z (over 10 years ago)
- Default Branch: primary
- Last Pushed: 2024-06-13T09:10:50.000Z (6 months ago)
- Last Synced: 2024-10-29T15:49:10.343Z (about 2 months ago)
- Language: Python
- Homepage:
- Size: 261 KB
- Stars: 450
- Watchers: 13
- Forks: 66
- Open Issues: 24
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- awesome-pdf - pdf-diff
# pdf-diff
Finds differences between two PDF documents:
1. Compares the text layers of two PDF documents and outputs the bounding boxes of changed text in JSON.
2. Rasterizes the changed pages in the PDFs to a PNG and draws red outlines around changed text.

![Example Image Output](example.png)

The script is written in Python 3, and it relies on the `pdftotext` program.
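To make the first step more concrete, here is a minimal sketch of how a text-layer comparison with bounding boxes could work. It is not taken from this project's source. It assumes the word-level XHTML that `pdftotext -bbox` produces (`page` elements containing `word` elements with `xMin`, `yMin`, `xMax`, `yMax` attributes), and the function names `words_with_boxes` and `changed_boxes` are hypothetical.

```python
import difflib
import xml.etree.ElementTree as ET


def words_with_boxes(bbox_xhtml):
    """Parse `pdftotext -bbox` style XHTML into a list of word records.

    Hypothetical helper for illustration; not part of pdf-diff's API.
    """
    root = ET.fromstring(bbox_xhtml)
    words = []
    page_no = 0
    for el in root.iter():  # document order, so a page precedes its words
        tag = el.tag.rsplit("}", 1)[-1]  # drop any XML namespace prefix
        if tag == "page":
            page_no += 1
        elif tag == "word":
            words.append({
                "page": page_no,
                "text": el.text or "",
                "box": [float(el.get(k)) for k in ("xMin", "yMin", "xMax", "yMax")],
            })
    return words


def changed_boxes(before, after):
    """Return the word records on either side that fall outside matching runs."""
    matcher = difflib.SequenceMatcher(
        a=[w["text"] for w in before],
        b=[w["text"] for w in after],
        autojunk=False,
    )
    changes = []
    for op, a0, a1, b0, b1 in matcher.get_opcodes():
        if op == "equal":
            continue
        changes += [dict(w, side="before") for w in before[a0:a1]]
        changes += [dict(w, side="after") for w in after[b0:b1]]
    return changes
```

The list returned by `changed_boxes` can be passed to `json.dumps` to get a JSON list of changed words with their page numbers and boxes. The actual output format of `pdf-diff` may differ from this sketch.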
## Requirements
libxml2 >= 2.7.0, libxslt >= 1.1.23, poppler
## Requirements installation for Ubuntu:

    sudo apt-get install python3-lxml poppler-utils

## Requirements installation for OS X:

    brew install libxml2 libxslt poppler

## Installation

From PyPI:

    pip install pdf-diff

From source:

    sudo python3 setup.py install

## Running

Turn two PDFs into one large PNG image showing the differences:

    pdf-diff before.pdf after.pdf > comparison_output.png

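The same command can be driven from a Python script when shell redirection is inconvenient. The following is a sketch under the assumption that `pdf-diff` is on your `PATH` and writes the PNG to standard output, as the command above implies. The helper name `save_comparison` and its `command` parameter are hypothetical and exist only for this example.

```python
import subprocess


def save_comparison(before_pdf, after_pdf, out_png, command=("pdf-diff",)):
    """Run the comparison command and write its stdout to out_png.

    `command` is overridable so the helper can be pointed at a different
    executable; it is an assumption of this sketch, not a pdf-diff option.
    """
    with open(out_png, "wb") as out:
        # check=True raises CalledProcessError if the command exits non-zero
        subprocess.run([*command, before_pdf, after_pdf], stdout=out, check=True)
    return out_png


if __name__ == "__main__":
    save_comparison("before.pdf", "after.pdf", "comparison_output.png")
```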
## Maintainer Notes
To deploy:

    python3 -m pip install --user --upgrade setuptools wheel twine
    python3 setup.py sdist bdist_wheel
    python3 -m twine upload dist/*