Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nneonneo/sha1collider
Build two PDFs that have different content but identical SHA1 sums.
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/nneonneo/sha1collider
- Owner: nneonneo
- Created: 2017-02-25T03:42:35.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2021-02-10T02:27:17.000Z (over 3 years ago)
- Last Synced: 2024-04-27T23:59:31.178Z (5 months ago)
- Language: Python
- Size: 4.19 MB
- Stars: 405
- Watchers: 19
- Forks: 70
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
## About
Generate two PDFs with different contents but identical SHA1 hashes.
The input PDFs are rendered into JPGs and merged into the output file. The two inputs must have the same page size and page count.
Requires Ghostscript, turbojpeg, PIL, and Python 3.
Uses the "shattered" PDF prologue from shattered.io - credit to Marc Stevens et al. for the collision.
Similar to the collision generator from http://alf.nu/SHA1, but more flexible: it supports more than one page, arbitrary-sized inputs, etc.
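To make the rendering step above concrete, here is a minimal sketch of turning a PDF into one JPEG per page by calling Ghostscript. It is not taken from `collide.py`, which may invoke Ghostscript differently; the function names, the `page-%03d.jpg` naming pattern, and the default resolution and quality are assumptions chosen for illustration.

```python
import subprocess
from pathlib import Path


def ghostscript_jpeg_command(pdf_path, out_dir, dpi=150, quality=90):
    """Build a Ghostscript command that renders each PDF page to a JPEG.

    Hypothetical helper: the output naming pattern and defaults are
    illustrative, not the repository's actual settings.
    """
    out_pattern = str(Path(out_dir) / "page-%03d.jpg")
    return [
        "gs",
        "-dSAFER",            # restrict file access during rendering
        "-dBATCH",            # exit after processing the input
        "-dNOPAUSE",          # do not wait for a keypress between pages
        "-sDEVICE=jpeg",      # JPEG output device
        f"-r{dpi}",           # rendering resolution in DPI
        f"-dJPEGQ={quality}",  # JPEG quality, 0-100
        f"-sOutputFile={out_pattern}",
        str(pdf_path),
    ]


def render_pdf_to_jpegs(pdf_path, out_dir, dpi=150, quality=90):
    """Run Ghostscript and return the rendered page images in page order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ghostscript_jpeg_command(pdf_path, out_dir, dpi, quality),
        check=True,
    )
    return sorted(Path(out_dir).glob("page-*.jpg"))
```

Rendering both inputs this way and comparing the lengths of the two returned lists is one simple way to confirm the page counts match before attempting a merge.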
## Usage
Run `python3 collide.py PDF1.pdf PDF2.pdf` to generate `out-PDF1.pdf` and `out-PDF2.pdf`. Each output shows the same content as its corresponding input PDF, but the two outputs share the same SHA1 hash. If the resulting PDFs don't work for you (e.g. they look corrupt or the images have artifacts), try `--progressive` mode.
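To confirm that a run worked, you can check that the two output files differ byte-for-byte yet hash to the same SHA1 digest. The following is a small sketch using only the standard library; the helper names are hypothetical and not part of `collide.py`.

```python
import hashlib
from pathlib import Path


def sha1_hex(data: bytes) -> str:
    """Return the SHA1 digest of data as a hex string."""
    return hashlib.sha1(data).hexdigest()


def is_collision(a: bytes, b: bytes) -> bool:
    """True only if the inputs differ but share the same SHA1 digest."""
    return a != b and sha1_hex(a) == sha1_hex(b)


def files_collide(path_a, path_b) -> bool:
    """Apply is_collision to the contents of two files."""
    return is_collision(Path(path_a).read_bytes(), Path(path_b).read_bytes())


# Example (file names follow the pattern described in the Usage section):
# print(files_collide("out-PDF1.pdf", "out-PDF2.pdf"))
```

A result of `False` means either that the files are identical or that their hashes differ, so it is worth printing both digests when debugging.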
## Remarks
There are two encoding modes: a more flexible "restart interval" mode and a more compatible "progressive" mode, selected with the `--progressive` flag.
Restart intervals allow the image data to be reliably broken up into small chunks. However, some PDF renderers, such as my version of Ghostscript, cannot parse the resulting JPEG correctly (as it has comments preceding the restart markers).
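For readers unfamiliar with restart intervals: in a JPEG stream, restart markers are the two-byte codes `FF D0` through `FF D7`, and they split the entropy-coded image data into independently decodable chunks. The sketch below counts them with a naive byte scan. It is an illustration only, not code from this repository, and because it does not parse segment boundaries it can overcount if the same byte pairs happen to occur inside a header or comment payload.

```python
def count_restart_markers(jpeg_bytes: bytes) -> int:
    """Count RST0..RST7 markers (FF D0 .. FF D7) with a naive byte scan.

    Inside entropy-coded data a literal 0xFF byte is always followed by a
    stuffed 0x00, so an FF D0..D7 pair there is a genuine restart marker.
    Header and comment payloads are not excluded by this simple scan.
    """
    count = 0
    for i in range(len(jpeg_bytes) - 1):
        if jpeg_bytes[i] == 0xFF and 0xD0 <= jpeg_bytes[i + 1] <= 0xD7:
            count += 1
    return count


# Usage: count_restart_markers(open("page.jpg", "rb").read())
```

A count of zero on a rendered page suggests the encoder did not emit restart intervals for that image.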
Progressive mode works with many smaller PDFs (at lower resolution, for example), but breaks down with larger images. However, it produces PDFs that are broadly compatible because it does not involve bending the JPEG spec. This is the mode used by Google and CWI in generating their own PoC PDF pair.
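If you want to check which kind of JPEG an image actually is, the frame header marker tells you: `FF C0` (SOF0) marks a baseline image and `FF C2` (SOF2) a progressive one. The following sketch walks the marker segments that precede the first scan and reports the frame type. The function name is hypothetical and the parser is deliberately minimal; it assumes a well-formed file in which every segment before the scan carries a length field.

```python
import struct


def jpeg_frame_type(data: bytes) -> str:
    """Return "baseline", "progressive", or "other" for a JPEG byte string."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:      # fill byte before a marker, skip it
            pos += 1
            continue
        if marker == 0xC0:      # SOF0: baseline DCT
            return "baseline"
        if marker == 0xC2:      # SOF2: progressive DCT
            return "progressive"
        if marker == 0xDA:      # SOS: image data starts, stop looking
            break
        # Every other segment here has a 2-byte big-endian length that
        # counts itself but not the 2-byte marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
    return "other"
```

Running this on the images embedded in an output PDF is a quick way to confirm whether `--progressive` took effect.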