https://github.com/nymann/pdf-scrub

Host: GitHub
URL: https://github.com/nymann/pdf-scrub
Owner: nymann
Created: 2022-08-27T17:36:38.000Z (almost 4 years ago)
Default Branch: master
Last Pushed: 2022-08-28T14:17:38.000Z (almost 4 years ago)
Last Synced: 2025-03-28T21:06:49.878Z (about 1 year ago)
Language: Python
Size: 11.7 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# PDF Scrub

_Scrubs encrypted compressed PDF files for text watermarks and metadata._

1. Decrypts the PDF if it's encrypted
2. Uncompresses the PDF
3. Removes metadata (Xpacket)
4. Naively tries to remove text-based watermarks by matching objects whose number of occurrences equals the PDF page count. If multiple objects match, it produces a PDF for each.
5. Compresses the PDF again unless `--no-compress` is given as a command line argument.
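
The heuristic in step 4 can be illustrated with a minimal sketch. This is not the project's actual code: the function name `watermark_candidates` and the flat list of text strings are hypothetical, chosen only to show the "occurs once per page" matching idea.

```python
from collections import Counter


def watermark_candidates(text_objects, page_count):
    """Return text objects whose occurrence count equals the page count.

    `text_objects` is a hypothetical flat list of text strings pulled from
    the uncompressed PDF; the real tool may represent objects differently.
    """
    counts = Counter(text_objects)
    return sorted(text for text, n in counts.items() if n == page_count)


# A three-page document in which one string repeats once per page.
objects = [
    "Chapter 1", "Licensed to Jane Doe",
    "Some body text", "Licensed to Jane Doe",
    "More body text", "Licensed to Jane Doe",
]
candidates = watermark_candidates(objects, page_count=3)
print(candidates)
```

Because the match is purely count-based, any legitimate string that happens to appear exactly once per page (a running header, for example) would also be flagged, which is presumably why the tool writes one output PDF per matching object rather than removing them all at once.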

## Usage

```sh
$ pdf_scrub --help
Usage: pdf_scrub [OPTIONS] FILES...

Arguments:
FILES... [required]

Options:
--compress / --no-compress Compress the final pdf to reduce file size greatly [default: compress]
```

## Dependencies

Requires `qpdf` and `pdftk`.
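
Before running the tool, it may help to confirm that both executables are on your `PATH`. The sketch below is an assumption-laden convenience check, not part of `pdf_scrub`; the helper name `missing_tools` is hypothetical.

```python
import shutil


def missing_tools(tools=("qpdf", "pdftk"), which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required tools: " + ", ".join(missing))
    else:
        print("All required tools found.")
```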

## Development

For help getting started with development, see [DEVELOPMENT.md](DEVELOPMENT.md).
