https://github.com/laguileracl/pdf-ultra-compressor
Command-line, quality-first PDF optimizer. Drop PDFs into input/, get optimized results in output/. Ghostscript + qpdf with optional PSNR quality gate and a never-worse guarantee.
- Host: GitHub
- URL: https://github.com/laguileracl/pdf-ultra-compressor
- Owner: laguileracl
- License: other
- Created: 2025-09-10T17:48:12.000Z (9 months ago)
- Default Branch: master
- Last Pushed: 2025-09-11T01:21:55.000Z (9 months ago)
- Last Synced: 2026-04-12T06:38:52.179Z (about 2 months ago)
- Topics: cli, compression, ghostscript, linux, macos, optimizer, pdf, psnr, qpdf
- Language: Python
- Size: 23.3 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
- Support: SUPPORT.md
# 🚀 PDF Ultra Compressor
Command-line, quality-first PDF optimizer for text- and image-heavy PDFs. Drop files into `input/`, get optimized results in `output/`. Focus: maximum size reduction without perceptible quality loss, with strict “never worse” guards. See `docs/` for more details. For longer docs, visit the [Wiki](https://github.com/laguileracl/pdf-ultra-compressor/wiki) — quick links: [Home](https://github.com/laguileracl/pdf-ultra-compressor/wiki), [Usage](https://github.com/laguileracl/pdf-ultra-compressor/wiki/Usage), [Quality Gates](https://github.com/laguileracl/pdf-ultra-compressor/wiki/Quality-Gates), [Roadmap](https://github.com/laguileracl/pdf-ultra-compressor/wiki/Roadmap).
Keywords: pdf compression, pdf optimizer, ghostscript, qpdf, ocr, jbig2, jpeg2000, lossless, high quality, macos, linux, ci, command line
## Features
- Drop-in folder workflow: put PDFs in `input/`, get results in `output/`.
- Multi-pass strategy: Ghostscript (prepress/printer/ebook) + qpdf.
- Quality-first scoring with “never worse” safeguard (copies original if no gain).
- Optional perceptual quality gate (PSNR) to prevent visible degradation.
- Anonymous telemetry (opt-out) records technical, privacy-safe metrics to improve algorithms. Disable with `--disable-telemetry`.
- New anti-noise mode to suppress artifacts on optimized PDFs (text/gray-safe filters and optional grayscale). Enable with `--anti-noise`.
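The PSNR quality gate mentioned above can be illustrated with a minimal sketch. It assumes both pages have already been rendered to equal-length sequences of 8-bit pixel values. The function names and the 40 dB threshold are hypothetical and not taken from `compressor.py`.

```python
import math


def psnr(reference, candidate, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical renderings
    return 10.0 * math.log10(max_value ** 2 / mse)


def passes_gate(reference, candidate, min_psnr_db=40.0):
    """Hypothetical gate: accept a candidate only if PSNR meets the threshold."""
    return psnr(reference, candidate) >= min_psnr_db
```

Higher PSNR means the optimized rendering is closer to the original, so a gate of this kind rejects candidates that fall below the chosen threshold.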
## Highlights
- 🎯 Smart multi-pass pipeline: Ghostscript + qpdf
- 🧠 Quality-first scoring: selects the best candidate (size vs. visual safety)
- 📂 Zero-config workflow: `input/` → `output/` (processed originals are moved to `input/processed/`)
- 🧹 Structural cleanup and linearization when possible
- 🛡️ Never-worse guarantee: falls back to original if not improved
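As a rough illustration of candidate selection with a never-worse fallback, the sketch below picks the smallest candidate that is strictly smaller than the original and, when a gate is configured, meets a minimum PSNR. The `Candidate` class, `pick_best`, and the scoring rule are assumptions made for this example. The project's actual scoring may weigh size and quality differently.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Candidate:
    name: str                         # e.g. the pass that produced the file
    size_bytes: int
    psnr_db: Optional[float] = None   # None when quality was not measured


def pick_best(original_size: int,
              candidates: Iterable[Candidate],
              min_psnr_db: Optional[float] = None) -> Optional[Candidate]:
    """Return the smallest acceptable candidate, or None to keep the original."""
    best = None
    for cand in candidates:
        if cand.size_bytes >= original_size:
            continue  # never worse: ignore anything that is not smaller
        if (min_psnr_db is not None and cand.psnr_db is not None
                and cand.psnr_db < min_psnr_db):
            continue  # fails the optional quality gate
        if best is None or cand.size_bytes < best.size_bytes:
            best = cand
    return best
```

A return value of `None` corresponds to the fallback case, where the original file is copied to the output unchanged.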
## Quick Start (macOS)
Install system tools (recommended):
```bash
brew install ghostscript qpdf
```
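Before running the optimizer, it can help to confirm that both tools are on `PATH`. The helper below is a small sketch and not part of the project. It assumes the Ghostscript executable is named `gs`, as it is on macOS and Linux.

```python
import shutil


def missing_tools(tools=("gs", "qpdf"), which=shutil.which):
    """Return the names of required executables that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("Ghostscript and qpdf are available.")
```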
Then run:
```bash
# Put PDFs in input/
cp ~/Downloads/my.pdf input/
# Run the compressor (English v1)
python3 compressor.py
# Results in output/
ls output/
```
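The sketch below shows how a Ghostscript pass followed by a qpdf pass could be assembled as command lines. The exact flags used by `compressor.py` are not documented here, so the function names and flag choices are assumptions. The flags themselves are standard Ghostscript `pdfwrite` and qpdf options.

```python
# Presets named in the Features list; Ghostscript defines others as well.
PRESETS = ("prepress", "printer", "ebook")


def ghostscript_cmd(src, dst, preset="ebook"):
    """Build a Ghostscript pdfwrite command for one preset (assumed flag set)."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        f"-dPDFSETTINGS=/{preset}",
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={dst}",
        str(src),
    ]


def qpdf_cmd(src, dst):
    """Build a qpdf command for structural cleanup and linearization."""
    return ["qpdf", "--linearize", "--object-streams=generate", str(src), str(dst)]
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Building the commands as lists avoids shell quoting problems with file names that contain spaces.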
Telemetry is enabled by default and stores anonymized, technical-only data in `telemetry_data/` locally. To opt out:
```bash
python3 compressor.py --disable-telemetry
```
To reduce compression artifacts/noise in the output (helpful for scanned text docs):
```bash
python3 compressor.py --anti-noise
```
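For readers curious how the two flags might be parsed, here is a minimal `argparse` sketch. Only the flag names come from this README. The help text and the `build_parser` function are illustrative assumptions, not the project's actual code.

```python
import argparse


def build_parser():
    """Parser exposing the two documented flags (help text is illustrative)."""
    parser = argparse.ArgumentParser(
        prog="compressor.py",
        description="Optimize PDFs from input/ into output/.",
    )
    parser.add_argument(
        "--disable-telemetry",
        action="store_true",
        help="do not record local telemetry",
    )
    parser.add_argument(
        "--anti-noise",
        action="store_true",
        help="apply filters that reduce compression artifacts",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.disable_telemetry, args.anti_noise)
```

Both flags default to `False`, which matches the README's description of telemetry being on and anti-noise being off unless requested.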
## Folder Layout
```
pdf-ultra-compressor/
├─ input/ # Place PDFs here
│ └─ processed/ # Processed originals are moved here
├─ output/ # Optimized PDFs are written here
├─ compressor.py # Primary CLI optimizer (English v1)
├─ ci/ # Smoke test
├─ install_tools.sh # macOS helper to install ghostscript & qpdf
└─ docs & meta
```
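The folder workflow in the layout above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation. `process_folder` and the `optimize` callback are hypothetical names, and `optimize(src, dst)` stands in for the real Ghostscript and qpdf passes.

```python
import shutil
from pathlib import Path


def process_folder(root, optimize):
    """Optimize input/*.pdf into output/, then move originals to input/processed/."""
    root = Path(root)
    input_dir = root / "input"
    output_dir = root / "output"
    processed_dir = input_dir / "processed"
    output_dir.mkdir(parents=True, exist_ok=True)
    processed_dir.mkdir(parents=True, exist_ok=True)

    written = []
    # glob("*.pdf") is not recursive, so input/processed/ is never re-scanned.
    for src in sorted(input_dir.glob("*.pdf")):
        dst = output_dir / src.name
        optimize(src, dst)
        # Never worse: fall back to the original if nothing smaller was produced.
        if not dst.exists() or dst.stat().st_size >= src.stat().st_size:
            shutil.copy2(src, dst)
        shutil.move(str(src), str(processed_dir / src.name))
        written.append(dst)
    return written
```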
## Typical Results
- Scanned documents: 40–70% reduction
- Image-heavy PDFs: 30–60% reduction
- Mostly text PDFs: 10–30% reduction
- Visual quality: preserved; never-worse guarantee (PSNR gate optional)
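The percentages above are size reductions relative to the original file. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def reduction_percent(original_bytes, optimized_bytes):
    """Size reduction as a percentage of the original size."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    return (original_bytes - optimized_bytes) * 100.0 / original_bytes
```

For example, a 1000-byte file optimized to 400 bytes is a 60% reduction. A result of 0% corresponds to the never-worse fallback, where the original is kept.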
## Roadmap
- Add OCRmyPDF + JBIG2 for scanned PDFs (MRC-style pipeline)
- Perceptual quality gates with SSIM/LPIPS (PSNR already available)
## Contributing
Contributions are welcome! Please read `CONTRIBUTING.md` and open an issue or pull request.
## License
MIT — see `LICENSE`.
## Community & Discussions
Have questions, feature ideas, or want to share results? Join the project Discussions: https://github.com/laguileracl/pdf-ultra-compressor/discussions
- Announcements: pinned “Welcome & Roadmap”
- Q&A: ask questions
- Ideas: feature proposals