https://github.com/bronson/pdfdir
Utilities to operate on lots of PDF files
- Host: GitHub
- URL: https://github.com/bronson/pdfdir
- Owner: bronson
- Created: 2009-03-26T18:56:51.000Z (over 16 years ago)
- Default Branch: master
- Last Pushed: 2021-04-02T22:37:27.000Z (over 4 years ago)
- Last Synced: 2025-04-09T15:00:52.207Z (6 months ago)
- Topics: pdf, pdf-files
- Language: Shell
- Homepage:
- Size: 104 KB
- Stars: 24
- Watchers: 3
- Forks: 4
- Open Issues: 1
Metadata Files:
- Readme: README.md
## pdfdir
Turns a directory tree of PDFs into a single bookmarked PDF.
Automatically handles the table of contents. Tested on Linux and Mac.
### Usage
If you arrange your PDF files in folders like this:

    book/01-Table of Contents.pdf
    book/02-First Generation/01-Mary Cunningham.pdf
    book/02-First Generation/02-Peter Cunningham.pdf
    book/02-First Generation/02-:more-notes.pdf
    book/03-Second Generation/01-John Mendell Cunningham.pdf
    book/99-Index.pdf

and run:

    $ pdfdir-join book

you will find the result in "book.pdf".
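The filenames in this layout double as ordering keys and bookmark titles. As a rough sketch of that convention, the function below derives a title from one path. `bookmark_title` is a hypothetical helper written for illustration, not pdfdir's own code, and it assumes the ordering prefix is a run of digits followed by `-`.

```shell
# Hypothetical helper, not part of pdfdir.
# Prints the bookmark title for a file or folder name, or prints nothing and
# returns 1 when the name (after its ordering prefix) starts with ":".
bookmark_title() {
  local name prefix
  name="${1##*/}"          # keep only the last path component
  name="${name%.pdf}"      # drop the .pdf extension, if any
  prefix="${name%%-*}"     # text before the first "-"
  case "$prefix" in
    ''|*[!0-9]*) ;;                # not a purely numeric prefix: keep the name as is
    *) name="${name#*-}" ;;        # strip "NN-" (assumed prefix format)
  esac
  case "$name" in
    :*) return 1 ;;                # ":" marks a file to keep out of the TOC
  esac
  printf '%s\n' "$name"
}
```

For example, `bookmark_title "book/02-First Generation/01-Mary Cunningham.pdf"` prints `Mary Cunningham`, while `bookmark_title "02-:more-notes.pdf"` prints nothing and returns a non-zero status.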
The PDF's table of contents will be automatically generated from the filenames:

    Table of Contents
    First Generation
        Mary Cunningham
        Peter Cunningham
    Second Generation
        John Mendell Cunningham
    Index

The `01-`, `02-` prefixes determine the order of the chapters in the
final book and don't appear in the bookmarks.

If you don't want a file to be added to the TOC, adding a `:` to the beginning
of its filename will suppress it (`02-:more-notes.pdf` above).

### Prerequisites
    MacOS: brew install ghostscript
    Linux: apt-get install ghostscript

And also Ruby. Hopefully this is temporary.
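If it helps to confirm both prerequisites before running anything, a small check along these lines may do. `missing_tools` is a hypothetical helper, not something pdfdir ships. It assumes Ghostscript is installed under its usual executable name `gs` and Ruby as `ruby`.

```shell
# Hypothetical helper, not part of pdfdir.
# Prints the names of the given commands that are not on PATH (space-separated)
# and returns non-zero if any are missing.
missing_tools() {
  local tool missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  printf '%s\n' "${missing# }"   # drop the leading space before printing
  [ -z "$missing" ]
}
```

Running `missing_tools gs ruby` prints an empty line when both tools are found, and otherwise the names of the ones that are missing.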
### Verify PDFs
This package also includes some tools to help assemble the input files.
This will find corrupt PDFs:

    $ pdfdir-verify book

It uses Ghostscript to carefully process every page of every PDF file.
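To give a feel for what a Ghostscript-based check can look like, here is a minimal sketch. It is not pdfdir-verify's actual implementation, which this README does not show. The function name `find_bad_pdfs` and the `GS` override variable are made up for illustration, and the choice of the `nullpage` device and `-dPDFSTOPONERROR` is an assumption.

```shell
# Illustrative sketch, not pdfdir-verify itself.
GS="${GS:-gs}"   # Ghostscript executable; the override variable is hypothetical

# Print the path of every PDF under "$1" for which Ghostscript exits non-zero.
find_bad_pdfs() {
  find "$1" -type f -name '*.pdf' | sort | while IFS= read -r pdf; do
    # nullpage interprets every page and discards the output;
    # -dPDFSTOPONERROR makes Ghostscript stop with an error instead of
    # printing a warning and carrying on.
    # </dev/null keeps Ghostscript from consuming the file list on stdin.
    if ! "$GS" -q -dNOPAUSE -dBATCH -dPDFSTOPONERROR -sDEVICE=nullpage \
         "$pdf" >/dev/null 2>&1 </dev/null; then
      printf '%s\n' "$pdf"
    fi
  done
}
```

Calling `find_bad_pdfs book` lists the files that failed. The sketch relies only on Ghostscript's exit status, so it may be more or less strict than the real tool.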
This full check is awfully slow. You can specify `--quick` for a 10X speedup
at the risk of missing some obscure corruptions.

### Re-encode PDFs
If you're having trouble with encrypted or corrupt PDFs, try using
pdfdir-copy to duplicate your entire directory structure. It takes
a while but, because it re-encodes each PDF, the result is sure to
be valid.

    $ pdfdir-copy book /tmp/book-fixed
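For readers curious what re-encoding a whole tree involves, the sketch below mirrors a directory and rewrites each PDF through Ghostscript's `pdfwrite` device. It is a stand-in under stated assumptions, not pdfdir-copy's real code. `reencode_tree` and the `GS` override variable are hypothetical names, and whether pdfdir-copy uses `pdfwrite` is not stated in this README.

```shell
# Illustrative sketch, not pdfdir-copy itself.
GS="${GS:-gs}"   # Ghostscript executable; the override variable is hypothetical

# Re-encode every PDF under source dir "$1" to the same relative path under "$2".
reencode_tree() {
  local src dst pdf rel
  src="${1%/}"
  dst="${2%/}"
  find "$src" -type f -name '*.pdf' | while IFS= read -r pdf; do
    rel=${pdf#"$src"/}                     # path relative to the source root
    mkdir -p "$(dirname "$dst/$rel")"      # recreate the folder structure
    # -o names the output file and implies -dBATCH -dNOPAUSE.
    "$GS" -q -o "$dst/$rel" -sDEVICE=pdfwrite "$pdf" </dev/null ||
      printf 'failed: %s\n' "$pdf" >&2
  done
}
```

With this sketch, `reencode_tree book /tmp/book-fixed` would play the role of the `pdfdir-copy` command above. Files that Ghostscript cannot process are reported on stderr rather than silently skipped.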