Turns a directory tree of PDFs into a single bookmarked PDF. Automatically handles the table of contents.
Tested on Linux and Mac.
If you arrange your PDF files in folders like this:
book/01-Table of Contents.pdf
book/02-First Generation/01-Mary Cunningham.pdf
book/02-First Generation/02-Peter Cunningham.pdf
book/02-First Generation/02-:more-notes.pdf
book/03-Second Generation/01-John Mendell Cunningham.pdf
book/99-Index.pdf
and run:
$ pdfdir-join book
you will find the result in "book.pdf".
The PDF's table of contents will be automatically generated from the filenames:
Table of Contents
First Generation
    Mary Cunningham
    Peter Cunningham
Second Generation
    John Mendell Cunningham
Index
The 01-, 02- prefixes determine the order of the chapters in the final book and don't appear in the bookmarks.
If you don't want a file to be added to the TOC, add a : to the beginning of its name, right after the numeric prefix (see 02-:more-notes.pdf above).
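The naming rules above can be sketched in a few lines of Python. This is only an illustration of the convention, not the code pdfdir-join actually runs: the function names bookmark_title and outline are hypothetical, and the prefix is assumed to be one or more digits followed by a hyphen.

```python
import re
from pathlib import PurePosixPath

# Assumption: a prefix is one or more digits followed by a hyphen, e.g. "01-".
PREFIX = re.compile(r"^\d+-")


def bookmark_title(name):
    """Return the bookmark title for one file or folder name, or None if hidden."""
    stem = name[:-4] if name.lower().endswith(".pdf") else name
    stem = PREFIX.sub("", stem)
    if stem.startswith(":"):
        return None  # a leading ":" keeps the file out of the TOC
    return stem


def outline(paths, root="book"):
    """Build (depth, title) bookmark entries from PDF paths under root."""
    entries = []
    seen_dirs = set()
    # Sorting by path components puts chapters in prefix order.
    for path in sorted(paths, key=lambda p: PurePosixPath(p).parts):
        parts = PurePosixPath(path).relative_to(root).parts
        for depth, part in enumerate(parts):
            is_file = depth == len(parts) - 1
            if not is_file:
                key = parts[: depth + 1]
                if key in seen_dirs:
                    continue  # this folder already has its bookmark
                seen_dirs.add(key)
            title = bookmark_title(part)
            if title is not None:
                entries.append((depth, title))
    return entries
```

Fed the example tree, outline yields "First Generation" at depth 0 with "Mary Cunningham" and "Peter Cunningham" at depth 1, and skips 02-:more-notes.pdf entirely.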
Requires Ghostscript:
macOS: brew install ghostscript
Linux: apt-get install ghostscript
Ruby is also required. Hopefully this is temporary.
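If you want to confirm both dependencies are installed before running anything, a small check like the following may help. It is a sketch that assumes the executables are named gs and ruby on your PATH; the helper name missing_tools is made up for this example.

```python
import shutil

# Assumption: Ghostscript is installed as "gs" and Ruby as "ruby".
REQUIRED_TOOLS = ["gs", "ruby"]


def missing_tools(names=REQUIRED_TOOLS):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("Ghostscript and Ruby are both available.")
```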
This package also includes some tools to help assemble the input files. This will find corrupt PDFs:
$ pdfdir-verify book
It uses Ghostscript to carefully process every page of every PDF file. This is awfully slow. You can specify --quick for a 10X speedup at the risk of missing some obscure corruptions.
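To give a rough idea of what a full Ghostscript pass over a file looks like, here is a sketch that renders every page to Ghostscript's nullpage device, which interprets the pages and discards the output. This is an assumption about one reasonable way to do such a check, not a description of how pdfdir-verify is implemented, and it does not attempt to reproduce --quick. The function names are hypothetical, and treating any output as a problem is a deliberately conservative heuristic.

```python
import subprocess


def gs_check_command(pdf_path):
    """Command that has Ghostscript interpret every page and discard the result."""
    return ["gs", "-dNOPAUSE", "-dBATCH", "-dQUIET", "-sDEVICE=nullpage", str(pdf_path)]


def looks_valid(pdf_path):
    """Heuristic: a clean run exits with status 0 and prints nothing at all."""
    result = subprocess.run(
        gs_check_command(pdf_path), capture_output=True, text=True
    )
    # Ghostscript may repair a damaged file and still exit 0, so any
    # message on stdout or stderr is treated as a sign of trouble.
    return result.returncode == 0 and not (result.stdout + result.stderr).strip()
```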
If you're having trouble with encrypted or corrupt PDFs, try using pdfdir-copy to duplicate your entire directory structure. It takes a while but, because it re-encodes each PDF, the result is sure to be valid.
$ pdfdir-copy book /tmp/book-fixed
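The idea of re-encoding a whole tree can be sketched as follows: walk the source directory, mirror each path under the destination, and have Ghostscript rewrite the file through its pdfwrite device. This is an assumed approach for illustration only; pdfdir-copy may work differently, and the names mirrored_path, gs_rewrite_command and copy_tree are invented here.

```python
import subprocess
from pathlib import Path


def mirrored_path(src_root, dst_root, pdf_path):
    """Map a file under src_root to the same relative location under dst_root."""
    return Path(dst_root) / Path(pdf_path).relative_to(src_root)


def gs_rewrite_command(src_pdf, dst_pdf):
    """Command that has Ghostscript re-encode src_pdf into dst_pdf."""
    return [
        "gs", "-dNOPAUSE", "-dBATCH", "-dQUIET",
        "-sDEVICE=pdfwrite", f"-sOutputFile={dst_pdf}", str(src_pdf),
    ]


def copy_tree(src_root, dst_root):
    """Re-encode every PDF under src_root into a mirrored tree under dst_root."""
    for src_pdf in sorted(Path(src_root).rglob("*.pdf")):
        dst_pdf = mirrored_path(src_root, dst_root, src_pdf)
        dst_pdf.parent.mkdir(parents=True, exist_ok=True)
        # check=True stops at the first file Ghostscript cannot process.
        subprocess.run(gs_rewrite_command(src_pdf, dst_pdf), check=True)
```

Under these assumptions, copy_tree("book", "/tmp/book-fixed") would do roughly what the command above does.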