PDF Joiner — What “combine files” actually means
A user drops a docx, three iPhone photos, a TIFF scan, a DjVu book, and an existing PDF into the upload box and clicks Combine. One PDF comes back. Behind the button is a pipeline with a per-format converter for each input type, funneling into a single merge step.
The two-stage pipeline
First, each input file is converted into its own temporary PDF. The
conversion path depends on the format: office documents go through
LibreOffice, raster images through ImageMagick (magick),
DjVu through ddjvu, SVG through rsvg-convert.
A PDF input skips this stage entirely.
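The dispatch described above can be sketched as a function that maps a file extension to a converter command line. This is only an illustration under stated assumptions: the name stage_one_argv, the extension sets, and the temporary-directory layout are hypothetical, and only the four converters named above are covered.

```python
from pathlib import Path

# Hypothetical extension sets; a real combiner would likely sniff file contents too.
OFFICE = {".doc", ".docx", ".rtf", ".odt", ".xls", ".xlsx", ".ods", ".ppt", ".pptx", ".odp"}
RASTER = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp", ".gif", ".webp"}


def stage_one_argv(src, tmp_dir):
    """Return the converter command for src, or None if src is already a PDF."""
    path = Path(src)
    ext = path.suffix.lower()
    out = str(Path(tmp_dir) / (path.stem + ".pdf"))
    if ext == ".pdf":
        return None  # PDF inputs skip stage one
    if ext in OFFICE:
        # LibreOffice picks the output name itself: <stem>.pdf inside --outdir.
        return ["soffice", "--headless", "--convert-to", "pdf", "--outdir", tmp_dir, src]
    if ext in RASTER:
        return ["magick", src, out]
    if ext == ".djvu":
        return ["ddjvu", "-format=pdf", src, out]
    if ext == ".svg":
        return ["rsvg-convert", "-f", "pdf", "-o", out, src]
    raise ValueError(f"unsupported input format: {ext or src}")
```

Returning an argument list instead of running the command keeps the dispatch testable without any of the converters installed; a caller would hand the list to subprocess.run.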
Then the temporary PDFs are concatenated:
qpdf --empty --pages file1.pdf file2.pdf file3.pdf -- output.pdf
A separate pass with exiftool writes the title, author,
and creation date into the result. An optional pass renders a thumbnail
of the first page for the UI.
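The merge and metadata passes can be sketched as two small command builders. The function names and the choice of -overwrite_original are assumptions made for illustration; the qpdf invocation mirrors the one shown above, and Title, Author, and CreateDate are the ExifTool tag names for the corresponding PDF Info fields.

```python
from datetime import datetime


def merge_argv(parts, output):
    """Build the qpdf command that concatenates the temporary PDFs in order."""
    if not parts:
        raise ValueError("nothing to merge")
    return ["qpdf", "--empty", "--pages", *parts, "--", output]


def metadata_argv(pdf, title, author, created):
    """Build the exiftool command that stamps title, author, and creation date."""
    stamp = created.strftime("%Y:%m:%d %H:%M:%S")  # ExifTool's date format
    return [
        "exiftool",
        "-overwrite_original",  # assumption: no *_original backup copy is wanted
        f"-Title={title}",
        f"-Author={author}",
        f"-CreateDate={stamp}",
        pdf,
    ]
```

For example, merge_argv(["file1.pdf", "file2.pdf"], "output.pdf") yields the same argument list as the command line above with two inputs.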
Almost all of the engineering effort lives in stage one.
Why everything goes through PDF
A direct path from JPEG to a final-PDF page is possible, but doing that for twenty input formats means twenty separate code paths into the final document. Each path has to handle paging, metadata, fonts, and color management on its own.
Routing every format through a per-format converter that emits a temporary PDF collapses that into one shared back end. The merge step sees only PDFs, and qpdf, ImageMagick, and exiftool plug straight in.
What gets accepted
A typical combiner accepts:
Raster: JPEG, PNG, TIFF (multi-page), BMP, GIF (first frame only), WebP (animated allowed, first frame used), AVIF, HEIC/HEIF (the iPhone default since 2017).
Vector: SVG.
Office: Word (.doc, .docx, .rtf, .odt), Excel (.xls, .xlsx, .ods), PowerPoint (.ppt, .pptx, .odp).
Ebooks: EPUB, MOBI, FB2, AZW3 through Calibre’s ebook-convert.
Other: DjVu via ddjvu, XPS via GhostXPS, and PDF itself, which is copied through unchanged.
Where it gets hard
Output size. Putting thirty 4 MB HEIC photos straight into a PDF gives you a 120 MB file that nobody can email. A combiner re-encodes the images as JPEG, drops the resolution to something sensible for A4, and optimizes the result with tools such as jpegoptim and mozjpeg.
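As a rough sketch of the downsampling step, the pixel box for an A4 page at a chosen DPI can be computed and passed to ImageMagick. The 150 DPI target, the JPEG quality of 80, and the function names are assumptions for illustration, not values taken from any particular combiner.

```python
A4_MM = (210, 297)  # A4 width and height in millimetres


def a4_pixel_box(dpi):
    """Pixel dimensions of a portrait A4 page at the given DPI."""
    return tuple(round(mm / 25.4 * dpi) for mm in A4_MM)


def downsample_argv(src, dst, dpi=150, quality=80):
    """Build a magick command that shrinks src to fit the A4 pixel box."""
    width, height = a4_pixel_box(dpi)
    # The trailing ">" tells ImageMagick to shrink only images larger than the box.
    return ["magick", src, "-resize", f"{width}x{height}>", "-quality", str(quality), dst]
```

At 150 DPI the box is 1240×1754 pixels, so a 4032×3024 capture loses most of its pixels before it is embedded.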
Page sizing. A 4032×3024 sensor capture at 96 DPI is 42×31.5 inches, far larger than A4 (8.27×11.69 inches). The converter has to compute a DPI that puts the photo on a single page without throwing away resolution.
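The DPI calculation is short enough to sketch directly. The function names are hypothetical, and margins are ignored for simplicity: the smallest DPI at which the whole image fits is the larger of the two per-axis ratios.

```python
A4_W_IN = 210 / 25.4  # A4 width in inches
A4_H_IN = 297 / 25.4  # A4 height in inches


def size_in_inches(width_px, height_px, dpi):
    """Physical size of an image when printed at the given DPI."""
    return width_px / dpi, height_px / dpi


def fit_dpi(width_px, height_px, page_w_in=A4_W_IN, page_h_in=A4_H_IN):
    """Smallest DPI at which the image fits the page with no pixels discarded."""
    return max(width_px / page_w_in, height_px / page_h_in)


def best_fit_dpi(width_px, height_px):
    """Try portrait and landscape A4; the lower DPI fills more of the page."""
    portrait = fit_dpi(width_px, height_px)
    landscape = fit_dpi(width_px, height_px, A4_H_IN, A4_W_IN)
    return min(portrait, landscape)
```

For the 4032×3024 capture this gives roughly 488 DPI on portrait A4 and roughly 366 DPI on landscape A4, so turning the page to landscape lets the photo fill more of it.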
Metadata. JPEG carries EXIF (capture date, GPS, camera). Office files carry author and revision history. The combiner either drops it or maps it into the PDF Info dictionary. Either choice is a policy decision, not a default.
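One way to make that policy explicit is a small mapping function. The policy names, the choice of which EXIF tags to carry over, and the decision never to copy GPS data are all assumptions for illustration; the date conversion follows the PDF date string format (D:YYYYMMDDHHmmSS).

```python
from datetime import datetime


def exif_to_pdf_date(exif_value):
    """Convert an EXIF timestamp such as '2023:05:01 12:34:56' to a PDF date string."""
    parsed = datetime.strptime(exif_value, "%Y:%m:%d %H:%M:%S")
    return parsed.strftime("D:%Y%m%d%H%M%S")


def map_metadata(exif, policy="drop"):
    """Return PDF Info dictionary entries derived from EXIF tags under a policy."""
    if policy == "drop":
        return {}
    if policy != "map":
        raise ValueError(f"unknown policy: {policy}")
    info = {}
    if "Artist" in exif:
        info["Author"] = exif["Artist"]
    if "DateTimeOriginal" in exif:
        info["CreationDate"] = exif_to_pdf_date(exif["DateTimeOriginal"])
    # GPS tags are deliberately not copied in this sketch.
    return info
```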
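Orientation. iPhones write pixels in sensor-landscape order and rely on an EXIF Orientation tag for display. A converter that ignores the tag renders every portrait shot rotated by 90 degrees. jhead -autorot is the standard fix for JPEG files; ImageMagick’s -auto-orient option applies the same correction during conversion.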
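To show what the tag encodes, here is a minimal sketch that maps the four non-mirrored EXIF Orientation values to the clockwise rotation needed for display, and works out the resulting page dimensions. The function name is hypothetical, and the mirrored values (2, 4, 5, 7) are left out.

```python
# EXIF Orientation value -> clockwise rotation (degrees) needed for correct display.
CLOCKWISE_DEGREES = {1: 0, 3: 180, 6: 90, 8: 270}


def display_size(width_px, height_px, orientation):
    """Return (width, height) of the image as it should appear on screen."""
    degrees = CLOCKWISE_DEGREES.get(orientation)
    if degrees is None:
        raise ValueError(f"unsupported or mirrored orientation: {orientation}")
    if degrees in (90, 270):
        return height_px, width_px  # quarter turns swap the axes
    return width_px, height_px
```

A portrait shot stored as 4032×3024 with Orientation 6 therefore needs a 3024×4032 page, which is why the tag has to be applied before page sizing.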
Transparency. PNG, HEIC, and WebP can carry alpha. PDF represents it with SMask soft masks, but the alpha survives only if the converter actually preserves it. A lazy pipeline turns transparent backgrounds black or white.
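The black-or-white outcome is what flattening produces: each pixel is composited over a fixed background and the alpha channel is discarded. A minimal per-pixel sketch, assuming 8-bit straight (non-premultiplied) RGBA and a hypothetical function name:

```python
def flatten_pixel(rgba, background=(255, 255, 255)):
    """Composite one RGBA pixel over an opaque background, dropping alpha."""
    red, green, blue, alpha_8bit = rgba
    alpha = alpha_8bit / 255
    return tuple(
        round(channel * alpha + bg * (1 - alpha))
        for channel, bg in zip((red, green, blue), background)
    )
```

A fully transparent pixel comes out as the background colour, so the choice between (255, 255, 255) and (0, 0, 0) decides whether the lost transparency shows up as white or black.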
Multi-page input. A single TIFF can hold a hundred scanned pages. An EPUB can hold a thousand. The combiner has to detect this and unpack each page individually before concatenation.
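For TIFF, detecting the page count means walking the chain of image file directories (IFDs) in the file. The sketch below does that for classic TIFF using only the standard library; the function name is hypothetical, BigTIFF is not handled, and it counts every IFD in the main chain, which may include reduced-resolution copies in some files.

```python
import struct


def count_tiff_pages(data):
    """Count IFDs in the main chain of a classic TIFF held in a bytes object."""
    if data[:2] == b"II":
        endian = "<"  # little-endian
    elif data[:2] == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled)")
    pages = 0
    seen = set()
    while offset != 0:
        if offset in seen:
            raise ValueError("IFD chain loops back on itself")
        seen.add(offset)
        (entry_count,) = struct.unpack_from(endian + "H", data, offset)
        # Each entry is 12 bytes; the offset of the next IFD follows the entries.
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * entry_count)
        pages += 1
    return pages
```

A combiner could call this on the raw upload to decide whether the converter output needs to be treated as a multi-page document.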
Office fidelity. LibreOffice is not Word. Fonts get substituted, line breaks shift, complex tables drift. Without a Word license on the server, this is unavoidable.