Docusaurus PDF Generator

Synopsis

When I have the time, I help out with the Meshtastic project. If you don't know what it is, stop reading this and check it out! The Meshtastic project documentation is served by Docusaurus. There was a request on the Meshtastic Discord server about the possibility of converting the documentation to a PDF handbook. I have built PDF generators for other projects, so I decided to take on the task.

Instead of reinventing the wheel and coding right away, I did a quick internet search for a project that fulfills my needs. I came across docusaurus-prince-pdf. It is a simple script based on Prince PDF that creates a map of any Docusaurus instance, scrapes the HTML and generates a PDF. After reviewing the code, I decided to fork the project and rewrite it to use wkhtmltopdf as the PDF generating engine instead. I don't like the fact that Prince PDF is a closed source project, so I decided to switch to something else, and wkhtmltopdf was the natural choice. The reasons were simple: I already have prior experience using it in different projects, and the project is open source and very well documented.

Docusaurus-Wkhtmltopdf

After refactoring 99% of the code, I integrated wkhtmltopdf as the PDF generator engine. I also added support for PDF file compression with the help of Ghostscript. I have published the project on GitHub, npm and Docker Hub:

  1. GitHub - Project git repo.
  2. npm - Project npm repo.
  3. Docker Hub - The complete project in a container. This includes node, wkhtmltopdf and ghostscript.

Command Help

Options:
      --version           Show version number                          [boolean]
  -u, --url               Base URL, should be the baseUrl of the Docusaurus inst
                          ance (e.g. https://docusaurus.io/docs/)       [string]
  -s, --selector          CSS selector to find the link of the next page[string]
  -d, --dest              Working directory. Default to ./pdf           [string]
  -f, --file              Change default list output filename           [string]
  -o, --output            Change PDF output filename                    [string]
      --include-index     Include / (passed URL) in generated PDF
      --prepend           Prepend additional pages, split with comma    [string]
      --append            Append additional pages, split with comma     [string]
      --wkhtmltopdf-args  Additional options for wkhtmltopdf            [string]
      --list-only         Fetch list without generating PDF
      --pdf-only          Generate PDF without fetching list. Ensure list exists
      --toc               Generate the PDF with a table of contents
      --compress          Compress the output file. REQUIRES ghostscript!
  -h, --help              Show help                                    [boolean]
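
The --list-only and --pdf-only options suggest a two-step workflow: fetch the page list first, review it, then build the PDF from it. The following is a minimal sketch of that idea, not the project's documented procedure. The function names and the D2P_CMD variable are hypothetical helpers, and it assumes that --pdf-only accepts the same -u and --dest values as the first step.

```shell
#!/bin/sh
# Command used to invoke the generator (hypothetical variable).
# Override it, for example with "echo", to do a dry run.
D2P_CMD="${D2P_CMD:-npx docusaurus-wkhtmltopdf}"

# Step 1: crawl the site and write only the page list.
# $1 = base URL, $2 = working directory
fetch_list() {
    $D2P_CMD -u "$1" --dest "$2" --list-only
}

# Step 2: build the PDF from the list written in step 1.
# Assumption: --pdf-only reads the list from the same working directory.
# $1 = base URL, $2 = working directory, $3 = output file name
build_pdf() {
    $D2P_CMD -u "$1" --dest "$2" --output "$3" --pdf-only --toc
}

# Usage:
#   fetch_list https://docusaurus.io/docs ./pdf
#   build_pdf  https://docusaurus.io/docs ./pdf docs.pdf
```

Splitting the run this way makes it possible to inspect or trim the list before the slower PDF step.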

Standard Usage

If you would prefer to skip installing node, wkhtmltopdf and ghostscript on your machine, see the Docker Usage section.

Before getting started, install wkhtmltopdf. Ghostscript is optional and only needed for the --compress option.
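
A quick way to confirm the prerequisites are in place is to check that each tool is on the PATH. This is a small sketch with hypothetical function names. It assumes the Ghostscript binary is called gs, which is common on Linux and macOS but may differ on other systems.

```shell
#!/bin/sh
# Return 0 if the given command is available on PATH, non-zero otherwise.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Print a status line per tool; return non-zero if a required tool is missing.
check_prerequisites() {
    missing=0
    for tool in node wkhtmltopdf; do
        if have_tool "$tool"; then
            echo "ok: $tool"
        else
            echo "missing (required): $tool"
            missing=1
        fi
    done
    # Ghostscript is optional; assumed binary name is "gs".
    if have_tool gs; then
        echo "ok: gs"
    else
        echo "missing (optional, needed for --compress): gs"
    fi
    return "$missing"
}

# Usage:
#   check_prerequisites || echo "Install the missing tools first."
```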

Run the following commands to generate the PDF:

# Generate PDF from specific site under `docs` scope
npx docusaurus-wkhtmltopdf -u https://docusaurus.io/docs

# Change generating scope to `/docs/xxx/`
npx docusaurus-wkhtmltopdf -u https://docusaurus.io/docs/xxx

# Custom working (output) directory
npx docusaurus-wkhtmltopdf -u https://docusaurus.io/docs --dest ./pdf-output

# Custom output file name
npx docusaurus-wkhtmltopdf -u https://docusaurus.io/docs --output docs.pdf

# Compress the PDF file
npx docusaurus-wkhtmltopdf -u https://docusaurus.io/docs --compress
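
For reference, a PDF can also be compressed by calling Ghostscript directly. The sketch below shows a typical pdfwrite invocation. It is not necessarily the exact set of options that --compress passes internally, and the compress_pdf function name and the default ebook preset are choices made for this illustration.

```shell
#!/bin/sh
# Re-encode a PDF with Ghostscript's pdfwrite device to reduce its size.
# $1 = input PDF, $2 = output PDF, $3 = quality preset (optional)
compress_pdf() {
    # Presets from smallest to largest output: screen, ebook, printer, prepress.
    preset="${3:-ebook}"
    gs -sDEVICE=pdfwrite \
       -dCompatibilityLevel=1.4 \
       -dPDFSETTINGS="/$preset" \
       -dNOPAUSE -dQUIET -dBATCH \
       -sOutputFile="$2" \
       "$1"
}

# Usage:
#   compress_pdf ./pdf/docs.pdf ./pdf/docs-small.pdf screen
```

Lower presets downsample images more aggressively, so it is worth comparing the output visually before settling on one.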

To generate a PDF from a local Docusaurus instance, you first need to build and serve the site locally:

# Build the site
yarn build

# Serve built site locally
yarn serve

# Generate PDF from local Docusaurus instance
npx docusaurus-wkhtmltopdf -u http://localhost:4000/docs # Change port to your serving port

See the help screen for more usage options:

npx docusaurus-wkhtmltopdf -h

Docker Usage

The Docker image is prepackaged as a complete working environment. There are a few things to note concerning mounting folders and files:

  1. The generated PDFs are saved to /d2p/pdf in the container.
  2. Custom CSS markup is saved in the print.css file in the project root.

Run the following commands to generate your desired PDF:

# Generate PDF from specific site under `docs` scope
docker run --rm -v /tmp/pdf:/d2p/pdf nuxnik/docusaurus-to-pdf -u https://docusaurus.io/docs

# Change generating scope to `/docs/xxx/`
docker run --rm -v /tmp/pdf:/d2p/pdf nuxnik/docusaurus-to-pdf -u https://docusaurus.io/docs/xxx

# Custom working (output) directory
docker run --rm -v /tmp/pdf:/d2p/pdf nuxnik/docusaurus-to-pdf -u https://docusaurus.io/docs --dest ./pdf-output

# Custom output file name
docker run --rm -v /tmp/pdf:/d2p/pdf nuxnik/docusaurus-to-pdf -u https://docusaurus.io/docs --output docs.pdf

# Compress the PDF file
docker run --rm -v /tmp/pdf:/d2p/pdf nuxnik/docusaurus-to-pdf -u https://docusaurus.io/docs --compress

# Add a rudimentary table of contents
docker run --rm -v /tmp/pdf:/d2p/pdf nuxnik/docusaurus-to-pdf -u https://docusaurus.io/docs --toc
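
The note about print.css above can be combined with the output mount. The sketch below writes a small stylesheet and prints a docker command that mounts it. It rests on one loud assumption: that the project root inside the container is /d2p, so the stylesheet is expected at /d2p/print.css. Check the image before relying on that path. The function names and the CSS selectors are illustrative only.

```shell
#!/bin/sh
# Write a small custom stylesheet to the given path.
# $1 = host path for print.css
write_print_css() {
    cat > "$1" <<'EOF'
/* Illustrative rules: hide navigation chrome and set a base font size. */
.navbar, .footer { display: none; }
body { font-size: 11pt; }
EOF
}

# Print (do not run) the docker command that mounts the stylesheet
# and the output folder.
# ASSUMPTION: the container's project root is /d2p.
# $1 = host path to print.css, $2 = host output folder, $3 = base URL
docker_cmd() {
    echo docker run --rm \
        -v "$1:/d2p/print.css" \
        -v "$2:/d2p/pdf" \
        nuxnik/docusaurus-to-pdf -u "$3"
}

# Usage:
#   write_print_css "$PWD/print.css"
#   docker_cmd "$PWD/print.css" /tmp/pdf https://docusaurus.io/docs
```

Printing the command first makes it easy to verify the mount paths before running it.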