Sean Hamlin
Mar 19, 2024
6 min read
Converting HTML to PDF is a frequent requirement when creating and managing digital content today. As web pages built in Drupal, Joomla, TYPO3, WordPress, and similar technologies grow more complex through advanced JavaScript frameworks and intricate CSS styling, producing accurate, high-quality PDFs from that HTML has become increasingly challenging.
Both Gotenberg and Lagoon are open source technologies offering powerful solutions to different modern web development challenges. Gotenberg provides a straightforward, Docker-powered PDF generation tool capable of rendering modern web pages, including those built with JavaScript and advanced CSS. Lagoon brings the capability to deploy containerized applications quickly, complementing Gotenberg by providing the ability to deploy to infrastructure geared for scalability and flexibility.
Developers of Drupal, Joomla, TYPO3, WordPress, and other open source applications, languages, and frameworks regularly leverage JavaScript tools like Angular, React, and Vue.js to create highly interactive and dynamic user experiences. Similarly, advanced CSS techniques and animations are frequently used to bring web designs to life, ensuring each web page has an engaging, visually appealing user experience. However, these advancements present a challenge for converting HTML to PDF.
Traditional methods of PDF generation often fall short when dealing with modern web pages. These methods typically struggle to execute JavaScript or correctly apply modern CSS, resulting in PDFs that are far from their HTML web counterparts. The heart of the issue lies in the need for a complete browser rendering engine capable of interpreting and executing JavaScript, applying CSS, and accurately laying out the page as seen in a web browser. Without this ability, the generated PDFs won’t accurately reflect the source HTML.
Furthermore, security considerations add another layer of complexity when converting HTML to PDF. Generating PDFs from web content often involves rendering third-party code on the server side, which can introduce security vulnerabilities if not handled carefully. Given the myriad of content modern web pages can contain, ensuring that the HTML-to-PDF conversion process is accurate and secure presents a significant challenge for developers and organizations.
Gotenberg is an open source project that simplifies the process of converting HTML to PDF. It achieves this by harnessing a complete browser rendering engine, which ensures that JavaScript is executed, CSS is applied correctly, and the layout is preserved accurately, mirroring the web page's appearance in a browser.
Gotenberg ships as a ready-to-use Docker image, which makes it both scalable and easy to deploy on amazee.io. Containerizing the HTML-to-PDF conversion process makes it more straightforward to integrate into existing workflows and also enhances the security of generating PDFs.
Lagoon complements Gotenberg's capabilities by offering a robust mechanism for deploying containerized applications with ease and flexibility.
By leveraging the full browser rendering capabilities of Gotenberg within the flexible, secure hosting environment provided by amazee.io and Lagoon, developers can generate high-quality PDFs from HTML in a scalable, reliable, and manageable way.
The integration of Gotenberg and Lagoon is a testament to the power of open source solutions in solving complex web development challenges.
The integration of Gotenberg with Lagoon begins with creating a Docker container for Gotenberg. This container can then be deployed within a Lagoon environment configured to handle container orchestration and management. This approach's primary advantage is its ease of deployment and scalability. Lagoon is designed to support Docker containers natively, ensuring that Gotenberg can be seamlessly integrated into any web development project managed through Lagoon.
A typical `docker-compose.yml` file for deploying Gotenberg in a Lagoon project might include something like this:
  gotenberg:
    image: gotenberg/gotenberg:8
    ports:
      - "3000" # Find port on host with `docker-compose port gotenberg 3000`
    labels:
      lagoon.type: basic
      lagoon.autogeneratedroute: false
      lando.type: basic
    environment:
      << : *default-environment # loads the defined environment variables from the top
Note that you can control whether or not Gotenberg is exposed to the internet with the label `lagoon.autogeneratedroute: false`.
This configuration defines a Gotenberg service that uses the `gotenberg/gotenberg:8` Docker image. Its `environment` block merges in the project's shared default variables through the `*default-environment` YAML anchor. The snippet sets no Gotenberg-specific options itself, so any customization of the Gotenberg instance would be added on top of it.
Security considerations are essential when integrating HTML to PDF processes into web applications, especially considering the potential risks of executing JavaScript and rendering HTML from potentially untrusted sources. By deploying Gotenberg as a Docker container within Lagoon, several security best practices are inherently applied:
Isolation: Docker containers provide a level of isolation, ensuring that the HTML to PDF process is sandboxed away from other parts of the application.
Limited Permissions: The container can be configured to run with limited permissions, reducing the risk of unauthorized access or malicious exploitation.
Internal Networking: Using Docker’s internal networking capabilities, Gotenberg can communicate with other services within its Lagoon project namespace without exposing itself to the outside world. This reduces the surface area for potential attacks.
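Container-level isolation can be paired with input checks in the calling application. The following is a minimal sketch, not part of the showcase code, of one way to reject obviously risky URLs before they are handed to a server-side renderer. The function name `is_safe_target` and the exact rules are assumptions chosen for illustration; a real deployment would likely also resolve hostnames and check the resulting addresses.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumption for this sketch: only plain web URLs should ever be rendered.
ALLOWED_SCHEMES = {"http", "https"}


def is_safe_target(url: str) -> bool:
    """Return True if the URL passes a few basic checks before rendering."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed input, for example a broken IPv6 literal.
        return False

    host = parts.hostname
    if parts.scheme not in ALLOWED_SCHEMES or not host:
        return False

    if host == "localhost" or host.endswith(".localhost"):
        return False

    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. This sketch accepts hostnames as-is and does
        # not perform DNS resolution.
        return True

    # Reject loopback, private, link-local, and other non-public addresses.
    return address.is_global
```

With these rules, `is_safe_target("https://example.com/page")` is accepted, while `file:///etc/passwd` and `http://10.0.0.5/` are rejected.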
Additionally, Lagoon’s deployment environment is designed with security in mind, offering features like security patching of base Docker images and compliance with security best practices. When Gotenberg is deployed within this environment, it benefits from these security measures.
Integrating Gotenberg with Lagoon for HTML to PDF conversion means web applications can safely convert dynamic, complex HTML to PDF. This setup supports secure, internal service communication, where Gotenberg receives HTML content from the application, processes it, and returns the generated PDF. This process is streamlined and secure, thanks to the infrastructure and security measures provided by Lagoon.
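The request and response cycle described here can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not the showcase implementation. It assumes the service is reachable inside the project at `http://gotenberg:3000` (the service name and port from the compose snippet) and uses Gotenberg's `/forms/chromium/convert/html` route, which expects a multipart form containing a file named `index.html`. The helper names `build_html_request` and `html_to_pdf` are made up for this example.

```python
import urllib.request
import uuid

# Assumption: the service name and port from docker-compose.yml resolve
# inside the project's internal network.
GOTENBERG_URL = "http://gotenberg:3000"


def build_html_request(html: str, base_url: str = GOTENBERG_URL) -> urllib.request.Request:
    """Build a multipart POST that uploads the HTML as a file named index.html."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="files"; filename="index.html"\r\n'
        "Content-Type: text/html\r\n"
        "\r\n"
        f"{html}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/forms/chromium/convert/html",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def html_to_pdf(html: str) -> bytes:
    """Send the HTML to Gotenberg and return the PDF bytes from the response."""
    with urllib.request.urlopen(build_html_request(html), timeout=60) as response:
        return response.read()
```

Keeping request construction separate from the network call makes the form layout easy to inspect without a running Gotenberg instance.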
To showcase this, a simple Node.js frontend was written to interact with the Gotenberg service, which you can find here: https://node.master.gotenberg-showcase.au2.amazee.io
Enter any valid URL, press submit, and the service returns a PDF rendering of that page's HTML.
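For readers who want to script the same URL-to-PDF step, here is a hedged Python sketch. The showcase frontend itself is written in Node.js, so this is not its code. It assumes the internal address `http://gotenberg:3000` and uses Gotenberg's `/forms/chromium/convert/url` route, which takes the page address in a multipart form field named `url`. The helper names `build_url_request` and `save_url_as_pdf` are hypothetical.

```python
import urllib.request
import uuid


def build_url_request(
    target_url: str, base_url: str = "http://gotenberg:3000"
) -> urllib.request.Request:
    """Build a multipart POST asking Gotenberg to render target_url."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="url"\r\n'
        "\r\n"
        f"{target_url}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/forms/chromium/convert/url",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def save_url_as_pdf(target_url: str, output_path: str) -> None:
    """Fetch the rendered PDF and write it to output_path."""
    with urllib.request.urlopen(build_url_request(target_url), timeout=60) as response:
        pdf_bytes = response.read()
    with open(output_path, "wb") as handle:
        handle.write(pdf_bytes)
```

Inside the project network, a call such as `save_url_as_pdf("https://example.com", "page.pdf")` would write the rendered page to disk, provided the Gotenberg service is running and reachable.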
The complete code for the example can be found at https://github.com/seanhamlin/gotenberg-showcase.
Integrating these two open source projects represents an advancement in how teams can leverage HTML to PDF conversion as a service for their other web applications and websites. At amazee.io, customers use this approach to power emails with PDF attachments and “Download this page as PDF” functionality.
Converting HTML to PDF with Gotenberg and Lagoon addresses a common yet complex challenge for software such as Drupal, Joomla, TYPO3, WordPress, and other similar technologies with an elegant and powerful solution. Developers and organizations are encouraged to explore the benefits of this integration, leveraging its capabilities to navigate the complexities of modern web applications and document generation with confidence and ease.
amazee.io is not just a hosting solution but a strategic digital enabler. We empower enterprises to sculpt their digital infrastructure according to their unique visions and operational needs - including tricky or complex issues such as HTML to PDF conversion. We help our clients unlock these benefits without operating their own platform or building/training their Platform Operations team. By working with our 24/7 team of experts, you enjoy the benefits of 24/7 infrastructure without additional headcount.
Interested in learning more about leveraging containers for covering advanced hosting use cases? Contact us today!