How can I add a transparent watermark to a PDF using Python?

July 11, 2023, 5:39 a.m.
Python, a high-level, interpreted programming language, is known for its powerful libraries that have been created to facilitate complex processes. The language’s robustness and simplicity have made it popular among software engineers, mathematicians, data analysts, and web developers alike.
One of the common tasks across these professions is working with files. In many scenarios, you might need to work with Portable Document Format (PDF) files, for example to add watermarks. Developed by Adobe, PDF is a file format used to present and exchange documents reliably, independent of software, hardware, or operating system.
Watermarking is an important feature commonly used to protect and claim ownership of documents or to denote specific states of the documents like "Confidential", "Draft", etc. In this article, we'll guide you through a step-by-step process on how to add a transparent watermark to a PDF document using Python.

Installing necessary packages

Before we start with our Python code, we need to make sure that a few packages are installed in our Python environment. We're going to use PyPDF2, a library for splitting, merging, and transforming the pages of a PDF. We also need Pillow, the maintained fork of PIL, which the code below uses to draw the watermark image. Both packages can be installed with pip, the package manager for Python.

To install these packages, use the following commands:

```bash
pip install PyPDF2
pip install Pillow
```
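As a quick sanity check after installing, the sketch below reports which versions are present. It relies only on the standard library (`importlib.metadata`, Python 3.8 or later). The helper name `installed_version` is made up for this illustration, and the package list assumes the code in this article, which imports from PyPDF2 and from PIL (provided by the Pillow package).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report on the two packages the later examples import from
for name in ("PyPDF2", "Pillow"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```

If either line reports "not installed", rerun the matching pip command in the same environment that runs your script.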

Creating a transparent watermark

First, we need to create a transparent watermark. For that we use Pillow, the maintained fork of PIL (Python Imaging Library), which supports a wide variety of image formats such as JPEG and PNG. PNG is the one we want here because it can store an alpha (transparency) channel. Alternatively, the PyMuPDF library, a Python binding to the MuPDF library, can be used to superimpose pages.

To create a transparent watermark:

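```python
from PIL import Image, ImageDraw

# Create a fully transparent RGBA canvas (the fourth value is alpha; 0 = invisible)
img = Image.new('RGBA', (100, 30), color=(73, 109, 137, 0))

# Get a drawing context and place the semi-transparent text (alpha 128 of 255)
d = ImageDraw.Draw(img)
d.text((10, 10), "Hello, World", fill=(255, 255, 0, 128))

# Save as PNG, which preserves the alpha channel
img.save('watermark.png')
```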
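For a watermark such as "Draft" or "Confidential", you will usually want larger, rotated text with adjustable opacity. The following is a minimal sketch of one way to do that with Pillow. The function name `make_watermark` and its parameters are assumptions made for this illustration, not part of any library, and the text is drawn with Pillow's default font.

```python
from PIL import Image, ImageDraw


def make_watermark(text, size=(300, 100), opacity=0.3, angle=30):
    """Return an RGBA image with semi-transparent text on a clear background.

    opacity is a fraction between 0 (invisible) and 1 (fully opaque).
    """
    alpha = round(255 * opacity)
    # Alpha 0 everywhere: the canvas itself is fully transparent
    canvas = Image.new("RGBA", size, (255, 255, 255, 0))
    draw = ImageDraw.Draw(canvas)
    # Grey text whose alpha channel carries the requested opacity
    draw.text((10, size[1] // 2), text, fill=(128, 128, 128, alpha))
    # expand=True grows the canvas so the rotated text is not clipped
    return canvas.rotate(angle, expand=True)


# Example use (writes a file, so it is left commented out):
# make_watermark("DRAFT", opacity=0.5).save("watermark.png")
```

Lower `opacity` values keep the underlying page easier to read, and `angle` is given in degrees counter-clockwise.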

Merging the PDF with the watermark

Once the watermark is ready, we need to incorporate it into our PDF. PyPDF2 can only read PDF files, so we first convert the PNG into a one-page PDF with Pillow. We then use the 'merge_page()' method to stamp the watermark onto each page of the original PDF, and the 'PdfWriter' class to write the result to a new file. The code below uses the PyPDF2 3.x API; older releases named these methods 'mergePage()', 'getPage()', and 'addPage()'.

```python
# Import the image library plus the reader and writer classes from PyPDF2
from PIL import Image
from PyPDF2 import PdfReader, PdfWriter

# PdfReader cannot parse a PNG, so convert the watermark to a one-page PDF first
# (saving an image with an alpha channel as PDF requires a recent Pillow release)
Image.open('watermark.png').save('watermark.pdf')

# Create reader objects for the original document and the watermark
reader1 = PdfReader("original.pdf")
reader2 = PdfReader("watermark.pdf")

# Create a writer object for the output PDF
pdf_writer = PdfWriter()

# Iterate through the pages, stamping the watermark onto each one
for page in reader1.pages:
    page.merge_page(reader2.pages[0])
    pdf_writer.add_page(page)

# Write the result to a new PDF
with open("watermarked.pdf", "wb") as output_pdf:
    pdf_writer.write(output_pdf)
```

This script will loop through all pages in the PDF file, apply the watermark, and save the watermarked pages to a new PDF.
Please note that these transformations (merging, rotating, scaling, or translating pages) do not modify the original PDF file. The changes exist only in memory until they are written to the new output file.
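One way to confirm that the original file is left untouched is to compare its checksum before and after running the script. This is a small standard-library sketch; the helper name `file_sha256` is an assumption made for this illustration.

```python
import hashlib


def file_sha256(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large PDFs do not have to fit in memory at once
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example use (file name is a placeholder):
# before = file_sha256("original.pdf")
# ... run the watermarking script ...
# assert file_sha256("original.pdf") == before
```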
In conclusion, Python's myriad of libraries makes it a suitable language for a wide range of tasks, including but not limited to adding watermarks to PDFs. By understanding the utility of each library and the specific methods, functions, and classes they contain, you can use Python to complete this task efficiently and effectively.
