When you need to visualize content extracted from a PDF—such as bounding boxes or detected elements—one practical approach is to render each page as an image and draw your overlays on top of it. PyMuPDF handles the PDF rendering, while Pillow (PIL) provides simple image-drawing operations. This workflow keeps the PDF untouched and focuses only on producing annotated images.
This post shows how to render pages using PyMuPDF, convert them into PIL images, and draw rectangles and text using normalized coordinates.
Rendering a page to an image
PyMuPDF can render a page at any resolution. A zoom factor is used to control the output quality:
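import fitz  # PyMuPDF
from PIL import Image

doc = fitz.open("input.pdf")
page = doc[0]  # first page

zoom = 2  # scale factor relative to the page's native size
mat = fitz.Matrix(zoom, zoom)
# alpha=False keeps the buffer as plain RGB, matching the mode used below
pix = page.get_pixmap(matrix=mat, alpha=False)

# Wrap the raw pixel buffer in a PIL image
img = Image.frombytes("RGB", (pix.width, pix.height), pix.samples)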
The matrix scales the page as it is rendered, so it sets the output resolution. A zoom of 2 doubles the pixel width and height, which is usually sharp enough for overlays.
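If you think in DPI rather than zoom, the conversion is simple: PDF page sizes are measured in points (72 per inch), and a zoom of 1 corresponds to 72 DPI. The sketch below shows that relationship; the helper names `zoom_for_dpi` and `rendered_size` are illustrative and not part of PyMuPDF, and the pixel size is an approximation of what the renderer produces.

```python
PDF_POINTS_PER_INCH = 72  # PDF user space: 1 point = 1/72 inch


def zoom_for_dpi(dpi: float) -> float:
    """Return the zoom factor that renders a page at roughly `dpi`."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return dpi / PDF_POINTS_PER_INCH


def rendered_size(width_pt: float, height_pt: float, zoom: float) -> tuple[int, int]:
    """Approximate pixel size of a page whose size is given in points."""
    return round(width_pt * zoom), round(height_pt * zoom)


zoom = zoom_for_dpi(144)               # same as the zoom of 2 used above
size = rendered_size(595, 842, zoom)   # roughly an A4 page
```

The result can be passed straight to `fitz.Matrix(zoom, zoom)`. Recent PyMuPDF releases also accept a `dpi` argument to `get_pixmap`, which may be more convenient if your version supports it.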
Drawing rectangles using normalized coordinates
If the coordinates are stored as normalized values (between 0 and 1), you can convert them to pixel coordinates by multiplying by the image width and height. Pillow’s ImageDraw makes drawing straightforward:
from PIL import ImageDraw

rectangles = [
    {"top_left_x": 0.1, "top_left_y": 0.1, "bottom_right_x": 0.4, "bottom_right_y": 0.3},
]

draw = ImageDraw.Draw(img)
for r in rectangles:
    # Scale normalized values (0 to 1) to pixel coordinates
    tl_x = r["top_left_x"] * pix.width
    tl_y = r["top_left_y"] * pix.height
    br_x = r["bottom_right_x"] * pix.width
    br_y = r["bottom_right_y"] * pix.height
    draw.rectangle([tl_x, tl_y, br_x, br_y], outline="red", width=3)
The coordinates remain resolution-independent, which is useful when generating images at different zoom levels.
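Detections from an upstream model are not always tidy: values can fall slightly outside the 0 to 1 range, or the two corners can arrive swapped, and recent Pillow versions are strict about corner order in `rectangle`. A minimal sketch of a conversion helper that guards against both is shown here. The function name `to_pixel_box` is hypothetical, and clamping is only one possible policy (you may prefer to reject bad input instead).

```python
def to_pixel_box(rect: dict, width: int, height: int) -> tuple[int, int, int, int]:
    """Convert a normalized rectangle to an (x0, y0, x1, y1) pixel box."""

    def clamp(value: float) -> float:
        # Keep the value inside the normalized range
        return min(max(value, 0.0), 1.0)

    # Sorting guarantees x0 <= x1 and y0 <= y1 even if the corners are swapped
    xs = sorted((clamp(rect["top_left_x"]), clamp(rect["bottom_right_x"])))
    ys = sorted((clamp(rect["top_left_y"]), clamp(rect["bottom_right_y"])))
    return (
        round(xs[0] * width),
        round(ys[0] * height),
        round(xs[1] * width),
        round(ys[1] * height),
    )


rect = {"top_left_x": 0.1, "top_left_y": 0.1, "bottom_right_x": 0.4, "bottom_right_y": 0.3}
box_small = to_pixel_box(rect, 600, 800)    # (60, 80, 240, 240)
box_large = to_pixel_box(rect, 1200, 1600)  # (120, 160, 480, 480)
```

The same normalized rectangle yields proportionally larger pixel boxes as the image grows, so the overlay lands on the same region of the page at any zoom.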
Adding text with a configurable font size
Pillow supports loading external TTF fonts and drawing text anywhere on the image. Font sizes are specified in pixels:
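from PIL import ImageFont

# "arial.ttf" must be resolvable on your system; otherwise pass a full path
font = ImageFont.truetype("arial.ttf", 18)
# Place the label 30 px above the rectangle's top-left corner
draw.text((tl_x, tl_y - 30), "Example", fill="red", font=font)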
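If you skip the custom font, Pillow uses its built-in default. In older Pillow releases this is a small, fixed-size bitmap font that cannot be scaled. Pillow 10.1 and later accept a size argument in ImageFont.load_default() when FreeType support is available, but loading an explicit TTF file still gives the most predictable result.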
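One thing to watch with a fixed offset such as `tl_y - 30` is that a rectangle near the top of the page pushes the label off the image, where it is silently clipped. The following sketch picks a position above the box when there is room and below it otherwise. The helper `label_position` and its `gap` parameter are assumptions made for illustration, not Pillow features.

```python
def label_position(box, label_height, image_height, gap=4):
    """Return (x, y) for a label above `box`, or below it if there is no room."""
    x0, y0, x1, y1 = box
    y = y0 - label_height - gap
    if y < 0:
        # Not enough room above: place the label just under the bottom edge,
        # but keep it inside the image
        y = min(y1 + gap, image_height - label_height)
    return x0, max(y, 0)


pos_above = label_position((120, 160, 480, 480), 24, 1600)
pos_below = label_position((50, 10, 200, 100), 24, 1600)
```

For `label_height`, the font size is a workable approximation. If you need the exact extent of a string, `draw.textbbox((0, 0), text, font=font)` in Pillow 8.0 or later returns its bounding box.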
Putting everything together
Here is a full example that renders all pages and adds overlays to each image:
import fitz  # PyMuPDF
from PIL import Image, ImageDraw, ImageFont

doc = fitz.open("input.pdf")

# Normalized rectangles keyed by zero-based page index
overlays = {
    0: [
        {"top_left_x": 0.1, "top_left_y": 0.1, "bottom_right_x": 0.4, "bottom_right_y": 0.3},
    ],
}

font = ImageFont.truetype("arial.ttf", 24)
zoom = 2
mat = fitz.Matrix(zoom, zoom)

for page_index, page in enumerate(doc):
    # Render the page and wrap the RGB buffer in a PIL image
    pix = page.get_pixmap(matrix=mat, alpha=False)
    img = Image.frombytes("RGB", (pix.width, pix.height), pix.samples)
    draw = ImageDraw.Draw(img)

    # Pages without an entry in overlays are saved unannotated
    rects = overlays.get(page_index, [])
    for r in rects:
        tl_x = r["top_left_x"] * pix.width
        tl_y = r["top_left_y"] * pix.height
        br_x = r["bottom_right_x"] * pix.width
        br_y = r["bottom_right_y"] * pix.height
        draw.rectangle([tl_x, tl_y, br_x, br_y], outline="red", width=3)
        draw.text((tl_x, tl_y - 30), "Overlay", fill="red", font=font)

    # File names are one-based: page_1.png, page_2.png, ...
    img.save(f"page_{page_index + 1}.png")
This produces a set of annotated PNG files, one per page.
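Note that the outline width and font size in the example are fixed pixel values, so unlike the rectangle coordinates they do not follow the zoom: at a higher zoom the same values look thinner and smaller relative to the page. A rough sketch of one way to keep them proportional is given below. The function `scaled_style` and its base values (sizes as they should appear at zoom 1) are hypothetical choices, not something the libraries prescribe.

```python
def scaled_style(zoom: float, base_outline: float = 1.5, base_font_size: float = 12) -> dict:
    """Scale pixel-based style values defined at zoom 1 to the given zoom."""
    if zoom <= 0:
        raise ValueError("zoom must be positive")
    return {
        # Never go below 1 px, or the overlay would disappear
        "outline_width": max(1, round(base_outline * zoom)),
        "font_size": max(1, round(base_font_size * zoom)),
    }


style = scaled_style(4)
```

The returned values can then be used as `ImageFont.truetype("arial.ttf", style["font_size"])` and `width=style["outline_width"]` in the drawing calls.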
Why use this method?
Rendering pages and drawing overlays on images is useful when:
- you only need output images, not modified PDFs,
- you want predictable visual results without worrying about PDF font embedding,
- you are debugging extraction or OCR pipelines,
- you want to overlay machine-learning detections.
When editable vector annotations are required, adding shapes directly to the PDF is the better route. But for visualization and inspection, combining PyMuPDF and Pillow keeps the implementation simple and flexible.