pdf python tools


When you need to visualize content extracted from a PDF—such as bounding boxes or detected elements—one practical approach is to render each page as an image and draw your overlays on top of it. PyMuPDF handles the PDF rendering, while Pillow (PIL) provides simple image-drawing operations. This workflow keeps the PDF untouched and focuses only on producing annotated images.

This post shows how to render pages using PyMuPDF, convert them into PIL images, and draw rectangles and text using normalized coordinates.

Rendering a page to an image

PyMuPDF can render a page at any resolution. A zoom factor, passed in as a Matrix, controls how many pixels the output image has:

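import fitz  # PyMuPDF
from PIL import Image

doc = fitz.open("input.pdf")

page = doc[0]  # first page
zoom = 2  # scale factor applied to both axes
mat = fitz.Matrix(zoom, zoom)
# alpha=False yields 3 bytes per pixel, matching the "RGB" mode used below
pix = page.get_pixmap(matrix=mat, alpha=False)

img = Image.frombytes("RGB", (pix.width, pix.height), pix.samples)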

The Matrix scales PyMuPDF's default 72 dpi rendering, so a zoom of 2 produces a 144 dpi image, which is usually sharp enough for overlays.
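
To choose a zoom for a target resolution, divide the desired dpi by 72. The sketch below is only an illustration: `zoom_for_dpi` and `approx_pixel_size` are hypothetical helpers, not PyMuPDF functions, and the pixel size is an estimate because the library rounds page bounds to whole pixels.

```python
BASE_DPI = 72  # PyMuPDF's default rendering resolution (1 point = 1/72 inch)


def zoom_for_dpi(dpi):
    """Return the zoom factor that renders a page at the requested dpi."""
    return dpi / BASE_DPI


def approx_pixel_size(width_pt, height_pt, zoom):
    """Estimate the rendered image size for a page measured in points."""
    return round(width_pt * zoom), round(height_pt * zoom)


# US Letter is 612 x 792 points
print(zoom_for_dpi(144))               # 2.0
print(approx_pixel_size(612, 792, 2))  # (1224, 1584)
```

Recent PyMuPDF releases also accept a `dpi` argument on `get_pixmap`, which may make the manual conversion unnecessary; check the version you have installed.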

Drawing rectangles using normalized coordinates

If the coordinates are stored as normalized values (between 0 and 1), you can convert them to pixel coordinates by multiplying by the image width and height. Pillow’s ImageDraw makes drawing straightforward:

from PIL import ImageDraw

rectangles = [
    {"top_left_x": 0.1, "top_left_y": 0.1, "bottom_right_x": 0.4, "bottom_right_y": 0.3},
]

draw = ImageDraw.Draw(img)

for r in rectangles:
    tl_x = r["top_left_x"] * pix.width
    tl_y = r["top_left_y"] * pix.height
    br_x = r["bottom_right_x"] * pix.width
    br_y = r["bottom_right_y"] * pix.height

    draw.rectangle([tl_x, tl_y, br_x, br_y], outline="red", width=3)

The coordinates remain resolution-independent, which is useful when generating images at different zoom levels.
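
The conversion can be wrapped in a small helper so that the same box works at any zoom. This is a minimal sketch: `to_pixels` is a hypothetical name, and the clamping and corner sorting are extra safeguards that the post itself does not require.

```python
def to_pixels(box, width, height):
    """Convert a normalized box dict to integer pixel coordinates (x0, y0, x1, y1)."""

    def clamp(value):
        # Keep out-of-range values inside the image
        return min(max(value, 0.0), 1.0)

    x0 = clamp(box["top_left_x"]) * width
    y0 = clamp(box["top_left_y"]) * height
    x1 = clamp(box["bottom_right_x"]) * width
    y1 = clamp(box["bottom_right_y"]) * height

    # Sort each pair so x0 <= x1 and y0 <= y1, even if the corners were swapped
    x0, x1 = sorted((x0, x1))
    y0, y1 = sorted((y0, y1))
    return round(x0), round(y0), round(x1), round(y1)


box = {"top_left_x": 0.1, "top_left_y": 0.1, "bottom_right_x": 0.4, "bottom_right_y": 0.3}
print(to_pixels(box, 612, 792))    # (61, 79, 245, 238)
print(to_pixels(box, 1224, 1584))  # (122, 158, 490, 475)
```

The returned tuple can be passed straight to `draw.rectangle`, using the rendered image's width and height as the last two arguments.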

Adding text with a configurable font size

Pillow supports loading external TTF fonts and drawing text anywhere on the image. Font sizes are specified in pixels:

from PIL import ImageFont

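# Raises OSError if the font file cannot be found on this system
font = ImageFont.truetype("arial.ttf", 18)

# Place the label 30 px above the box, clamped so it stays inside the image
draw.text((tl_x, max(0, tl_y - 30)), "Example", fill="red", font=font)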

If you skip the custom font, Pillow falls back to its built-in default. In older Pillow versions this is a small fixed-size bitmap font that cannot be scaled; since Pillow 10.1, ImageFont.load_default() accepts a size argument.
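
Because "arial.ttf" is not installed on every system, it can help to try a few font files before falling back to the default. The following is a sketch under assumptions: `load_font` is a hypothetical helper, and the candidate file names are only examples that may or may not exist on your machine.

```python
from PIL import ImageFont


def load_font(size, candidates=("arial.ttf", "DejaVuSans.ttf")):
    """Return the first loadable TrueType font, else Pillow's default font."""
    for name in candidates:
        try:
            return ImageFont.truetype(name, size)
        except OSError:
            continue  # font file not found or unreadable; try the next one
    try:
        # The size argument is only accepted by newer Pillow releases
        return ImageFont.load_default(size=size)
    except TypeError:
        # Older releases: fixed-size bitmap font
        return ImageFont.load_default()


font = load_font(18)
```

The returned object can be passed as the `font` argument of `draw.text` in the same way as a font loaded directly with `ImageFont.truetype`.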

Putting everything together

Here is a full example that renders all pages and adds overlays to each image:

import fitz
from PIL import Image, ImageDraw, ImageFont

doc = fitz.open("input.pdf")

overlays = {
    0: [
        {"top_left_x": 0.1, "top_left_y": 0.1, "bottom_right_x": 0.4, "bottom_right_y": 0.3},
    ],
}

font = ImageFont.truetype("arial.ttf", 24)

zoom = 2
mat = fitz.Matrix(zoom, zoom)  # same scale for every page

for page_index, page in enumerate(doc):
    pix = page.get_pixmap(matrix=mat)

    img = Image.frombytes("RGB", (pix.width, pix.height), pix.samples)
    draw = ImageDraw.Draw(img)

    # Pages without an entry in overlays are saved unannotated
    rects = overlays.get(page_index, [])
    for r in rects:
        # Scale normalized coordinates to this page's pixel size
        tl_x = r["top_left_x"] * pix.width
        tl_y = r["top_left_y"] * pix.height
        br_x = r["bottom_right_x"] * pix.width
        br_y = r["bottom_right_y"] * pix.height

        draw.rectangle([tl_x, tl_y, br_x, br_y], outline="red", width=3)
        # Clamp the label so it stays visible when the box touches the top edge
        draw.text((tl_x, max(0, tl_y - 30)), "Overlay", fill="red", font=font)

    img.save(f"page_{page_index + 1}.png")

This produces a set of annotated PNG files, one per page.
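
Extraction or detection pipelines often emit a flat list of records rather than a dictionary keyed by page. Assuming each record carries a zero-based "page" key (an assumption made for this sketch, as is the helper name `group_by_page`), the list can be grouped into the `overlays` structure used above:

```python
from collections import defaultdict


def group_by_page(detections):
    """Group flat detection records into {page_index: [box, ...]}."""
    grouped = defaultdict(list)
    for det in detections:
        # Drop the page key so each box only holds its coordinates
        box = {key: value for key, value in det.items() if key != "page"}
        grouped[det["page"]].append(box)
    return dict(grouped)


detections = [
    {"page": 0, "top_left_x": 0.1, "top_left_y": 0.1, "bottom_right_x": 0.4, "bottom_right_y": 0.3},
    {"page": 2, "top_left_x": 0.5, "top_left_y": 0.2, "bottom_right_x": 0.9, "bottom_right_y": 0.6},
    {"page": 0, "top_left_x": 0.2, "top_left_y": 0.5, "bottom_right_x": 0.7, "bottom_right_y": 0.8},
]

overlays = group_by_page(detections)
print(sorted(overlays))   # [0, 2]
print(len(overlays[0]))   # 2
```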

Why use this method?

Rendering pages and drawing overlays on images is useful when:

you only need output images, not modified PDFs,
you want predictable visual results without worrying about PDF font embedding,
you are debugging extraction or OCR pipelines,
you want to overlay machine-learning detections.

When editable vector annotations are required, adding shapes directly to the PDF is the better route. But for visualization and inspection, combining PyMuPDF and Pillow keeps the implementation simple and flexible.


🐥 Rendering PDF pages and adding overlays using PyMuPDF and PIL

November 18, 2025
