
Automating Web Page Screenshots in Python: 5 Proven Patterns

Learn web page screenshot automation in Python with working code examples for Playwright, Selenium, and HTTP-based APIs—plus when to use each.

2026-05-27 · 5 min read

Capturing screenshots of web pages programmatically sounds simple until you actually try to do it at scale. Fonts don't render, JavaScript hasn't finished executing, cookie banners cover the content, and your headless Chrome process starts eating 2GB of RAM on a t3.micro. If you've hit any of these walls, you already know that web page screenshot automation in Python is less about taking the picture and more about controlling everything around it.

This post walks through five practical patterns I've used in production—what they're good for, where they break, and the Python code to make each work.

Pattern 1: Playwright for Full Control

Playwright is the strongest option when you need to interact with the page before capturing it—logging in, dismissing modals, scrolling, waiting for specific elements. It supports Chromium, Firefox, and WebKit, and the Python API is well-maintained by Microsoft.

Install it:

pip install playwright
playwright install chromium

A minimal full-page screenshot:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page(viewport={"width": 1440, "height": 900})
    page.goto("https://example.com", wait_until="networkidle")
    page.screenshot(path="out.png", full_page=True)
    browser.close()

When to reach for Playwright

You need to authenticate or fill forms before capturing
You want to wait for specific selectors, not just network idle
You're testing visual regressions and need pixel-perfect determinism
You control the infrastructure and can manage browser processes
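
Waiting for a specific selector rather than network idle, as mentioned in the list above, might look like the following sketch. The helper name capture_when_ready, the run wrapper, and the default timeout are illustrative assumptions; page.goto, page.wait_for_selector, and page.screenshot are the usual sync-API calls.

```python
def capture_when_ready(page, url, selector, path, timeout_ms=15_000):
    """Load url, wait until `selector` is visible, then save a full-page PNG.

    `page` is a Playwright sync-API Page. The function name and the default
    timeout are illustrative, not part of Playwright.
    """
    # "domcontentloaded" returns early; the selector wait below is the real gate.
    page.goto(url, wait_until="domcontentloaded")
    page.wait_for_selector(selector, state="visible", timeout=timeout_ms)
    page.screenshot(path=path, full_page=True)
    return path


def run():
    # Imported lazily so the helper above can be exercised without a browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": 1440, "height": 900})
        # "h1" stands in for whatever element marks "content is ready" on your page.
        capture_when_ready(page, "https://example.com", "h1", "out.png")
        browser.close()
```

Calling run() performs the capture. Picking a selector that only appears once the real content has rendered tends to be more reliable than a fixed sleep.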

What goes wrong

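Memory. Every Chromium instance is heavy, and if you launch one inside each Flask request handler you can run out of memory quickly under load. Move captures to a worker queue (Celery, RQ, or arq), keep one browser process alive per worker, and open a fresh browser context per job instead of relaunching the browser.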
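
As a rough sketch of that reuse pattern, the function below takes a browser that was launched once and opens a short-lived context for each URL. The name capture_batch and the shot_0000.png naming scheme are assumptions made for illustration; browser.new_context, context.new_page, and context.close are standard Playwright calls.

```python
from pathlib import Path


def capture_batch(browser, urls, out_dir):
    """Capture each URL with one shared browser and a fresh context per URL.

    `browser` is a Playwright sync-API Browser that the caller launched once.
    Returns the list of paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, url in enumerate(urls):
        # A context is an isolated session (cookies, storage) inside the browser,
        # far cheaper to create than a new browser process.
        context = browser.new_context(viewport={"width": 1440, "height": 900})
        try:
            page = context.new_page()
            page.goto(url, wait_until="networkidle")
            target = out_dir / f"shot_{index:04d}.png"
            page.screenshot(path=str(target), full_page=True)
            written.append(target)
        finally:
            # Closing the context releases its pages even if the capture failed.
            context.close()
    return written
```

In a queue worker you would typically launch the browser when the worker process starts and pass it to each job. Playwright's sync API is not thread-safe, so one browser per worker process is the safer layout.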

Pattern 2: Selenium When You Already Have It

If your team already uses Selenium for end-to-end tests, piggybacking on that infrastructure makes sense. The screenshot API is straightforward:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

opts = Options()
opts.add_argument("--headless=new")
opts.add_argument("--window-size=1440,900")

driver = webdriver.Chrome(options=opts)
driver.get("https://example.com")
driver.save_screenshot("out.png")
driver.quit()

For full-page captures Selenium is clumsier than Playwright: with Chrome, save_screenshot captures only the visible viewport, so you typically need to resize the window to the document height or call the Chrome DevTools Protocol directly. Stick with Selenium if it's already in your stack; don't pick it for new screenshot work.
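
The DevTools Protocol route could be sketched as below, assuming a Chromium-based driver (execute_cdp_cmd is not available on Firefox or Safari drivers). The helper name full_page_png is made up for this example; Page.getLayoutMetrics and Page.captureScreenshot are the protocol commands it relies on.

```python
import base64


def full_page_png(driver):
    """Return a full-page PNG as bytes using Chrome DevTools Protocol commands.

    `driver` is a Selenium Chrome (or other Chromium-based) WebDriver that
    has already loaded the page.
    """
    metrics = driver.execute_cdp_cmd("Page.getLayoutMetrics", {})
    # Newer Chrome versions report CSS-pixel sizes under "cssContentSize";
    # fall back to "contentSize" for older ones.
    size = metrics.get("cssContentSize") or metrics["contentSize"]
    result = driver.execute_cdp_cmd(
        "Page.captureScreenshot",
        {
            "format": "png",
            "captureBeyondViewport": True,
            "clip": {
                "x": 0,
                "y": 0,
                "width": size["width"],
                "height": size["height"],
                "scale": 1,
            },
        },
    )
    # The protocol returns the image as a base64 string under "data".
    return base64.b64decode(result["data"])
```

After driver.get(...), writing the returned bytes to a file opened in "wb" mode gives the full document rather than just the viewport.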

Pattern 3: HTTP API for Everything Else

The browser-on-your-server approach falls apart fast when:

You're running on serverless (Lambda, Cloud Functions, Vercel)
You need to capture hundreds of URLs per minute
You don't want to maintain Chromium versions, fonts, and security patches
The screenshots are a small part of a larger app

This is where a screenshot API earns its keep. PxShot exposes a single HTTP endpoint that returns PNG, JPEG, WebP, or PDF, which means your Python code shrinks to a single requests.get call:

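import requests

params = {
    "url": "https://example.com",
    "format": "png",
    "full_page": "true",
    "viewport_width": 1440,
    "token": "YOUR_API_KEY",
}
# Always set a timeout: rendering a page can take several seconds.
r = requests.get("https://api.pxshot.dev/v1/screenshot", params=params, timeout=60)
r.raise_for_status()  # don't write an error response into out.png
with open("out.png", "wb") as f:
    f.write(r.content)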

No browser binaries to ship, no --no-sandbox flags, no zombie processes. If you're building OG image generation for a SaaS or doing link-preview thumbnails, this is almost always the right call.
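
One cheap safeguard with any screenshot API over HTTP is to confirm the body really is an image before saving it, since an HTML or JSON error body would otherwise end up in out.png. The sketch below checks the standard 8-byte PNG signature. The helper names are illustrative, and how PxShot itself reports errors is not covered here, so treat this as a generic defence rather than part of its API.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def is_png(data: bytes) -> bool:
    """Return True if data starts with the PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


def save_png(data: bytes, path: str) -> None:
    """Write data to path, refusing anything that is not a PNG."""
    if not is_png(data):
        # Include a short prefix so the real error (HTML, JSON, ...) shows up in logs.
        raise ValueError(f"response is not a PNG, starts with {data[:40]!r}")
    with open(path, "wb") as f:
        f.write(data)
```

With the requests example, save_png(r.content, "out.png") would replace the open/write pair.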

Pattern 4: Batch Capture with asyncio

When you have a list of 500 URLs to capture nightly, sequential requests are wasteful. Whether you're driving Playwright or an HTTP API, async parallelism cuts wall time dramatically.

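import asyncio
import httpx

URLS = ["https://example.com", "https://news.ycombinator.com"]  # ... plus the rest of your list

async def shoot(client, url):
    r = await client.get(
        "https://api.pxshot.dev/v1/screenshot",
        params={"url": url, "token": "YOUR_API_KEY"},
        timeout=60,
    )
    r.raise_for_status()  # fail loudly instead of saving an error body as a PNG
    # Flatten the URL into a file name, e.g. https_example.com.png
    name = url.replace("://", "_").replace("/", "_") + ".png"
    with open(name, "wb") as f:
        f.write(r.content)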

async def main():
    async with httpx.AsyncClient() as client:
        sem = asyncio.Semaphore(10)
        async def bound(u):
            async with sem:
                await shoot(client, u)
        await asyncio.gather(*(bound(u) for u in URLS))

asyncio.run(main())

The semaphore caps concurrency so you don't hit rate limits or exhaust file descriptors. Ten parallel requests is a safe starting point; tune from there.
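
For a nightly batch it usually helps if one failing URL does not abort the rest. The sketch below is one way to combine the semaphore cap with per-item error collection using only the standard library. The name run_bounded and the (results, failures) return shape are choices made for this example, and worker stands in for a coroutine function such as one that fetches and saves a single URL.

```python
import asyncio


async def run_bounded(items, worker, limit=10):
    """Run worker(item) for every item with at most `limit` running at once.

    `items` must be a list or tuple of hashable values. Returns
    (results, failures): results maps item -> return value, failures maps
    item -> the exception it raised.
    """
    sem = asyncio.Semaphore(limit)

    async def guarded(item):
        async with sem:
            return await worker(item)

    # return_exceptions=True keeps one bad item from cancelling the batch.
    outcomes = await asyncio.gather(
        *(guarded(item) for item in items), return_exceptions=True
    )
    results, failures = {}, {}
    for item, outcome in zip(items, outcomes):
        if isinstance(outcome, BaseException):
            failures[item] = outcome
        else:
            results[item] = outcome
    return results, failures
```

The failures mapping can then be logged or fed into a retry pass, instead of surfacing as a single exception from gather.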

Pattern 5: PDF Generation as a Side Effect

PDFs are often treated as a separate problem, handled with WeasyPrint, ReportLab, or wkhtmltopdf. But if the source is already a rendered HTML page, screenshot tooling does the job. Note that Playwright's page.pdf() is only supported in Chromium:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com/invoice/123", wait_until="networkidle")
    page.pdf(path="invoice.pdf", format="A4", print_background=True)
    browser.close()

Same with PxShot—change format=png to format=pdf and you get a paged document back. Useful for invoices, reports, and any case where the canonical version lives as a web page.
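
One detail that tends to surprise people with the Playwright route: page.pdf() renders with print CSS by default, so the PDF can look different from a screenshot of the same page. A minimal sketch of switching to screen styles first is shown below; the helper name save_pdf and the margin values are illustrative assumptions.

```python
def save_pdf(page, path, use_screen_styles=True):
    """Save the current page as an A4 PDF with Playwright (Chromium only).

    `page` is a Playwright sync-API Page that has already navigated.
    """
    if use_screen_styles:
        # Switch from print CSS to screen CSS so the PDF matches what a
        # screenshot of the same page would show.
        page.emulate_media(media="screen")
    page.pdf(
        path=path,
        format="A4",
        print_background=True,
        margin={"top": "12mm", "bottom": "12mm", "left": "10mm", "right": "10mm"},
    )
    return path
```

If the page ships a dedicated print stylesheet, for example one designed for invoices, passing use_screen_styles=False keeps it in effect.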

Picking the Right Approach

A rough decision tree:

One-off scripts, local development: Playwright. The setup is fine and you have full control.
Existing Selenium test suite: stay there.
Production SaaS, serverless, or anything user-facing: HTTP API. You don't want to be the person debugging why Chrome crashes on Lambda at 2am.
High-volume batch jobs: HTTP API with asyncio. Browser-per-request doesn't scale; managed infrastructure does.
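Visual regression testing: Playwright. Note that the toHaveScreenshot matcher ships with the Node.js test runner (@playwright/test), not the Python bindings; from Python, capture with page.screenshot() and compare against a stored baseline in your own test code.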
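
For visual regression checks driven from Python, one very small approach is a strict byte-for-byte comparison against a stored baseline, sketched below with only the standard library. The function name and the three return values are assumptions for this example. It is deliberately strict: any rendering difference, even a single anti-aliased pixel, counts as a mismatch, so a tolerance-based pixel diff (for instance with Pillow) may suit noisier pages better.

```python
from pathlib import Path


def check_against_baseline(png_bytes, baseline_path):
    """Compare a fresh screenshot with a stored baseline, byte for byte.

    Returns "created" when no baseline existed (it is written now), "match"
    when the bytes are identical, and "mismatch" otherwise.
    """
    baseline = Path(baseline_path)
    if not baseline.exists():
        # First run: store the current screenshot as the reference image.
        baseline.parent.mkdir(parents=True, exist_ok=True)
        baseline.write_bytes(png_bytes)
        return "created"
    return "match" if baseline.read_bytes() == png_bytes else "mismatch"
```

With Playwright, page.screenshot(full_page=True) returns the image bytes when no path is given, so the result can be passed straight in.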

Common Gotchas Worth Knowing

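Lazy-loaded images: scroll the page before capturing. wait_until="networkidle" only waits for requests that have already started, so images that load on scroll still need the scroll. For Playwright, scrolling to the bottom then back to top triggers most IntersectionObserver-based lazy loaders.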
Cookie banners and consent modals: inject CSS to hide them (page.add_style_tag(content="#cookie-banner{display:none}")) or click the dismiss button explicitly.
Web fonts: wait for document.fonts.ready before capturing, otherwise you'll get fallback fonts in the screenshot.
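Dark mode: pass color_scheme="dark" when creating the Playwright context or page (browser.new_context or browser.new_page), or the relevant query param in your API.
Retina/HiDPI: set device_scale_factor=2, also at context or page creation, if your screenshots will be displayed on high-DPI screens.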
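
Several of these fixes can be bundled into one pre-capture helper. The sketch below assumes Playwright's sync API; the helper name, the #cookie-banner selector, and the 500 ms settle delay are placeholders to adapt to the page you are capturing.

```python
def prepare_for_capture(page, hide_selectors=("#cookie-banner",), settle_ms=500):
    """Apply common pre-capture fixes to a Playwright sync-API Page."""
    if hide_selectors:
        # Hide consent banners with injected CSS instead of clicking them.
        css = ", ".join(hide_selectors) + " { display: none !important; }"
        page.add_style_tag(content=css)
    # Scroll to the bottom and back so IntersectionObserver-based loaders fire.
    page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
    page.wait_for_timeout(settle_ms)
    page.evaluate("window.scrollTo(0, 0)")
    # Block until web fonts have loaded, to avoid fallback fonts in the image.
    page.evaluate("async () => { await document.fonts.ready; }")


# Dark mode and HiDPI are chosen when the context (or page) is created.
CONTEXT_OPTIONS = {
    "viewport": {"width": 1440, "height": 900},
    "color_scheme": "dark",
    "device_scale_factor": 2,
}
```

A typical call order would be context = browser.new_context(**CONTEXT_OPTIONS), then page.goto(...), then prepare_for_capture(page), then page.screenshot(...).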

If you'd rather skip the browser-management work entirely, PxShot has a free tier at pxshot.dev—enough requests to wire up OG images or a link-preview feature without thinking about Chromium ever again.