Capturing Full-Page Screenshots in Code: What Actually Works
Learn how to take full page screenshots programmatically using Puppeteer, Playwright, and screenshot APIs — with code, gotchas, and real workarounds.
Full-page screenshots sound simple until you actually try to capture one. Lazy-loaded images don't render. Sticky headers repeat themselves down the page. Infinite scroll never terminates. Cookie banners cover half the viewport. The naive solution — point a headless browser at a URL and call screenshot({ fullPage: true }) — works maybe 60% of the time on real-world sites.
This post walks through the practical options for capturing full-page screenshots from your code, the bugs you'll hit, and how to work around them.
The three approaches that actually work
When you need to capture a full webpage programmatically, you have three realistic paths:
- Run a headless browser yourself using Puppeteer or Playwright
- Use a managed screenshot API like PxShot, where you make an HTTP request and get a PNG/JPEG/WebP/PDF back
- Stitch viewport captures together manually if you need extreme control (rare)
Most developers start with option 1, hit infrastructure pain, and migrate to option 2. Let's look at both in detail.
Option 1: Puppeteer or Playwright
Both libraries support full-page capture out of the box. Here's the minimum viable Puppeteer script:
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setViewport({ width: 1280, height: 800 });
  // networkidle2: navigation counts as finished once at most 2 connections stay open
  await page.goto('https://example.com', { waitUntil: 'networkidle2' });
  // fullPage captures the whole scrollable page, not just the 1280x800 viewport
  await page.screenshot({ path: 'out.png', fullPage: true });
  await browser.close();
})();
That works for static pages. For modern sites, you need more.
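Only Puppeteer is shown above, so here is a rough Playwright counterpart for comparison. It is a sketch, not a drop-in script: `captureFullPage` is a hypothetical helper name, and the browser type (Playwright's `chromium`, for example) is passed in as an argument. Note that Playwright's wait state is called `networkidle`; it has no `networkidle2`.

```javascript
// Hypothetical helper: captures a full-page PNG with Playwright.
// `browserType` is expected to be Playwright's `chromium`, `firefox`, or `webkit`.
async function captureFullPage(browserType, url, path) {
  const browser = await browserType.launch();
  try {
    // Same viewport as the Puppeteer example above
    const page = await browser.newPage({ viewport: { width: 1280, height: 800 } });
    await page.goto(url, { waitUntil: 'networkidle' });
    await page.screenshot({ path, fullPage: true });
  } finally {
    // Close the browser even if navigation or capture throws
    await browser.close();
  }
}
```

Assuming the `playwright` package is installed, a call would look like `captureFullPage(require('playwright').chromium, 'https://example.com', 'out.png')`.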
Handling lazy-loaded images
Most production sites use loading="lazy" or IntersectionObserver-based loading. A full-page screenshot won't trigger scroll, so images below the fold stay blank. The fix is to scroll the page first:
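await page.evaluate(async () => {
  await new Promise((resolve) => {
    let total = 0;
    const distance = 200;    // pixels scrolled per step
    const maxScroll = 30000; // arbitrary cap so infinite-scroll pages still terminate
    const timer = setInterval(() => {
      window.scrollBy(0, distance);
      total += distance;
      // scrollHeight is re-read on every tick because lazy content can grow the page
      if (total >= document.body.scrollHeight || total >= maxScroll) {
        clearInterval(timer);
        window.scrollTo(0, 0); // return to the top before capturing
        resolve();
      }
    }, 100);
  });
});
Then wait briefly for the last batch of images to decode before taking the shot.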
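One way to make that final wait explicit, rather than sleeping for a fixed time, is to wait on the images themselves. The sketch below is meant to run inside the page; `waitForImages` and its `timeoutMs` argument are hypothetical names, and the timeout exists so a single stalled image cannot block the capture.

```javascript
// Runs inside the page (pass it to page.evaluate). Resolves once every <img>
// that was still loading has decoded, or after `timeoutMs`, whichever is first.
async function waitForImages(timeoutMs) {
  const pending = Array.from(document.images)
    .filter((img) => !img.complete)
    // decode() rejects for broken images; treat those as done instead of failing
    .map((img) => img.decode().catch(() => {}));
  const deadline = new Promise((resolve) => setTimeout(resolve, timeoutMs));
  await Promise.race([Promise.all(pending), deadline]);
  return pending.length; // how many images were still loading when called
}
```

With Puppeteer this could be invoked as `await page.evaluate(waitForImages, 5000)` right after the scroll loop; the 5000 ms budget is an arbitrary starting point.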
Dismissing cookie banners and modals
You have two options: inject CSS to hide them, or click them away. Hiding is more reliable:
await page.addStyleTag({
  content: `
    [id*="cookie"], [class*="cookie"],
    [id*="consent"], [class*="consent"],
    [aria-label*="cookie"] { display: none !important; }
  `
});
This is brittle, since every site uses different selectors, but it covers a surprising amount of ground.
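Attribute selectors like these match case-sensitively, so an element with `id="CookieBanner"` slips through. A small variation is to generate the rule from a keyword list and add the CSS `i` flag for case-insensitive matching. `buildBannerHideCss` is a hypothetical helper, and it assumes plain keywords without quotes or other special characters.

```javascript
// Builds a "hide these" stylesheet from a list of keywords.
// The trailing ` i` makes each attribute match case-insensitive,
// so "CookieBanner" and "cookie-banner" are both caught.
function buildBannerHideCss(keywords) {
  const selectors = keywords.flatMap((word) => [
    `[id*="${word}" i]`,
    `[class*="${word}" i]`,
    `[aria-label*="${word}" i]`,
  ]);
  return `${selectors.join(',\n')} { display: none !important; }`;
}
```

The result plugs into the same call as before: `await page.addStyleTag({ content: buildBannerHideCss(['cookie', 'consent']) })`. Broader matching also means more false positives, so it is worth checking the output on your target sites.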
Handling sticky headers
Sticky headers are the most annoying full-page screenshot bug. When the page is captured by scrolling internally, a position: fixed or position: sticky header gets painted at every scroll position, leaving a striped artifact down your image. Force them static:
await page.addStyleTag({
  content: `*, *::before, *::after {
    position: static !important;
  }`
});
Use this carefully: it also flattens every absolutely and relatively positioned element, so it can break layouts. A more surgical version targets known header selectors only.
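A sketch of that more surgical version follows. It runs inside the page and only touches elements that match the given selectors and are actually pinned. `unpinElements` is a hypothetical name, and mapping fixed to absolute and sticky to static is one reasonable choice rather than the only one.

```javascript
// Runs inside the page (pass it to page.evaluate). For each selector, finds
// matching elements whose computed position is fixed or sticky and repositions
// them so they are painted once instead of following the scroll.
function unpinElements(selectors) {
  let changed = 0;
  for (const selector of selectors) {
    for (const el of document.querySelectorAll(selector)) {
      const position = getComputedStyle(el).position;
      if (position === 'fixed') {
        // absolute keeps the element at its original offset in the page
        el.style.setProperty('position', 'absolute', 'important');
        changed += 1;
      } else if (position === 'sticky') {
        el.style.setProperty('position', 'static', 'important');
        changed += 1;
      }
    }
  }
  return changed; // number of elements that were modified
}
```

A call might look like `await page.evaluate(unpinElements, ['header', 'nav'])`, where the selector list is illustrative and should come from inspecting your target sites.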
The infrastructure cost
Once your script works locally, you need to deploy it. Chromium binaries are ~300MB, which exceeds the 250 MB unzipped limit for a zip-based AWS Lambda deployment package. You'll need one of:
- A container-based deployment (Lambda containers, ECS, Fly.io, Railway)
- A pre-built Chromium layer like @sparticuz/chromium
- A dedicated server with enough RAM (each Chromium instance eats 200-500MB)
Then there's concurrency. Chromium leaks memory under load. You need to restart browser instances periodically, queue requests, and monitor failures. This is where many teams give up on self-hosting.
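To make the restart-and-queue idea concrete, here is a minimal sketch of a serial queue that replaces its browser after a fixed number of jobs. `createScreenshotQueue`, `launchBrowser`, and the default of 50 jobs are all illustrative assumptions; with Puppeteer, `launchBrowser` would typically be `() => puppeteer.launch()`. It runs one job at a time, so a real service would likely want several of these or a proper pool.

```javascript
// Runs jobs one at a time on a shared browser and recycles the browser
// after `maxJobsPerBrowser` jobs to keep memory growth bounded.
function createScreenshotQueue(launchBrowser, maxJobsPerBrowser = 50) {
  let browser = null;
  let jobsOnBrowser = 0;
  let tail = Promise.resolve(); // end of the queue; each job chains onto it

  async function run(job) {
    if (browser === null) {
      browser = await launchBrowser();
      jobsOnBrowser = 0;
    }
    try {
      return await job(browser);
    } finally {
      jobsOnBrowser += 1;
      if (jobsOnBrowser >= maxJobsPerBrowser) {
        const old = browser;
        browser = null; // the next job launches a fresh instance
        await old.close().catch(() => {});
      }
    }
  }

  return function enqueue(job) {
    const result = tail.then(() => run(job));
    tail = result.catch(() => {}); // a failed job must not block later ones
    return result;
  };
}
```

Each job receives the browser and returns whatever it produces, for example `enqueue((browser) => capture(browser, url))`, where `capture` stands in for your own screenshot function.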
Option 2: Using a screenshot API
If you don't want to run browser infrastructure, an API call is significantly simpler. With PxShot, a full-page capture is one HTTP request:
GET https://api.pxshot.dev/v1/screenshot
  ?url=https://example.com
  &full_page=true
  &format=png
  &width=1280
The response is the binary image. No browser to manage, no Chromium updates to track, no memory leaks. The same endpoint can return PDF if you need print-quality output instead.
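In code, the target URL should be percent-encoded so that its own query string does not collide with the API's parameters. The sketch below builds the request URL using only the endpoint and parameters shown in the example above; `buildScreenshotUrl` is a hypothetical helper name, and the defaults simply mirror that example.

```javascript
// Builds the request URL with proper encoding of the target address.
// Only the parameters from the example request are used.
function buildScreenshotUrl({ url, fullPage = true, format = 'png', width = 1280 }) {
  const endpoint = new URL('https://api.pxshot.dev/v1/screenshot');
  endpoint.searchParams.set('url', url); // searchParams handles the encoding
  endpoint.searchParams.set('full_page', String(fullPage));
  endpoint.searchParams.set('format', format);
  endpoint.searchParams.set('width', String(width));
  return endpoint.toString();
}
```

The result can be passed to `fetch` (built into Node 18+) and the body saved with `Buffer.from(await res.arrayBuffer())`. Any authentication the service requires is not covered here and is left out of the sketch.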
When an API makes more sense than self-hosting
- You're generating OG images on demand and need sub-second response times
- You're capturing screenshots from user-submitted URLs (security and isolation matter)
- You need PDF and image output from the same source
- Your traffic is bursty — you don't want to pay for idle browsers
- You're building a SaaS feature and don't want screenshot reliability on your on-call rotation
When to keep it in-house
- You need to authenticate against internal systems before capturing
- You're capturing thousands of pages per minute continuously (the math favors owning hardware at that scale)
- You need custom browser extensions or unusual viewport configurations
Common bugs across both approaches
Whether you use Puppeteer or a hosted API, these issues come up constantly:
Web fonts not loading in time
The screenshot fires before @font-face resolves, leaving fallback fonts in the image. Wait explicitly:
await page.evaluate(() => document.fonts.ready);
PxShot handles this server-side via a wait_for_fonts parameter, which is one less thing to wire up.
Animations mid-frame
If a hero section has a fade-in animation, your screenshot might catch it at 40% opacity. Either disable animations with CSS:
* { animation: none !important; transition: none !important; }
Or add a fixed delay after networkidle.
Pages that never reach networkidle
Analytics scripts, chat widgets, and polling APIs keep network activity going forever. In Puppeteer, don't rely on networkidle0; use networkidle2 (which tolerates up to 2 open connections) or a hard timeout combined with domcontentloaded.
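The hard-timeout variant can be sketched as follows for Puppeteer: navigate until `domcontentloaded`, then give the network a bounded amount of time to settle and carry on either way. `gotoWithHardCap` and both default timeouts are assumptions for illustration; `page.waitForNetworkIdle` is Puppeteer's own method.

```javascript
// Navigates, then waits for the network to go quiet for at most `idleCapMs`.
// Returns 'idle' if the network settled, 'capped' if the cap was reached first.
async function gotoWithHardCap(page, url, { navTimeoutMs = 30000, idleCapMs = 5000 } = {}) {
  await page.goto(url, { waitUntil: 'domcontentloaded', timeout: navTimeoutMs });
  try {
    await page.waitForNetworkIdle({ idleTime: 500, timeout: idleCapMs });
    return 'idle';
  } catch (err) {
    // A timeout here is expected on chatty pages; anything else is a real error
    if (err && err.name === 'TimeoutError') return 'capped';
    throw err;
  }
}
```

Logging how often the result is `'capped'` gives a cheap signal for which sites need a longer cap or extra handling.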
Captures over 16384px tall
Chrome has a maximum canvas size. Very long pages (some marketing sites are 30,000px+) will either truncate or fail silently. If you hit this, capture in segments and stitch, or use PDF output which doesn't have the same limit.
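The segment approach starts with working out which slices to capture. Here is a small sketch of that planning step; `planSegments` is a hypothetical helper, and the 16384 default comes from the limit above. If you render with a device scale factor above 1, the cap probably needs to be divided by that factor, since the limit applies to rendered pixels.

```javascript
// Splits a page of `totalHeight` CSS pixels into clip rectangles no taller
// than `maxHeight`. Each rectangle can be passed as `clip` to page.screenshot.
function planSegments(totalHeight, width, maxHeight = 16384) {
  const segments = [];
  for (let y = 0; y < totalHeight; y += maxHeight) {
    // The last segment is shortened to end exactly at the bottom of the page
    segments.push({ x: 0, y, width, height: Math.min(maxHeight, totalHeight - y) });
  }
  return segments;
}
```

With Puppeteer, each entry can be captured via `page.screenshot({ path, clip })`. Stitching the resulting files into one image needs an image library and is not shown here.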
A workflow that holds up in production
For most teams shipping this feature, the pragmatic flow looks like:
- Prototype with Puppeteer locally to understand what the page needs
- Identify the 3-4 quirks specific to your target sites (cookie banners, lazy images, animations)
- Either build a small worker service with retry logic and queue management, or call a screenshot API and move on
- Cache results aggressively — most screenshots don't need to be fresh every request
- Set up monitoring on failure rates, because pages change and screenshots silently break
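For the caching step, the main design question is what counts as "the same screenshot". One simple answer is to hash the URL together with the capture options, as in this sketch. `screenshotCacheKey` is a hypothetical helper, and it assumes the options are a flat object of JSON-serializable values.

```javascript
const { createHash } = require('node:crypto');

// Derives a stable cache key from the URL plus capture options.
// Option names are sorted first so { a, b } and { b, a } hash identically.
function screenshotCacheKey(url, options = {}) {
  const canonical = Object.keys(options)
    .sort()
    .map((key) => `${key}=${JSON.stringify(options[key])}`)
    .join('&');
  return createHash('sha256').update(`${url}\n${canonical}`).digest('hex');
}
```

The key works as a filename or object-store path. Freshness can then be handled separately, for example with a time-to-live on the stored entry.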
If you want to skip the infrastructure phase entirely, PxShot has a free tier that covers small projects and prototyping — useful for testing whether a hosted approach fits your use case before committing.