Running a single headless browser is easy. Running ten is manageable. Running ten thousand? That is an infrastructure nightmare. Memory leaks, zombie processes, and CPU spikes can bring even the most robust clusters to their knees. Here is how we scaled our browser infrastructure to handle millions of pages daily.
The Hidden Cost of Headless Browsers
Chromium is a beast. It is designed to render rich, interactive web experiences for humans, not for automated data extraction. When you run it in "headless" mode, it still carries much of that weight.
A single tab can easily consume 200MB to 500MB of RAM. Multiply that by 1,000 concurrent workers, and you need 200GB to 500GB of RAM; at 10,000 workers, that becomes terabytes. But the problems aren't just a matter of linear resource usage.
1. Aggressive Resource Blocking
The easiest win in performance is simply not doing work. Most websites load megabytes of images, fonts, stylesheets, and tracking scripts that offer zero value for data extraction.
Block Requests by Type:
Intercept network requests at the browser level and abort them if they match resource types like image, font, media, or stylesheet. This saves bandwidth and CPU time used for decoding/rendering.
await page.setRequestInterception(true);
page.on('request', (req) => {
  // Abort resource types that add nothing to data extraction.
  const resourceType = req.resourceType();
  if (['image', 'stylesheet', 'font', 'media'].includes(resourceType)) {
    req.abort();
  } else {
    req.continue();
  }
});
Block Third-Party Trackers:
Load the Disconnect list or a similar ad-blocking list into memory and abort requests to known tracker domains (Google Analytics, Facebook Pixel, etc.). These scripts clog the page's main thread and burn CPU for no benefit.
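The domain check itself can be sketched as below. This is a minimal illustration, not a full ad-blocker: BLOCKED_DOMAINS and isBlockedUrl are hypothetical names, and the three hard-coded domains stand in for a maintained list loaded at startup.

```javascript
// Hypothetical blocklist; in practice, load a maintained list at startup.
const BLOCKED_DOMAINS = new Set([
  'google-analytics.com',
  'googletagmanager.com',
  'connect.facebook.net',
]);

// Returns true if the URL's hostname is a blocked domain or a subdomain of one.
function isBlockedUrl(url, blocked = BLOCKED_DOMAINS) {
  let hostname;
  try {
    hostname = new URL(url).hostname;
  } catch {
    return false; // not a parseable URL; leave the decision to the caller
  }
  const labels = hostname.split('.');
  // Check "a.b.c", then "b.c", then "c" against the set.
  for (let i = 0; i < labels.length; i++) {
    if (blocked.has(labels.slice(i).join('.'))) return true;
  }
  return false;
}
```

Calling isBlockedUrl(req.url()) at the top of the existing request handler, and aborting when it returns true, keeps the decision in one place so that each request is aborted or continued exactly once.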
2. Managing the Browser Lifecycle
The "Zombie" Problem:
Sometimes a browser crashes or hangs, but the parent process doesn't clean it up properly. You end up with "zombie" Chrome processes eating CPU until the server hits 100% usage and dies.
Solution: The Reaper:
You need an external supervisor process (a "Reaper") that monitors the PIDs. If a process has been detached from its parent for too long or exceeds a hard time limit (e.g., 5 minutes for a single page visit), kill -9 it mercilessly.
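One way to structure such a Reaper is to separate the decision from the kill, as in the sketch below. Everything here is illustrative: selectVictims and reap are hypothetical names, the process snapshot is assumed to be collected elsewhere (for example by parsing ps output), and "orphaned" is approximated as a parent PID of 1, which holds on a typical Linux host but not when your worker itself runs as PID 1 inside a container. The sketch also kills orphans immediately rather than after a grace period.

```javascript
const MAX_AGE_MS = 5 * 60 * 1000; // hard limit from the text: 5 minutes

// Pure decision step: which PIDs should be killed?
// `procs` is a snapshot like [{ pid, ppid, startedAtMs, command }];
// how it is collected is left to the caller.
function selectVictims(procs, nowMs, maxAgeMs = MAX_AGE_MS) {
  return procs
    .filter((p) => /chrome|chromium/i.test(p.command))
    .filter((p) => {
      const tooOld = nowMs - p.startedAtMs > maxAgeMs;
      const orphaned = p.ppid === 1; // assumption: reparented to init
      return tooOld || orphaned;
    })
    .map((p) => p.pid);
}

// Side-effecting step: SIGKILL each victim, ignoring already-dead PIDs.
function reap(pids, kill = process.kill.bind(process)) {
  for (const pid of pids) {
    try {
      kill(pid, 'SIGKILL');
    } catch (err) {
      if (err.code !== 'ESRCH') throw err; // ESRCH: process already gone
    }
  }
}
```

Keeping selectVictims free of side effects makes the kill policy easy to unit-test without spawning real browsers.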
Browser Contexts vs. New Browsers:
Launching a fresh Chrome instance (browser = launch()) is expensive: it takes seconds. Creating a new incognito context (context = browser.createIncognitoBrowserContext(), renamed createBrowserContext() in Puppeteer v22 and later) is cheap: it takes milliseconds.
- Strategy: Launch one browser instance per CPU core.
- Tactics: Rotate contexts for each scrape job to ensure cookies/cache isolation.
- Recycle: Kill and restart the main browser instance every 50-100 jobs to clear out deep-seated memory leaks in Chromium itself.
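The rotate-and-recycle pattern above might look something like the following sketch. RecyclingBrowser is a hypothetical class, the launch function is injected (for example () => puppeteer.launch()), and the code assumes jobs run one at a time per browser instance; with concurrent jobs you would need to drain in-flight work before closing the browser.

```javascript
// Runs jobs on one long-lived browser, one fresh context per job, and
// relaunches the browser after `maxJobs` jobs.
class RecyclingBrowser {
  constructor(launch, maxJobs = 50) {
    this.launch = launch; // e.g. () => puppeteer.launch()
    this.maxJobs = maxJobs;
    this.browser = null;
    this.jobsRun = 0;
  }

  async run(job) {
    if (!this.browser) this.browser = await this.launch();
    const browser = this.browser;
    // The method name depends on the Puppeteer version (v22+ renamed it).
    const context = browser.createBrowserContext
      ? await browser.createBrowserContext()
      : await browser.createIncognitoBrowserContext();
    try {
      return await job(context);
    } finally {
      await context.close(); // drops this job's cookies and cache
      this.jobsRun += 1;
      if (this.jobsRun >= this.maxJobs) {
        await browser.close();
        this.browser = null; // next run() launches a fresh instance
        this.jobsRun = 0;
      }
    }
  }
}
```

A job is any async function that receives the context, for example one that opens a page with context.newPage(), scrapes it, and returns the extracted data.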
3. Docker & Orchestration
Do not run browsers on bare metal. Containerize them.
Memory Limits:
Set strict Docker memory limits (--memory=1g). If a browser leaks, the OOM (Out Of Memory) killer sacrifices the container, not your entire server.
Shared Memory (/dev/shm):
By default, Docker gives a tiny /dev/shm (64MB). Chrome uses shared memory extensively for communication between its internal processes. If this fills up, Chrome crashes strangely.
- Fix: Always run containers with --shm-size=2g or mount /dev/shm:/dev/shm.
Orchestration:
Kubernetes (K8s) is ideal here. We use a custom Horizontal Pod Autoscaler based on the size of our job queue (Redis/BullMQ).
- Queue larger than 1000 items? Spin up 50 more pods.
- Queue empty? Scale down to save money.
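The two rules above reduce to a small decision function. The sketch below is only an illustration of that logic, not our actual autoscaler: desiredPods is a hypothetical name, and the minPods and maxPods bounds are assumptions added so the function never scales to zero or grows without limit.

```javascript
// Hypothetical scaling rule mirroring the two bullets above.
function desiredPods(currentPods, queueLength, opts = {}) {
  const { minPods = 1, maxPods = 500, step = 50, threshold = 1000 } = opts;
  if (queueLength === 0) return minPods; // queue empty: scale down
  if (queueLength > threshold) {
    return Math.min(currentPods + step, maxPods); // backlog: add 50 pods
  }
  return currentPods; // otherwise hold steady
}
```

In a real cluster the queue length would come from the queue itself and the result would be fed to whatever sets the replica count; both ends are left out here.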
4. Code-Level Optimizations
Promise.all is your friend:
Don't wait for things sequentially if you don't have to. If you are scraping a list of 10 items on a page, extract data in parallel.
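As a minimal sketch of the idea: start every extraction first, then await them together. extractAll is a hypothetical helper; items stands in for element handles and extract for any per-item async read.

```javascript
// `items` stands in for element handles; `extract` for a per-item async
// read such as (el) => el.evaluate((node) => node.textContent).
async function extractAll(items, extract) {
  // Start every extraction first, then wait for all of them together.
  // Results come back in the same order as `items`.
  return Promise.all(items.map((item) => extract(item)));
}
```

With Puppeteer, this could be used as extractAll(await page.$$('.item'), (el) => el.evaluate((node) => node.textContent)), where '.item' is a placeholder selector. Note that Promise.all rejects as soon as any single extraction fails, so wrap extract in a try/catch if partial results are acceptable.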
Disable requestAnimationFrame:
In a headless environment, you don't care about smooth 60fps animations. Monkey-patch window.requestAnimationFrame to run immediately or less frequently to save CPU cycles on animation-heavy sites.
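await page.evaluateOnNewDocument(() => {
  // Throttle animation callbacks to roughly 10 per second.
  window.requestAnimationFrame = (callback) => {
    // Pass a timestamp, as the native API does; many callbacks rely on it.
    return setTimeout(() => callback(performance.now()), 100);
  };
  // Keep cancellation working with the timer IDs returned above.
  window.cancelAnimationFrame = (id) => clearTimeout(id);
});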
5. Using Serverless? Think Twice.
AWS Lambda or Google Cloud Functions seem perfect for "bursty" scraping. However, the cold-start time of launching Chromium in a Lambda function is painful (3-5 seconds). Plus, functions have strict execution time limits.
For long-running crawls or high-volume scraping, a dedicated cluster of "warm" workers is usually 10x cheaper and 5x faster than serverless functions.
Conclusion
Scaling headless browsers is an exercise in resource constraint. You are fighting against a piece of software (Chrome) that wants to eat every byte of RAM it can find. Your job is to put it in a straitjacket — stripping away the UI, blocking the bloat, and ruthlessly killing it when it misbehaves.