CONCEPT Cited by 1 source

Crawl budget impact of JS complexity

Definition

Crawl budget is Google's shorthand for the aggregate per-site capacity allocation: how many pages Googlebot will fetch and render from a given site in a given time window. Crawl budget doesn't scale linearly with site size; a 1,000,000-page site does not get 1,000 times the budget of a 1,000-page site.

JS complexity impact is the observation that JavaScript-heavy pages cost materially more per render than static HTML pages (full Chromium session, all sub-resources fetched, JS execution, async-work settlement, rendered-DOM emission). Per Google's own docs: "for large sites (10,000+ unique and frequently changing pages), this can impact the site's crawl budget."

The product of the two: on 10,000+ page sites, heavy client-side JS reduces the fraction of the site's URLs that get crawled / rendered / indexed in a given time window. The per-page rendering success rate stays at 100%, but fewer pages fit in the budget.

(Source: sources/2024-08-01-vercel-how-google-handles-javascript-throughout-the-indexing-process.)

The structural equation

  pages_crawled_and_rendered_per_day
    = crawl_budget_seconds_per_day / per_page_render_cost_seconds

  • Crawl budget is set by Google per site based on site health, popularity, update frequency, server latency, and other signals.
  • Per-page render cost scales with:
      • JS bundle size
      • Number of sub-resources
      • Async-work wait (API calls, dynamic imports, streaming)
      • DOM complexity after JS execution

Tuning either lever changes the throughput equation.
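The throughput equation can be sketched directly; the budget and cost numbers below are illustrative assumptions, not measurements from the source:

```typescript
// pages crawled per day = budget seconds per day / per-page render cost.
function pagesPerDay(budgetSecondsPerDay: number, renderCostSeconds: number): number {
  return Math.floor(budgetSecondsPerDay / renderCostSeconds);
}

// Same hypothetical daily budget, two per-page costs:
const budget = 3600; // assume Google spends ~1 hour/day of crawl capacity on the site
console.log(pagesPerDay(budget, 0.5)); // lightweight static HTML → 7200 pages/day
console.log(pagesPerDay(budget, 5));   // JS-heavy client render  → 720 pages/day
```

A 10× increase in per-page render cost translates directly into a 10× drop in daily coverage, which is invisible on a small site but decisive on a 100,000-page one.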

What the empirical study shows

Vercel + MERJ measured on nextjs.org (a modestly sized Next.js App Router application):

  • JS complexity does not correlate with rendering success rate. 100% of pages rendered, across both minimal-JS and heavily dynamic CSR pages.
  • JS complexity does not correlate with rendering delay at nextjs.org scale. p50 / p75 / p90 look similar across per-page JS-complexity buckets.
  • The crawl-budget impact doesn't show up on a site this size. nextjs.org is below the 10,000+ frequently-changing-pages threshold; the impact is a Google-disclosed rule, not a symptom observed on nextjs.org.

The study quotes Google's large-site-managing-crawl-budget guide to acknowledge that the budget impact is real at scale, even if not visible on the study's site.

Who this affects

  • E-commerce catalogues with 100,000+ product pages, each heavily client-rendered (personalised pricing, availability, recommendations).
  • Classified / listing sites with millions of frequently changing pages.
  • News / publishing platforms with decade-deep archives plus heavy article-page JS (comments widgets, interactive charts, video players).
  • Location / geo sites where each city / zip / region is its own URL with dynamic content.

What to do about it

Per the source post and Google's own guidance:

  1. Use SSG / ISR / SSR for SEO-critical content. The initial HTML already carries the content; Google still renders the page, but with far less work per page.
  2. Code-split aggressively. Only load what a given route needs. Google still runs the JS, but smaller bundles = faster render = more pages in the budget.
  3. Avoid JS-only navigation for in-site links. Use real <a href> anchors; the link-discovery regex works on them directly, and the per-page-render cost drops.
  4. Keep sitemap fresh with <lastmod>. Signals which pages are worth spending budget on; short-circuits link-graph traversal.
  5. Minimise blocked resources in robots.txt. A blocked JS file Google can't fetch may force a render retry (more budget spent on one page) or produce a broken rendered DOM.
  6. Consider traffic-aware pre-rendering where build-time pre-rendering of every page is impractical: pre-render the hot URLs, ISR the cold ones.
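Point 4 can be sketched as a minimal sitemap generator that emits <lastmod> per URL; the URLs, dates, and function names below are illustrative assumptions, not from the source:

```typescript
// Build a sitemaps.org-conformant XML document where each <url> entry
// carries a <lastmod> date, so Google can prioritise recently changed pages.
type SitemapEntry = { loc: string; lastmod: string }; // lastmod as ISO date

function sitemapXml(entries: SitemapEntry[]): string {
  const urls = entries
    .map(e => `  <url><loc>${e.loc}</loc><lastmod>${e.lastmod}</lastmod></url>`)
    .join("\n");
  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${urls}\n</urlset>`
  );
}

console.log(sitemapXml([
  { loc: "https://example.com/products/widget-a", lastmod: "2024-08-01" },
  { loc: "https://example.com/products/widget-b", lastmod: "2024-07-15" },
]));
```

Keeping <lastmod> honest matters: dates that update on every deploy, regardless of content changes, stop being a useful prioritisation signal.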

Why it's a crawl-budget phenomenon and not a rendering-delay phenomenon

A page's individual rendering delay is p50 = 10 s regardless of JS complexity (at nextjs.org scale). But a site's aggregate rendering throughput depends on per-page cost, and Google allocates capacity per site, not per page. So a JS-heavy site doesn't see individual pages take longer; it sees fewer pages get crawled per day. The symptom:

  • Some pages never get rendered within the time window between crawl cycles.
  • Some pages show stale indexed content.
  • New pages take longer to appear in search.

This is a different failure mode than "rendering queue delay" (an individual-page latency), and shows up only on sites big enough that the budget divides across many pages.
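The aggregate failure mode can be made concrete by converting per-page cost into the length of a full recrawl cycle; the site size and budget below are illustrative assumptions, not measurements from the source:

```typescript
// With a fixed per-site budget, raising per-page render cost stretches the
// time needed to revisit every URL, so some pages go stale between cycles.
function daysToFullRecrawl(
  totalPages: number,
  budgetSecondsPerDay: number,
  renderCostSeconds: number
): number {
  const pagesPerDay = budgetSecondsPerDay / renderCostSeconds;
  return Math.ceil(totalPages / pagesPerDay);
}

const catalogue = 100_000; // a hypothetical large e-commerce catalogue
console.log(daysToFullRecrawl(catalogue, 3600, 0.5)); // static-ish pages → 14 days
console.log(daysToFullRecrawl(catalogue, 3600, 5));   // JS-heavy pages  → 139 days
```

At 139 days per cycle, a product page edited today may show stale indexed content for months, and new URLs at the bottom of the queue take correspondingly longer to appear in search.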
