CONCEPT Cited by 1 source
Google asset-caching internal heuristics¶
Definition¶
Google's Web Rendering Service does not obey HTTP
Cache-Control headers when deciding whether to re-fetch a
page's sub-resources (CSS, JavaScript, images, API responses,
fonts). Instead, WRS applies its own internal freshness
heuristics, separate from the page's declared caching policy.
Canonical verbatim: "Google speeds up webpage rendering by
caching assets, which is useful for pages sharing resources and
for repeated renderings of the same page. Instead of using HTTP
Cache-Control headers, Google's Web Rendering Service employs
its own internal heuristics to determine when cached assets are
still fresh and when they need to be downloaded again."
(Source: sources/2024-08-01-vercel-how-google-handles-javascript-throughout-the-indexing-process.)
Why WRS decides caching itself¶
WRS renders pages at planet-scale for indexing. Two capacity pressures:
- Shared resources across pages. Many pages on a site
share CSS / JS bundles, fonts, images. Honouring per-response
Cache-Controlwould force WRS to re-evaluate freshness per asset per render — expensive at WRS scale. - Repeated renders of the same page. WRS may re-render the same URL periodically as freshness signals change. Asset caching across those renders is a significant cost save.
Running the cache-freshness decision on internal heuristics
lets WRS amortise asset downloads across renders without
trusting any individual page's Cache-Control declarations
(which may be incorrect, over-aggressive, under-aggressive, or
set for very different cacheability targets than Google's
needs).
What this means in practice¶
Cache-Control: no-storeon a JS file does not guarantee a fresh fetch on every render. WRS may still use its cached copy.Cache-Control: max-age=31536000on a CSS bundle does not mean Google treats it as immutable for a year. WRS's heuristic may re-fetch sooner than the declared max-age.- Content-addressed asset URLs (fingerprinted filenames,
app.abc123.js) still work correctly. The URL change is a strong signal that Google's heuristic is very likely to honour — a new URL is a new asset. - Cache-busting query parameters (
?v=123) may not reliably bust the cache. Per the source post, the heuristic is opaque; behaviour is not guaranteed.
Interaction with sub-resource changes¶
A page that depends on a recently-changed JS file may render against an old cached version of that file from Google's perspective — producing stale rendered output relative to what users see. The canonical mitigation:
- Fingerprint / content-hash asset URLs so a change produces
a new URL (e.g.
app.abc123.js→app.def456.js). Google's heuristic is far less likely to have a stale copy of a fresh URL than to have a stale copy of an unchanged URL. - Prefer server-rendered HTML for SEO-critical content so stale-JS-execution is not a load-bearing path for the indexable DOM.
What WRS does not do¶
- Honour
Cache-Controlheaders for asset-cache freshness decisions at render time. - Expose its heuristic to site operators — the precise rules are not disclosed; only that they exist and are independent of the page's declared policy.
- Differentiate based on
Cache-Controldirectives likeno-cachevsprivatevsmust-revalidate. From the builder's standpoint, it's opaque.
Scope — not a general HTTP-caching critique¶
This is specifically about the rendering-stage asset cache
inside WRS. It is not a statement about CDN caches, browser
caches, service-worker caches, or any other HTTP-caching layer
in the stack — those all obey Cache-Control conventionally.
The anomaly is confined to Googlebot's render pipeline.
Seen in¶
- sources/2024-08-01-vercel-how-google-handles-javascript-throughout-the-indexing-process — canonical wiki instance. Named as the fifth WRS property in the Vercel + MERJ study's post-2018 pipeline description. Anchored by the link to Google's own docs confirming the internal-heuristics design.
Related¶
- systems/google-web-rendering-service — where the heuristics live.
- systems/googlebot — the upstream pipeline.
- concepts/rendering-queue — the adjacent capacity- management mechanism.
- concepts/universal-rendering — the upstream property that makes shared asset caching a meaningful capacity save.