PATTERN
Spatial prefetch on access¶
Definition¶
When a data item is accessed, speculatively load neighbouring items into the cache as well — on the assumption that spatial locality holds for the workload, so the next accesses will be adjacent to this one. A single miss pays for the initial fetch + extra neighbours; the subsequent nearby accesses all hit.
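A minimal sketch of the idea, assuming a small LRU cache in front of a slow backing tier; the names `PrefetchingCache`, `backing_fetch`, and `width` are illustrative, not from the source:

```python
from collections import OrderedDict

class PrefetchingCache:
    """LRU cache that, on a miss, also loads the next `width` neighbours.

    Illustrative sketch: keys are assumed to be integers whose numeric
    neighbours are the "spatially adjacent" items.
    """
    def __init__(self, backing_fetch, capacity=8, width=2):
        self.fetch = backing_fetch   # key -> value; the slow tier
        self.capacity = capacity
        self.width = width           # neighbours prefetched per miss
        self.store = OrderedDict()

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)  # hit: refresh LRU position
            return self.store[key]
        # Miss: pay once for the item plus its following neighbours.
        for k in range(key, key + self.width + 1):
            if k not in self.store:
                self._put(k, self.fetch(k))
        return self.store[key]

    def _put(self, key, value):
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least-recently-used

# One miss on key 5 pulls in 6 and 7; the next sequential reads hit.
calls = []
cache = PrefetchingCache(lambda k: calls.append(k) or k * 10, width=2)
cache.get(5)
cache.get(6)
```

After `get(5)`, the backing tier has been called for 5, 6, and 7 only; `get(6)` is served from cache, which is the "subsequent nearby accesses all hit" payoff.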
Canonical framing¶
Ben Dicken's 2025-07-08 photo-album example (Source: sources/2025-07-08-planetscale-caching):
When a user clicks on one photo from their cloud photo storage, it's likely that the next photo they will view is the photo taken immediately after it chronologically. In these situations, the data storage and caching systems leverage this user behavior. When one photo is loaded, we can predict which ones we think they will want to see next, and prefetch those into the cache as well.
And the generalisation:
This prefetching of related data improves performance when there are predictable data access patterns, which is true of many applications beyond photo albums.
Where this pattern appears¶
| Tier | Prefetch unit | Trigger |
|---|---|---|
| CPU hardware | Cache line (64 B) | Memory access — the HW fetches the surrounding 63 B for free |
| CPU predictive | Next line / stride | Hardware prefetcher detects strided access pattern |
| OS readahead | Next N filesystem pages | read() detected as sequential; OS pulls ahead |
| Page cache prefetch | Adjacent file pages | madvise(MADV_SEQUENTIAL) / posix_fadvise() hints |
| Database range scan | Next B+tree leaf pages | Ordered scan detected; storage engine walks the linked-list of leaves |
| Application prefetch | Next/prev N items | Photo album, video thumbnails, pagination |
| CDN / browser | Speculative next-page fetch | Predictive based on user's navigation pattern |
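At the OS-readahead tier, the hint row above can be exercised directly; a minimal sketch using `os.posix_fadvise` (Linux/POSIX only), where the file and sizes are made up for illustration:

```python
import os
import tempfile

# Create a small file to read back.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 1_000_000)
    path = f.name

fd = os.open(path, os.O_RDONLY)
try:
    # Tell the kernel we will read sequentially, so it can widen
    # readahead; offset 0 with length 0 means "the whole file".
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
    data = os.read(fd, 1_000_000)
finally:
    os.close(fd)
    os.unlink(path)
```

The advice is purely a hint: the read succeeds either way, but on a sequential scan the kernel prefetches ahead of the application's `read()` position.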
Implementation choices¶
- Static prefetch width. Always fetch N neighbours on every access. Cheap, predictable, sometimes wrong. The Dicken photo-album example: "load the next and previous few photos".
- Stride-detecting prefetch. Observe access order; if it's monotonic, widen the prefetch window. Hardware CPU prefetchers do this natively.
- Content-aware prefetch. Use application signals — user's chronological photo order, a client's paginated cursor — to choose what counts as "neighbouring."
- Model-driven prefetch. ML-predicted likely-next items (patterns/asset-preload-prediction). Costly but useful on high-value hot paths.
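The stride-detecting variant can be sketched in a few lines; the two-repetition confirmation threshold and the two-item prefetch window are illustrative choices, not from the source:

```python
class StrideDetector:
    """Detect a constant stride in an access stream and suggest prefetches.

    Mimics, at a toy level, what hardware CPU prefetchers do natively:
    observe access order and widen the prefetch window once it looks
    monotonic with a stable stride.
    """
    def __init__(self):
        self.last = None
        self.stride = None
        self.confirmations = 0

    def observe(self, addr):
        """Record an access; return addresses worth prefetching (may be [])."""
        if self.last is not None:
            delta = addr - self.last
            if delta == self.stride:
                self.confirmations += 1
            else:
                self.stride = delta      # new candidate stride
                self.confirmations = 0
        self.last = addr
        # Only prefetch after the stride has repeated twice in a row.
        if self.stride and self.confirmations >= 2:
            return [addr + self.stride * i for i in (1, 2)]
        return []

det = StrideDetector()
results = [det.observe(a) for a in (0, 8, 16, 24)]
```

Accesses at 0, 8, 16 establish a stride of 8; by the access at 24 the detector is confident enough to suggest prefetching 32 and 40.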
When prefetching is net-negative¶
Prefetching is worth it only when the prefetched hit probability is high enough to amortise the extra fetch and cache-slot cost.
- Uniformly random access — neighbours are no more likely than anything else. Prefetch wastes bandwidth and pollutes cache.
- Narrow working set that already fits — nothing to save; prefetching only adds overhead.
- Under eviction pressure — each prefetched item evicts a genuine hot item. If temporal locality is already strong, that's a net loss.
- Expensive-to-fetch neighbours — high-resolution images, encrypted payloads, cross-region fetches. Even high prefetch-hit probability may not pay back the per-fetch cost.
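The amortisation condition above can be made concrete with a back-of-envelope break-even check; the cost model and parameter names here are assumptions for illustration, not from the source:

```python
def prefetch_is_worthwhile(p_hit, fetch_cost, hit_saving, eviction_penalty=0.0):
    """Break-even sketch for prefetching one neighbour.

    p_hit:            probability the prefetched item is accessed soon
    fetch_cost:       cost of speculatively fetching the neighbour
    hit_saving:       latency saved when a prefetch turns a miss into a hit
    eviction_penalty: expected cost of evicting a genuinely hot item
    """
    expected_benefit = p_hit * hit_saving
    expected_cost = fetch_cost + eviction_penalty
    return expected_benefit > expected_cost

# Photo-album-like workload: the next photo is viewed most of the time.
album = prefetch_is_worthwhile(p_hit=0.8, fetch_cost=1.0, hit_saving=10.0)
# Uniformly random access over ~1000 items: neighbours no likelier than anything.
random_access = prefetch_is_worthwhile(p_hit=0.001, fetch_cost=1.0, hit_saving=10.0)
```

The same inequality explains the expensive-neighbour case: when `fetch_cost` approaches `hit_saving` (cross-region fetch, heavy decode), even a high `p_hit` fails the check.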
Seen in¶
- sources/2025-07-08-planetscale-caching — Ben Dicken's photo-album example + the visual demo where each database cell click loads that cell + its two neighbours into cache. Canonical application-tier instance of this pattern with explicit generalisation beyond photos.
- Hardware CPU prefetcher (implicit across the wiki's performance corpus) — every cache-locality-sensitive post on the wiki benefits from HW prefetch on sequential access; see the Cloudflare trie-hard + Netflix Vector API case studies.
- Linux OS readahead (implicit) — mmap-heavy workloads tune fadvise flags explicitly; sequential scans benefit from kernel-default readahead.
Related¶
- concepts/spatial-locality-prefetching — the access-pattern property this pattern exploits.
- concepts/cache-hit-rate — the metric prefetching targets.
- concepts/cpu-cache-hierarchy — cache lines are the implicit prefetch unit at the hardware tier.
- concepts/linux-page-cache — the OS-level layer where readahead lives.
- concepts/hdd-sequential-io-optimization
- patterns/pair-fast-small-cache-with-slow-large-storage
- patterns/asset-preload-prediction