CONCEPT Cited by 1 source
Spatial locality (prefetching)¶
Definition¶
Spatial locality is the property that data stored near (physically adjacent, sequentially indexed, temporally contiguous) a just-accessed item is more likely to be accessed next. Unlike temporal locality (the same item is re-accessed), spatial locality predicts nearby items.
Why it matters¶
When spatial locality holds, cache and storage systems can prefetch — load neighbouring items at access time — so subsequent accesses hit instead of miss. This is a one-miss- amortised-over-many-hits pattern.
Ben Dicken's worked example (Source: sources/2025-07-08-planetscale-caching):
Consider a photo album app. When a user clicks on one photo from their cloud photo storage, it's likely that the next photo they will view is the photo taken immediately after it chronologically. In these situations, the data storage and caching systems leverage this user behavior. When one photo is loaded, we can predict which ones we think they will want to see next, and prefetch those into the cache as well. In this case, we would also load the next and previous few photos.
Visualisation from the post: each database cell is numbered in sequence; clicking one loads that cell + its two neighbours into cache.
Tiers at which spatial locality lives¶
Every layer of the caching stack has a spatial-locality analogue:
- CPU cache line — the L1/L2/L3 cache doesn't store
individual bytes; it stores 64-byte (typical) cache
lines. Accessing byte
xpulls bytesx+1…x+63into cache too, cheaply. Sequential memory walks hit; random- access walks miss on every new cache line. See concepts/cpu-cache-hierarchy + concepts/cache-locality. - Disk block / page — disks and SSDs transfer in fixed-size blocks (typically 4 KB filesystem blocks or 16 KB InnoDB pages). One byte of an in-demand page pulls the whole page into page cache. Sequential scans benefit far more than random seeks.
- Database range scans / index traversal — B+tree leaves are linked; range scans walk sequential leaf pages with good spatial locality (one I/O per page amortised over many rows). Random-PK inserts destroy this property — see concepts/uuid-primary-key-antipattern.
- OS readahead — Linux's readahead heuristic detects sequential I/O and prefetches the next N pages before the application asks.
- HDD sequential vs random I/O — sequential reads are orders of magnitude faster than random because there's no seek between adjacent blocks. See concepts/hdd-sequential-io-optimization.
- Photo album / content feed — the Dicken example. Prefetch next-N + previous-N items; first visible on low- bandwidth mobile where swiping forward looks instant.
Cost model¶
Prefetching is worth it only when:
- Hit probability of prefetched items is high (spatial locality actually holds for this workload).
- Marginal cost (cache slot + fetch bandwidth) is low relative to the miss cost it avoids.
- Eviction pressure from prefetched items doesn't harm hit rate on other hot data.
Over-aggressive prefetch on a random-access workload is worse than no prefetch — it costs fetch bandwidth and evicts genuinely hot data for speculative loads that never hit.
Relationship to temporal locality¶
Spatial and temporal locality are independent axes:
| Workload | Temporal locality | Spatial locality | Example |
|---|---|---|---|
| Social feed (head) | High | Low | Twitter timeline — same item re-read, but adjacent items rarely co-read |
| Photo album | Low | High | Next photo often viewed; same photo rarely re-viewed |
| Database range scan | Low | High | Sequential leaves; each row read once |
| OLTP hot-row write | High | Low | Same row written repeatedly |
| Memory-mapped binary | High | High | Hot code executes adjacent bytes, repeatedly |
Different caches must rely on different signals. LRU alone exploits temporal locality; sequential prefetch alone exploits spatial locality; production caches use both.
Seen in¶
- sources/2025-07-08-planetscale-caching — Ben Dicken's photo-album worked example + "load the next and previous few photos" prefetch rule. Canonical application-tier instance of spatial locality as a caching substrate assumption.
Related¶
- concepts/cache-hit-rate
- concepts/temporal-locality-recency-bias
- concepts/cpu-cache-hierarchy — same principle at the CPU cache-line layer.
- concepts/linux-page-cache
- concepts/hdd-sequential-io-optimization
- concepts/b-plus-tree — range scans rely on it.
- patterns/spatial-prefetch-on-access — the pattern that names this access-on-neighbour shape.