CONCEPT Cited by 1 source

Cache-coherency witness¶

Definition¶

A cache-coherency witness is a separate component that observes writes and gates reads so that a distributed cache fronting a persistent store can deliver strong read-after-write consistency without serializing every request through a single point. It tracks enough per-object state to decide, at read time, whether a given cache partition's view of an object may be stale and, if so, to have that entry invalidated and refilled from the persistence layer.

The name comes from the classical witness pattern in distributed systems (used in quorum protocols, see e.g. Paris 1986) — a node that participates in ordering decisions without itself holding a full replica.

The problem it solves¶

When a service layers a cache in front of a persistence tier for resilience and throughput, writes and reads can flow through different cache partitions (the write takes one path, the read races through another). Even if both paths eventually converge, for a brief window the read path can return a pre-write view. This is the textbook source of eventual consistency.
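The race is easy to reproduce in miniature. The sketch below is a toy model, not S3's design: `Store` and `CachePartition` are hypothetical names, and the cache is assumed to be write-through per partition with no cross-partition invalidation.

```python
class Store:
    """Persistence tier: the source of truth (hypothetical toy model)."""

    def __init__(self):
        self.data = {}


class CachePartition:
    """Write-through cache partition that never hears about writes
    made through other partitions."""

    def __init__(self, store):
        self.store = store
        self.entries = {}

    def put(self, key, value):
        # Write-through: update persistence and this partition only.
        self.store.data[key] = value
        self.entries[key] = value

    def get(self, key):
        # Fill on miss; a cached entry is trusted unconditionally.
        if key not in self.entries:
            self.entries[key] = self.store.data[key]
        return self.entries[key]


store = Store()
a, b = CachePartition(store), CachePartition(store)

a.put("k", "v1")
first = b.get("k")   # partition b caches "v1"
a.put("k", "v2")     # overwrite goes through partition a only
stale = b.get("k")   # b still serves its cached "v1"
```

Here `stale` is the pre-write value even though the persistence tier already holds `"v2"`. In a real system the entry would eventually expire or be refreshed, which is what makes the behaviour eventually consistent rather than simply wrong.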

For many workloads this is fine. For S3's December 2020 move to strong read-after-write consistency for all object reads (covering overwrites and deletes as well as new objects), it wasn't: the guarantee had to be absolute, and it had to land with no impact on performance, availability, or cost.

The S3 mechanism (Kozlovski's reconstruction)¶

Per the 2024-03-06 High Scalability explainer (Source: sources/2024-03-06-highscalability-behind-aws-s3s-massive-scale):

S3 has a discrete subsystem for storing per-object metadata on the critical path of most requests. Pre-2020, that tier used a "highly resilient caching technology" such that even if the cache was impaired, requests would succeed — but writes and reads could traverse different partitions, which was "the main source of S3's eventual consistency."
S3 introduced new replication logic into the persistence tier that lets the system reason about the per-object order of operations.
A new component was introduced that:
acts as a witness to S3 writes: it records that an object was written, and with what ordering;
acts as a read barrier: when it sees that a cache's view of an object may be stale, it invalidates the cache entry and forces a read from the persistence layer.

Net: reads that hit a potentially stale cache entry incur a persistence-tier read; reads that can be proven up-to-date (the witness hasn't seen a newer write) can safely return from cache. Writes pay an additional witness interaction; reads pay only when the witness says the cache view is suspect.
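The write-witness plus read-barrier flow described above can be sketched as follows. This is a minimal single-process illustration under stated assumptions, not AWS's implementation: the class names, the per-object sequence number assigned by the persistence tier, and the `persistence_reads` counter are all hypothetical, and concurrency and witness fan-out are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Persistence:
    """Persistence tier that orders writes with a sequence number (assumption)."""
    objects: dict = field(default_factory=dict)  # key -> (seq, value)
    next_seq: int = 0

    def write(self, key, value):
        self.next_seq += 1
        self.objects[key] = (self.next_seq, value)
        return self.next_seq

    def read(self, key):
        return self.objects[key]


class Witness:
    """Tracks only ordering metadata per object, never the object contents."""

    def __init__(self):
        self.latest = {}  # key -> highest sequence number witnessed

    def record_write(self, key, seq):
        self.latest[key] = max(seq, self.latest.get(key, 0))

    def is_current(self, key, cached_seq):
        return cached_seq >= self.latest.get(key, 0)


class CachePartition:
    def __init__(self, persistence, witness):
        self.persistence = persistence
        self.witness = witness
        self.entries = {}            # key -> (seq, value)
        self.persistence_reads = 0   # counts how often the slow path runs

    def put(self, key, value):
        seq = self.persistence.write(key, value)
        self.witness.record_write(key, seq)  # the extra cost paid by writes
        self.entries[key] = (seq, value)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is not None and self.witness.is_current(key, entry[0]):
            return entry[1]          # fast path: entry proven current
        # Read barrier: drop the suspect (or missing) entry and refill.
        self.entries.pop(key, None)
        self.persistence_reads += 1
        entry = self.persistence.read(key)
        self.entries[key] = entry
        return entry[1]


persistence, witness = Persistence(), Witness()
a = CachePartition(persistence, witness)
b = CachePartition(persistence, witness)

a.put("k", "v1")
first = b.get("k")   # miss: filled from persistence
again = b.get("k")   # hit: witness confirms the entry is current
reads_before = b.persistence_reads
a.put("k", "v2")     # overwrite through the other partition
fresh = b.get("k")   # witness flags b's entry; refilled from persistence
```

In this sketch partition `b` goes to persistence only twice: once for the initial fill and once after the overwrite. The repeated read in between is served from cache, which is the asymmetry the pattern relies on.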

Why this is elegant¶

The witness is smaller than the data — it only tracks per-object ordering metadata, not the object contents.
Steady-state reads stay fast — the common case is "cache is current," so the read path doesn't change.
Writes pay the coherency cost, reads only pay when stale — which matches the asymmetry of S3's workload (many more reads than writes for most objects).
Failure mode is conservative: if the witness can't decide, the read falls through to persistence, so correctness is preserved and the request still succeeds at the cost of a slower read. Availability doesn't suffer.
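The conservative failure mode can be illustrated with a small read helper. AWS has not disclosed how the real component fails, so everything here is an assumption for illustration: `gated_read`, `WitnessUnavailable`, and the `latest_seq` callable are hypothetical names, and the witness is modelled as a function that may raise.

```python
class WitnessUnavailable(Exception):
    """Raised by the (hypothetical) witness client when it cannot answer."""


def gated_read(key, cache, latest_seq, read_persistence):
    """Return the value for key, serving from cache only when proven current.

    cache:            dict mapping key -> (seq, value)
    latest_seq:       callable(key) -> int; may raise WitnessUnavailable
    read_persistence: callable(key) -> (seq, value)
    """
    entry = cache.get(key)
    if entry is not None:
        try:
            if entry[0] >= latest_seq(key):
                return entry[1]      # witness proves the entry is current
        except WitnessUnavailable:
            pass                     # cannot prove freshness: treat as suspect
    # Conservative path: refill from the source of truth.
    entry = read_persistence(key)
    cache[key] = entry
    return entry[1]


def witness_down(key):
    raise WitnessUnavailable()


store = {"k": (2, "v2")}
cache = {"k": (1, "v1")}             # stale entry
result = gated_read("k", cache, witness_down, store.__getitem__)

slow_path_calls = []


def counting_read(key):
    slow_path_calls.append(key)
    return store[key]


warm_cache = {"k": (2, "v2")}
healthy = gated_read("k", warm_cache, lambda key: 2, counting_read)
```

With the witness unreachable, the stale `"v1"` is never returned: the helper pays a persistence read and gets `"v2"`. With a healthy witness and a current entry, the persistence tier is not touched at all.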

Contrast with alternatives¶

Approach	Consistency	Perf hit on reads	Perf hit on writes
No cache, read from persistence	Strong	High (always)	Baseline
Write-through cache, per-partition	Eventual (cross-partition)	Low	Medium
Single coherency broker	Strong	Medium (always gated)	High
Witness pattern (S3 2020)	Strong	Low (gated only on stale)	Low (witness write)
Client-side read-your-writes	Strong-within-client	Low	Low

Seen in¶

sources/2024-03-06-highscalability-behind-aws-s3s-massive-scale — Kozlovski's reconstruction of the 2020 S3 strong-consistency cutover: per-object replication logic in the persistence tier plus a new component that acts as a write-witness and read-barrier. Third-party-explainer-level detail; the component's name, fanout, and failure modes are not publicly disclosed by AWS. See concepts/strong-consistency for the user-visible contract.