PATTERN Cited by 1 source
Write-aside cache for edge links¶
Write-aside cache for edge links is a graph-store caching pattern: cache the existence and last-write timestamp of edge links for a short TTL, so the system can skip a durable- store write of an edge link that already exists. The cache is the source-of-truth for "do we already have this link?" within its TTL window; it is not a replacement for durable storage.
The pattern matters because in a graph layout that uses separated edge link and property namespaces with forward and reverse indexes, a single edge mutation can drive 3+ KV writes to durable storage. Suppressing redundant writes when the link already exists materially reduces write amplification.
Three structural pieces¶
Verbatim from Netflix Graph Abstraction Part I:
"This mechanism is balanced with configurable TTL windows, cache invalidation on deletes, and lease acquisitions with exponential backoff. These strategies provide the necessary consistency guarantees while still allowing the last-write timestamp to be refreshed according to the predefined staleness."
| Piece | Function |
|---|---|
| Configurable TTL | Bounds how long the "this link exists" assertion can be trusted before re-checking durable store. |
| Invalidate on delete | Explicit eviction when a link is removed; readers don't see a still-cached link that's already deleted in storage. |
| Lease + exponential backoff | When multiple writers race to update the same link, only one acquires the lease and writes durably; others back off, observe the refreshed cache, and skip their write. |
Mechanism¶
write_edge_link(S, T):
cache_entry = cache.get(S, T)
if cache_entry is fresh (within TTL):
if cache_entry.last_write_time >= staleness_floor:
# link already known; skip durable write entirely
cache.refresh(S, T, now)
return OK
# cache miss or stale; acquire lease
lease = cache.acquire_lease(S, T)
if lease is held by someone else:
backoff_exponentially()
return write_edge_link(S, T) # retry; loser sees winner's refresh
# we hold the lease; write durably
durable_store.put(S, T, idempotency_token(now))
cache.put(S, T, last_write_time=now)
cache.release_lease(lease)
return OK
delete_edge_link(S, T):
durable_store.delete(S, T, idempotency_token(now))
cache.invalidate(S, T)
# entropy-repair pipeline will retry if delete partially fails
Why edge links specifically¶
The pattern works because edge link records are tiny — just the link identity + a last-write timestamp. They're a perfect fit for a small, high-throughput cache:
- The cache value is essentially a Boolean ("exists?") + a timestamp.
- Cache hit rate is high in graph workloads where the same edges are written repeatedly (heartbeat-style writes refreshing liveness).
- Memory cost per cached entry is minimal.
Edge properties by contrast are read-aside-cached, not write-aside-cached, because property bags are larger and have different access shapes (patterns/read-aside-cache-with-dual-invalidation).
Trade-off — staleness vs. write reduction¶
The TTL window expresses the system's tolerance for stale last-write timestamps. A long TTL maximises write reduction (many redundant writes elided) but means the cache may report the link existed when it has just been deleted; a short TTL narrows the staleness window at the cost of more durable writes.
Netflix configures this per namespace so each graph dataset can pick its own point on the curve.
Lease + backoff to avoid thundering herd¶
When a popular edge is being written by many concurrent paths (common in graph workloads — many traversals lead to the same high-fanout edge), the lease mechanism ensures only one writer goes to durable store. The others observe a refreshed cache entry on retry and skip their write. The exponential backoff prevents lock-step retry storms.
Risks¶
- Cache outage = falling back to all writes hitting durable store. The system must still be sized for the unsupressed rate. Cache is an optimisation, not a load-bearing dependency.
- Stale cache after concurrent delete. A read-aside reader
could see a freshly-deleted link as still existing inside the
TTL window. Mitigation:
invalidate on deletemakes the stale window short; LWW at the storage layer guarantees the read eventually converges. - Lease lock-out under coordinator failure. If the lease store fails, writers can't acquire leases and may either fall through to direct writes (loss of suppression) or back off indefinitely. Lease store needs its own availability story.
When to use¶
- Graph workloads with high write amplification across multi- namespace edges.
- Graphs where the same edges are written repeatedly (liveness, heartbeat, frequent updates).
- Substrates with affordable in-process or low-latency external caches.
When not to use¶
- Edge writes are rare and low-volume.
- Storage layer already deduplicates writes cheaply.
- Cache layer is unreliable and would push the system below the durable-store rate target.
Seen in¶
- sources/2026-05-29-netflix-high-throughput-graph-abstraction-at-netflix-part-i — canonical wiki disclosure; the write-aside cache for edge links is one of two caching strategies disclosed for Netflix Graph Abstraction (the other is read-aside EVCache for properties).