

Stateless invalidator

Problem

An invalidation-based cache fed by a CDC stream needs to translate row mutations into invalidation messages for the affected queries. The natural-but-wrong approach is to keep an active-query subscription table (query_id → subscribers) and scan it on every mutation. That table grows linearly with the number of active subscriptions, becomes a memory bottleneck, and forces the invalidator into a stateful-service operational model (replication, failover, rebalancing).

Shape

Instead of tracking live queries, make the invalidator schema-aware only:

  1. The invalidator only knows the schema's query shapes, not the set of currently-subscribed queries.
  2. On a row mutation (from CDC / WAL), it iterates the (finite) set of query shapes, substitutes the pre/post-image column values into each shape's parameters, and emits an invalidation for the resulting (shape_id, arg_values) key.
  3. Invalidations are routed to the cache shard responsible for that hash, which evicts the entry if present or no-ops if absent. Upstream notification is handled by cache→edge fan-out.

Key property: the invalidator carries no per-query state. Every decision is a pure function of (schema, row_mutation). Therefore it's stateless compute.
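The three steps above can be sketched as a pure function. This is a minimal sketch, not LiveGraph's actual API — all type and field names here are illustrative:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// RowMutation is one CDC event: the table plus pre- and post-image
// column values (Pre is nil for inserts, Post is nil for deletes).
type RowMutation struct {
	Table string
	Pre   map[string]string
	Post  map[string]string
}

// QueryShape is one parameterized query template from the schema,
// e.g. "comments WHERE file_id = ?" parameterized by file_id.
type QueryShape struct {
	ID        string
	Table     string
	ParamCols []string
}

// Invalidation names one cached query instance: (shape_id, arg_values).
type Invalidation struct {
	ShapeID string
	Args    []string
}

// Invalidate is a pure function of (shapes, mutation) — no per-query
// state. Both images are substituted so that an update moving a row
// between result sets invalidates both the old and the new instance.
func Invalidate(shapes []QueryShape, m RowMutation) []Invalidation {
	seen := map[string]bool{}
	var out []Invalidation
	for _, s := range shapes {
		if s.Table != m.Table {
			continue
		}
		for _, img := range []map[string]string{m.Pre, m.Post} {
			if img == nil {
				continue
			}
			args := make([]string, 0, len(s.ParamCols))
			complete := true
			for _, c := range s.ParamCols {
				v, ok := img[c]
				if !ok {
					complete = false
					break
				}
				args = append(args, v)
			}
			key := fmt.Sprintf("%s|%v", s.ID, args)
			if complete && !seen[key] {
				seen[key] = true
				out = append(out, Invalidation{ShapeID: s.ID, Args: args})
			}
		}
	}
	return out
}

// Shard routes an invalidation to the cache shard owning its hash;
// that shard evicts the entry if present or no-ops if absent.
func Shard(inv Invalidation, nShards uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(inv.ShapeID))
	for _, a := range inv.Args {
		h.Write([]byte(a))
	}
	return h.Sum32() % nShards
}
```

Note that an update changing `file_id` from `a` to `b` yields two invalidations — one per image — which is exactly why CDC pre-images matter here.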

Consequences

  • Horizontal scaling is trivial. Invalidators are partitioned the same way as the upstream DB (one invalidator shard per DB shard); each only processes its own WAL; no cross-node coordination.
  • No failover complexity. Restart re-tails WAL from the last acknowledged position; no subscription state to reconstruct.
  • Operational model is "Lambda-shaped" even if deployed as a long-running service — durable input = WAL LSN; durable output = emitted invalidations; no state in the middle.
  • Cache topology is agnostic to DB topology. The invalidator is the single service bridging the two; edges + caches don't have to know how many DB shards there are.
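The failover story reduces to a replay loop: durable input is the WAL position, durable output is the emitted invalidations, nothing in between. A minimal sketch, assuming the WAL can be re-tailed from an arbitrary LSN (all names hypothetical):

```go
package main

// Change is one WAL record tagged with its log sequence number (LSN).
type Change struct {
	LSN      uint64
	Mutation string // simplified stand-in for a decoded row mutation
}

// Run is the invalidator's entire loop. A restart just calls Run again
// with the last acknowledged LSN; records at or below it are skipped
// on replay, so there is no subscription state to reconstruct.
func Run(wal []Change, fromLSN uint64, emit func(string), ack func(uint64)) {
	for _, c := range wal {
		if c.LSN <= fromLSN {
			continue // already processed before the restart
		}
		emit(c.Mutation) // deliver invalidations for this change
		ack(c.LSN)       // checkpoint only after output is durable
	}
}
```

Acking only after emitting gives at-least-once delivery of invalidations, which is safe here because eviction is idempotent.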

Pre-conditions

This pattern works when:

  • Query shapes are enumerable — the schema defines a small, finite set of shapes (as in Figma's ~700). Ad-hoc SQL breaks this.
  • Mutation → affected shapes is computable from the schema alone. Most easy shapes (equality predicates) are; hard shapes (ranges, inequalities) require a sidecar such as patterns/nonce-bulk-eviction to stay tractable.
  • Invalidations for non-subscribed queries are cheap — caches can no-op on unknown keys at negligible cost. This holds for typical hash-sharded caches.
  • Schema evolves more slowly than the invalidation rate — Figma describes schema updates on "a day-to-day basis" versus sub-second invalidations. This gap allows pre-distributing shape info to services before the queries that need it arrive.
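The cheap-no-op precondition is what a hash-sharded cache gives almost for free. A minimal sketch (types hypothetical):

```go
package main

import "sync"

// CacheShard holds materialized query results keyed by (shape_id, args).
type CacheShard struct {
	mu      sync.Mutex
	entries map[string]string
}

// Invalidate evicts the entry if present and no-ops if absent — in Go,
// delete on a missing map key does nothing, which is what makes
// invalidating never-subscribed query instances cheap.
func (c *CacheShard) Invalidate(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.entries, key)
}
```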

Figma's LiveGraph 100x as the canonical instance

(Source: sources/2026-04-21-figma-keeping-it-100x-with-real-time-data-at-scale)

  • Service: invalidator, Go, sharded identically to physical Postgres shards.
  • Input: logical replication stream per shard (WAL-based CDC).
  • Processing: for each row change, iterate the ~700 schema shapes; substitute column values; emit invalidation.
  • Output: invalidation messages to relevant cache shards (the ones whose hash(easy-expr) range covers this invalidation).
  • State: none beyond WAL LSN checkpoint.
  • Pairs with: patterns/nonce-bulk-eviction (hard-query handling), patterns/independent-scaling-tiers (edge / cache / invalidator scale independently).

The architectural win called out in the post:

"Stateless invalidators could be aware of both database topology and the cache sharding strategy to deliver invalidations only to relevant caches, removing the excessive fan-in and fan-out of all database updates."

Anti-patterns it replaces

  • Subscription registry in the invalidator — maintain the list of active queries and scan it on every mutation. Fan-in (every server processes every shard's updates) and fan-out (every update delivered everywhere) are both O(shards × servers).
  • Push-to-every-server — the pre-100x LiveGraph architecture: mutations broadcast to every LiveGraph server for local cache mutation. Scales poorly with both fleet size and update rate.

When it doesn't apply

  • Free-form SQL query engines — query shapes aren't enumerable; schema inspection can't predict affected queries.
  • Workloads where mutation → affected queries requires query execution itself (complex aggregates, joins with dynamic join keys). Asana's Worldstore is the post's named counterexample.
  • Schema change rate ≈ invalidation rate — no pre-distribution window; effectively the shape set is unbounded.

Precedents / neighbors

  • GraphQL persisted queries — the protocol-level prerequisite: fix the query set, then invalidation from schema becomes mechanical.
  • Materialized-view incremental maintenance in databases (IVM) — classical version: given a base-table update and a registered view definition, compute affected rows. Same idea, system-internal.
  • Kafka Streams repartitioning-by-key — the DB-topology-aware routing cousin: route updates to the partition that owns the key, don't broadcast.
  • AWS Lambda with an SQS/Kinesis source — Lambda-as-stateless-event-processor is the serverless deployment shape of this pattern, and Lambda's PR/FAQ frames stateless-by-contract as an enabling constraint for exactly this kind of workload.
