CONCEPT Cited by 1 source
Query shape¶
Definition¶
A query shape is the un-parameterized form of a query,
assigned a stable unique ID. A live query is then identified by
(shape_id, argument_values). The schema defines a finite set of
query shapes; actual running queries are instances of shapes with
substituted arguments.
Example (from Figma's LiveGraph):
- Shape: file_comments
- Live query: file_comments("live-graph-love") — shape file_comments + argument $1 = "live-graph-love".
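The (shape_id, argument_values) identity can be sketched in a few lines of Python. This is only an illustration, not LiveGraph's actual data model: the class names, the `instantiate` helper, and the pairing of the `file_comments` ID with this particular SQL text are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class QueryShape:
    """An un-parameterized query with a stable ID (hypothetical model)."""
    shape_id: str
    sql: str  # placeholders such as $1 are left unbound


@dataclass(frozen=True)
class LiveQuery:
    """A running query: a shape ID plus the argument values bound to it."""
    shape_id: str
    args: Tuple[object, ...]


# Assumed SQL text for the shape; the source only names the shape ID.
FILE_COMMENTS = QueryShape(
    shape_id="file_comments",
    sql="SELECT * FROM comments WHERE file_id = $1",
)


def instantiate(shape: QueryShape, *args: object) -> LiveQuery:
    """Bind argument values to a shape to obtain a live-query identity."""
    return LiveQuery(shape.shape_id, tuple(args))


q1 = instantiate(FILE_COMMENTS, "live-graph-love")
q2 = instantiate(FILE_COMMENTS, "live-graph-love")
```

Because both classes are frozen dataclasses, `q1` and `q2` compare equal and hash identically, so a live query can serve directly as a cache key.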
Why it matters for invalidation-based caching¶
In an invalidation-based cache, on a row mutation you need to answer: which queries should I invalidate?
The naïve solution is a subscription registry: maintain a map of
(shape + args) → set of subscribers and scan it on every change.
This doesn't scale — the registry is memory-heavy, and fan-out
discovery is expensive under churn.
Query-shape trick: if every query is an instance of a schema-defined shape, then on a row mutation you can:
- Walk the (small, fixed) set of schema shapes.
- For each shape, substitute the mutation's row values into the shape's parameters.
- Emit invalidations for those parameterized queries — regardless of whether anyone's subscribed.
The invalidator never needs to know which queries are active. Every (shape, arg) tuple derived from a mutation either has no subscribers (invalidation is a no-op at the cache) or has subscribers (invalidation evicts the cache entry). The invalidator stays stateless.
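The three steps above can be sketched as a stateless function over a fixed shape list. This is a minimal sketch under stated assumptions, not Figma's implementation: `EasyShape`, `SCHEMA_SHAPES`, the column names, and the dict-backed cache are all hypothetical, and only equality-bound parameters are modeled.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Tuple

CacheKey = Tuple[str, Tuple[object, ...]]  # (shape_id, argument_values)


@dataclass(frozen=True)
class EasyShape:
    shape_id: str
    table: str
    param_columns: Tuple[str, ...]  # column bound to $1, $2, ... by equality


# Hypothetical schema; a real one would be generated from the query DSL.
SCHEMA_SHAPES = (
    EasyShape("file_comments", "comments", ("file_id",)),
    EasyShape("comments_by_author", "comments", ("author_id",)),
    EasyShape("file_by_id", "files", ("id",)),
)


def invalidations_for(table: str, row: Dict[str, object]) -> Iterator[CacheKey]:
    """Walk every schema shape and substitute the mutated row's values.

    No subscription state is consulted: keys are emitted whether or not
    anyone is currently subscribed to them.
    """
    for shape in SCHEMA_SHAPES:
        if shape.table != table:
            continue
        yield shape.shape_id, tuple(row[col] for col in shape.param_columns)


def invalidate(cache: Dict[CacheKey, object], key: CacheKey) -> bool:
    """Evict the entry if present; a miss is a harmless no-op."""
    if key in cache:
        del cache[key]
        return True
    return False
```

The sketch handles a single row image. An update that changes a parameter column would presumably need invalidations derived from both the old and the new row values.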
Numeric scale of this design at Figma (Source: sources/2026-04-21-figma-keeping-it-100x-with-real-time-data-at-scale): ~700 query shapes total in the schema.
Easy vs hard shapes¶
Not every query shape admits one-arg-per-mutation invalidation. Figma's post partitions schema shapes into:
- Easy shapes — equality predicates (=, IN (const), table membership). Substitute mutation row values → finite affected parameterizations (typically exactly one). Canonical: SELECT * FROM comments WHERE file_id = $1.
- Hard shapes — range or open-ended predicates (>, <, BETWEEN, LIKE 'foo%', date ranges). A mutation affects potentially infinite parameterizations (all queries with a bound before the new value). Canonical: a new comment at time T invalidates every query with $2 < T — unbounded.
At Figma, ~11 of ~700 shapes are hard (≈1.6%). Small enough to handle specially, too fundamental to drop.
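One way to picture the partition is a classifier over the operators in a shape's predicates. The sketch below is an assumption-laden illustration: the `Predicate` type, the operator sets (including `>=` and `<=`, added by analogy with the listed range operators), and the column names are hypothetical rather than taken from Figma's schema tooling.

```python
from dataclasses import dataclass
from typing import Iterable

EASY_OPS = {"=", "IN"}
# ">=" and "<=" are assumed by analogy; the source lists >, <, BETWEEN, LIKE.
HARD_OPS = {">", "<", ">=", "<=", "BETWEEN", "LIKE"}


@dataclass(frozen=True)
class Predicate:
    column: str
    op: str
    param: str  # placeholder name such as "$1"


def classify(predicates: Iterable[Predicate]) -> str:
    """Return "hard" if any predicate is range/open-ended, else "easy"."""
    ops = {p.op for p in predicates}
    unknown = ops - EASY_OPS - HARD_OPS
    if unknown:
        raise ValueError(f"unclassified operators: {sorted(unknown)}")
    return "hard" if ops & HARD_OPS else "easy"


def hard_query_affected(bound: int, new_value: int) -> bool:
    """For a predicate `created_at > $2`, a new row with created_at = T
    matches every live query whose bound $2 is below T."""
    return bound < new_value


easy = [Predicate("file_id", "=", "$1")]
hard = [Predicate("file_id", "=", "$1"), Predicate("created_at", ">", "$2")]
```

Here `classify(easy)` yields `"easy"` and `classify(hard)` yields `"hard"`. `hard_query_affected` shows why substitution alone cannot enumerate the affected queries: every bound below T qualifies, so there is no single parameterization to emit.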
Normalization rule¶
Figma enforces a schema discipline:
All queries must normalize to (easy-expr) AND (hard-expr).
Queries without a hard part just ignore the second conjunct. This lets Figma:
- Invalidate via easy expressions only. Hard expressions are never directly invalidated.
- Shard caches by hash(easy-expr) — all hard queries with the same easy-expr colocate on one cache instance.
- Evict all hard queries sharing an easy-expr in one op via nonce indirection.
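A rough sketch of how nonce indirection and easy-expr sharding could fit together follows. The source does not describe the mechanism in detail, so everything here is an assumption for illustration: the `ShardCache` class, the key layout, and the use of SHA-256 for shard selection are hypothetical.

```python
import hashlib
import itertools
from typing import Dict, Hashable, Optional, Tuple


def shard_for(easy_key: Tuple, num_shards: int) -> int:
    """Pick a shard from the easy expression only, so every hard query
    sharing that easy-expr lands on the same cache instance."""
    digest = hashlib.sha256(repr(easy_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardCache:
    """Entries are keyed by (easy_key, nonce, hard_key). Rotating the
    nonce for an easy_key makes all of its entries unreachable at once."""

    def __init__(self) -> None:
        self._nonces: Dict[Hashable, int] = {}
        self._entries: Dict[Tuple, object] = {}
        self._counter = itertools.count(1)

    def _nonce(self, easy_key: Hashable) -> int:
        if easy_key not in self._nonces:
            self._nonces[easy_key] = next(self._counter)
        return self._nonces[easy_key]

    def put(self, easy_key: Hashable, hard_key: Optional[Hashable], value: object) -> None:
        self._entries[(easy_key, self._nonce(easy_key), hard_key)] = value

    def get(self, easy_key: Hashable, hard_key: Optional[Hashable]) -> Optional[object]:
        return self._entries.get((easy_key, self._nonce(easy_key), hard_key))

    def invalidate_easy(self, easy_key: Hashable) -> None:
        # Single operation: old entries stay in memory but can no longer
        # be looked up; a real cache would reclaim them via TTL or LRU.
        self._nonces[easy_key] = next(self._counter)
```

A query with no hard part can use `hard_key=None`, so it is evicted by the same nonce rotation. The invalidator only ever names the easy key and never has to enumerate hard parameterizations.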
When query-shape-based invalidation works¶
- Schema evolves slowly relative to query rate — Figma's schema changes "on a day-to-day basis with code updates" while invalidations happen sub-second, so shape info can be precomputed and distributed to services before users' queries arrive.
- Small, enumerable set of shapes — ~700, not 700M. A DSL or GraphQL-like front-end typically produces this naturally. Ad-hoc SQL analytics does not.
- Mutation → affected-shape computability is tractable — equality predicates are the common case; a schema-inspection tool can validate this before you rely on it.
When it doesn't¶
- Ad-hoc SQL — the query space is the entire SQL grammar; shapes aren't enumerable.
- Compute-heavy queries where the predicate structure is arbitrary — can't reduce mutation → affected-shape statically (e.g. aggregates, window functions with complex partitions). Asana's Worldstore is the post's named counterexample — "designed quite differently".
- Dynamically generated queries — query structure emerges at runtime per-request; pre-enumerating shapes misses coverage.
Precedents / neighbors¶
- Prepared statement hashing — SQL prepared statements canonicalize SQL text into a statement ID + bound parameters. Same idea, scoped to the DB planner's cache.
- GraphQL persisted queries — only pre-registered query documents are accepted. Gives you a finite enumerable shape set at the protocol level.
- Request-signature cache keys in CDNs — normalize URL + query + headers to a canonical key so two semantically-equivalent requests share a cache entry.
- Kafka Streams' per-topology stores — precompute dependencies between topics/operators to know where a change propagates.
Seen in¶
- sources/2026-04-21-figma-keeping-it-100x-with-real-time-data-at-scale — canonical worked example. ~700 shapes in LiveGraph's schema, 11 hard; shape IDs let mutations mechanically produce invalidations without a live-query table.
Related¶
- concepts/invalidation-based-cache — the cache model that query-shape invalidation serves.
- patterns/stateless-invalidator — the deployment shape enabled by schema-local (shape-based) invalidation computability.
- patterns/nonce-bulk-eviction — how hard-shape invalidation is made tractable without fan-out to infinity.
- systems/livegraph — production realization at Figma.