CONCEPT Cited by 1 source

Query shape

Definition

A query shape is the template form of a query: the query text with its argument values replaced by placeholders such as $1, assigned a stable unique ID. A live query is then identified by (shape_id, argument_values). The schema defines a finite set of query shapes; actual running queries are instances of shapes with substituted arguments.

Example (from Figma's LiveGraph):

Shape file_comments:

SELECT * FROM comments WHERE file_id = $1 AND deleted_at IS NULL

Live query: file_comments("live-graph-love") — shape file_comments + argument $1 = "live-graph-love".
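The (shape_id, argument_values) identity can be illustrated with a minimal Python sketch. The class names `QueryShape` and `LiveQuery`, the `instantiate` helper, and the cache-key format are hypothetical and chosen for illustration; they are not taken from LiveGraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryShape:
    shape_id: str  # stable unique ID for the shape
    sql: str       # query text with $1, $2, ... placeholders


@dataclass(frozen=True)
class LiveQuery:
    shape_id: str
    args: tuple    # argument values bound to $1, $2, ...

    def cache_key(self) -> str:
        # Hypothetical key format: shape ID plus the repr of each argument.
        return f"{self.shape_id}({', '.join(repr(a) for a in self.args)})"


FILE_COMMENTS = QueryShape(
    shape_id="file_comments",
    sql="SELECT * FROM comments WHERE file_id = $1 AND deleted_at IS NULL",
)


def instantiate(shape: QueryShape, *args) -> LiveQuery:
    """Bind argument values to a shape, producing a live query identity."""
    return LiveQuery(shape.shape_id, tuple(args))


q1 = instantiate(FILE_COMMENTS, "live-graph-love")
q2 = instantiate(FILE_COMMENTS, "live-graph-love")
```

Because both dataclasses are frozen, two live queries with the same shape and arguments compare equal and hash identically, so they can share one cache entry.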

Why it matters for invalidation-based caching

In an invalidation-based cache, on a row mutation you need to answer: which queries should I invalidate?

The naïve solution is a subscription registry: maintain a map of (shape + args) → set of subscribers and scan it on every change. This doesn't scale — the registry is memory-heavy, and fan-out discovery is expensive under churn.
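For contrast, the registry approach can be sketched as follows. This is an illustrative toy, not any system's implementation: `registry`, `MATCHERS`, `subscribe`, and `affected_subscribers` are hypothetical names, and the matcher covers only the file_comments shape.

```python
from collections import defaultdict

# (shape_id, args) -> set of subscriber IDs. Grows with every active query.
registry = defaultdict(set)

# Hypothetical per-shape predicate: does this mutated row match these args?
MATCHERS = {
    "file_comments": lambda row, args: row["file_id"] == args[0],
}


def subscribe(shape_id, args, subscriber):
    registry[(shape_id, args)].add(subscriber)


def affected_subscribers(row):
    """Scan the whole registry for queries that the mutated row matches."""
    hit = set()
    for (shape_id, args), subscribers in registry.items():
        if MATCHERS[shape_id](row, args):
            hit |= subscribers
    return hit
```

Every mutation pays for a scan proportional to the number of active queries, and the registry itself must be kept in sync as subscribers come and go.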

Query-shape trick: if every query is an instance of a schema-defined shape, then on a row mutation you can:

Walk the (small, fixed) set of schema shapes.
For each shape, substitute the mutation's row values into the shape's parameters.
Emit invalidations for those parameterized queries — regardless of whether anyone's subscribed.

The invalidator never needs to know which queries are active. Every (shape, arg) tuple derived from a mutation either has no subscribers (invalidation is a no-op at the cache) or has subscribers (invalidation evicts the cache entry). The invalidator stays stateless.
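The three steps above can be sketched in Python under some stated assumptions. The `EasyShape` record, the `user_comments` and `file_by_id` shapes, and the column names are hypothetical. Considering both the old and the new row image on an update is also an assumption of this sketch, as is ignoring constant predicates such as `deleted_at IS NULL` (which can only cause harmless over-invalidation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EasyShape:
    shape_id: str
    table: str
    param_columns: tuple  # columns bound by equality to $1, $2, ...


# The small, fixed set of schema shapes (hypothetical examples).
SHAPES = (
    EasyShape("file_comments", "comments", ("file_id",)),
    EasyShape("user_comments", "comments", ("author_id",)),
    EasyShape("file_by_id", "files", ("id",)),
)


def invalidations_for(table, old_row, new_row):
    """Return the (shape_id, args) keys affected by one row mutation.

    Pass old_row=None for an insert and new_row=None for a delete.
    No subscription state is consulted: the result depends only on the
    schema shapes and the mutated row.
    """
    keys = set()
    for shape in SHAPES:
        if shape.table != table:
            continue
        for row in (old_row, new_row):
            if row is not None:
                args = tuple(row[col] for col in shape.param_columns)
                keys.add((shape.shape_id, args))
    return keys
```

The function never looks at which queries are active, which is what allows the invalidator to stay stateless; keys with no subscribers are simply dropped at the cache.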

Numeric scale of this design at Figma (Source: sources/2026-04-21-figma-keeping-it-100x-with-real-time-data-at-scale): ~700 query shapes total in the schema.

Easy vs hard shapes

Not every query shape admits one-arg-per-mutation invalidation. Figma's post partitions schema shapes into:

Easy shapes: equality predicates (=, IN (const), table membership). Substituting the mutation's row values yields a finite set of affected parameterizations (typically exactly one). Canonical: SELECT * FROM comments WHERE file_id = $1.
Hard shapes: range or open-ended predicates (>, <, BETWEEN, LIKE 'foo%', date ranges). A single mutation can affect an unbounded number of parameterizations (for example, every query whose bound lies before the new value). Canonical:
```
SELECT * FROM comments WHERE file_id = $1 AND created_at > $2
```
A new comment at time T invalidates every query with $2 < T — unbounded.

At Figma, ~11 of ~700 shapes are hard (≈1.6%). Small enough to handle specially, too fundamental to drop.
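The easy/hard distinction can be made concrete with a small sketch. The predicate representation as (column, operator, placeholder) tuples and the function names are hypothetical; the operator set contains only the range operators listed above.

```python
# Operators listed above as hard; anything else is treated as easy here.
HARD_OPS = {">", "<", "BETWEEN", "LIKE"}


def classify(predicates):
    """Classify a shape from its (column, operator, placeholder) predicates."""
    return "hard" if any(op in HARD_OPS for _, op, _ in predicates) else "easy"


def affected_bounds(created_at, candidate_bounds):
    """Values of $2 for which `created_at > $2` matches the new row.

    The answer depends on which bounds exist, so it cannot be derived
    from the mutated row alone.
    """
    return [bound for bound in candidate_bounds if bound < created_at]


easy_shape = [("file_id", "=", "$1")]
hard_shape = [("file_id", "=", "$1"), ("created_at", ">", "$2")]
```

For the easy shape, one row yields one argument tuple. For the hard shape, `affected_bounds` grows with the set of candidate bounds, which is why substitution alone cannot enumerate the affected queries.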

Normalization rule

Figma enforces a schema discipline:

All queries must normalize to (easy-expr) AND (hard-expr).

Queries without a hard part just ignore the second conjunct. This lets Figma:

Invalidate via easy expressions only. Hard expressions are never directly invalidated.
Shard caches by hash(easy-expr) — all hard queries with the same easy-expr colocate on one cache instance.
Evict all hard queries sharing an easy-expr in one op via nonce indirection.
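One common way to realize nonce indirection is sketched here, as an assumption rather than a description of Figma's implementation. Each easy-expr key owns a nonce, every cached entry (easy or hard) is stored under that nonce, and invalidation discards the nonce so that all entries beneath it become unreachable at once. `ShardCache`, `shard_for`, and the key layout are hypothetical.

```python
import itertools
import zlib


class ShardCache:
    def __init__(self):
        self._nonces = {}    # easy_key -> current nonce
        self._entries = {}   # (easy_key, nonce, hard_key) -> cached result
        self._counter = itertools.count(1)  # nonces are never reused

    def _nonce(self, easy_key):
        if easy_key not in self._nonces:
            self._nonces[easy_key] = next(self._counter)
        return self._nonces[easy_key]

    def get(self, easy_key, hard_key=None):
        return self._entries.get((easy_key, self._nonce(easy_key), hard_key))

    def put(self, easy_key, hard_key, value):
        self._entries[(easy_key, self._nonce(easy_key), hard_key)] = value

    def invalidate(self, easy_key):
        # One operation: dropping the nonce orphans every entry stored
        # under easy_key, whatever its hard part was.
        self._nonces.pop(easy_key, None)


def shard_for(easy_key, num_shards):
    """Pick a shard from the easy-expr only, so hard variants colocate."""
    return zlib.crc32(repr(easy_key).encode()) % num_shards
```

In this sketch orphaned entries stay in memory until something removes them; a production cache would presumably rely on TTL or LRU eviction to reclaim them.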

When query-shape-based invalidation works

Schema evolves slowly relative to query rate: Figma's schema changes "on a day-to-day basis with code updates" while invalidations happen sub-second, so shape information can be precomputed and distributed to services before users' queries arrive.
Small, enumerable set of shapes — ~700, not 700M. A DSL or GraphQL-like front-end typically produces this naturally. Ad-hoc SQL analytics does not.
Mutation → affected-shape computability is tractable — equality predicates are the common case; a schema-inspection tool can validate this before you rely on it.

When it doesn't

Ad-hoc SQL — the query space is the entire SQL grammar; shapes aren't enumerable.
Compute-heavy queries where the predicate structure is arbitrary — can't reduce mutation → affected-shape statically (e.g. aggregates, window functions with complex partitions). Asana's Worldstore is the post's named counterexample — "designed quite differently".
Dynamically generated queries — query structure emerges at runtime per-request; pre-enumerating shapes misses coverage.

Precedents / neighbors

Prepared statement hashing — SQL prepared statements canonicalize SQL text into a statement ID + bound parameters. Same idea, scoped to the DB planner's cache.
GraphQL persisted queries — only pre-registered query documents are accepted. Gives you a finite enumerable shape set at the protocol level.
Request-signature cache keys in CDNs: normalize the URL, query, and headers to a canonical key so two semantically-equivalent requests share a cache entry.
Kafka Streams' per-topology stores — precompute dependencies between topics/operators to know where a change propagates.

Seen in

sources/2026-04-21-figma-keeping-it-100x-with-real-time-data-at-scale — canonical worked example. ~700 shapes in LiveGraph's schema, 11 hard; shape IDs let mutations mechanically produce invalidations without a live-query table.

concepts/invalidation-based-cache — the cache model that query-shape invalidation serves.
patterns/stateless-invalidator — the deployment shape enabled by schema-local (shape-based) invalidation computability.
patterns/nonce-bulk-eviction — how hard-shape invalidation is made tractable without fan-out to infinity.
systems/livegraph — production realization at Figma.