
PATTERN Cited by 2 sources

Automatic Persisted Queries

Problem

A GraphQL deployment at scale has two costs that grow with query text: bandwidth (each request carries full query text, which can be kilobytes) and unboundedness (any caller can ship a new query shape, so the set of queries the endpoint must serve is open). For mobile clients the bandwidth is the dominant concern; for any operator concerned with SLOs, monitoring, and schema evolution the unboundedness is.

Solution

Register each query that an application might issue ahead of time — typically at UI build time — and replace the runtime request payload with a stable ID that refers to the pre-registered query.

  build time                               runtime
  ──────────                               ───────

  Client source                            UI bundle
      │                                       │
      │  extract & persist queries            │
      ▼                                       ▼
  GraphQL service ────── ID ──────────▶  bundle embeds
  (persist endpoint)                     { id: "<hash>",
      │                                    variables: {…} }
      └─ writes to                             │
         persisted-queries DB                  ▼
                                         GraphQL server
                                         ─ resolve ID → query
                                         ─ execute
                                         ─ return data

The ID is typically a hash of the normalised query text (whitespace, operation selection, and formatting stripped), so queries that are equal after normalisation hash to the same ID.
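As a rough illustration of deriving an ID from normalised text, the sketch below collapses whitespace and hashes the result with SHA-256. The function names normalise and query_id are hypothetical, and the normalisation is deliberately simplistic: a production normaliser would more plausibly work on the parsed query, since collapsing whitespace textually also alters string literals.

```python
import hashlib
import re


def normalise(query: str) -> str:
    """Collapse every whitespace run to one space and trim the ends (illustrative only)."""
    return re.sub(r"\s+", " ", query).strip()


def query_id(query: str) -> str:
    """Return the SHA-256 hex digest of the normalised query text."""
    return hashlib.sha256(normalise(query).encode("utf-8")).hexdigest()


# The same query formatted two ways maps to one ID.
multi_line = "query Product($id: ID!) {\n  product(id: $id) { name }\n}"
single_line = "query Product($id: ID!) { product(id: $id) { name } }"
assert query_id(multi_line) == query_id(single_line)
```

Any change that survives normalisation, such as a renamed variable, yields a different ID under this scheme.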

Two enforcement modes

The mechanism is the same in both; what distinguishes the two named modes is the policy for handling an unknown ID.

Cache mode (Apollo APQ)

The Apollo Server Automatic Persisted Queries default behaviour:

  1. Client sends {"extensions": {"persistedQuery": {"sha256Hash": "..."}}} first — no query text.
  2. Server returns PersistedQueryNotFound if the hash is unknown.
  3. Client retries with the full query plus the hash.
  4. Server caches the query under the hash for next time.

Goal: bandwidth reduction and warming a cache. The endpoint still accepts arbitrary raw queries. The persisted-queries store is a cache.
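The four-step handshake above can be sketched with an in-memory stand-in for the server. CacheModeServer, send, and sha256_hex are hypothetical names, query execution is stubbed out, and the hash check on the retry is an assumption of this sketch rather than something stated above.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class CacheModeServer:
    """Toy server: the persisted-queries store is just a cache keyed by hash."""

    def __init__(self):
        self._cache = {}

    def handle(self, request: dict) -> dict:
        sha = request["extensions"]["persistedQuery"]["sha256Hash"]
        query = request.get("query")
        if query is None:
            # Hash-only request: serve from the cache or report a miss.
            query = self._cache.get(sha)
            if query is None:
                return {"errors": [{"message": "PersistedQueryNotFound"}]}
        elif sha256_hex(query) != sha:
            # Assumed sanity check: never cache a query under a foreign hash.
            return {"errors": [{"message": "hash does not match query"}]}
        else:
            # Retry carrying the full text: remember it for next time.
            self._cache[sha] = query
        return {"data": {"executed": query}}  # execution is stubbed out


def send(server: CacheModeServer, query: str, variables=None) -> dict:
    """Client side: try hash-only first, retry with the full query on a miss."""
    body = {
        "extensions": {"persistedQuery": {"sha256Hash": sha256_hex(query)}},
        "variables": variables or {},
    }
    response = server.handle(body)
    errors = response.get("errors", [])
    if any(e["message"] == "PersistedQueryNotFound" for e in errors):
        response = server.handle({**body, "query": query})
    return response
```

Only the first request for a given query pays for the full text; later requests from any client carry just the hash.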

Gate mode (Zalando UBFF)

Zalando's UBFF persists queries at UI build time, not on first runtime miss. The runtime endpoint refuses unknown IDs outright — there is no fallback to raw-query execution.

Goal: turn the production query set into a closed, inspectable catalogue. The persisted-queries store is an allowlist. See patterns/disable-graphql-in-production for the framing (Source: sources/2022-02-16-zalando-graphql-persisted-queries-and-schema-stability).
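A minimal sketch of the gate-mode contract follows, assuming an in-memory store and a SHA-256 ID. GateModeStore and QueryRejected are hypothetical names; the source does not describe Zalando's implementation, and execution is stubbed to return the resolved query.

```python
import hashlib


class QueryRejected(Exception):
    """Raised when a runtime request is outside the persisted catalogue."""


class GateModeStore:
    def __init__(self):
        self._queries = {}

    def persist(self, query: str) -> str:
        """Build-time step: register the query and hand back its ID."""
        query_id = hashlib.sha256(query.encode("utf-8")).hexdigest()
        self._queries[query_id] = query
        return query_id

    def execute(self, request: dict):
        """Runtime step: resolve {id, variables}; never fall back to raw text."""
        if "query" in request:
            raise QueryRejected("raw query text is not accepted")
        query = self._queries.get(request.get("id"))
        if query is None:
            raise QueryRejected("unknown query ID")
        # A real server would execute here; the sketch returns what it resolved.
        return query, request.get("variables", {})
```

The only difference from a cache-mode server is the miss path: an unknown ID is a hard failure, so the store's contents are the complete set of queries production can run.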

The post is explicit about the contrast — Apollo APQ is named and linked, and Zalando's approach is described as "a different approach." Same mechanism, different contract.

When cache mode is enough

  • The value delivered is bandwidth reduction for mobile clients.
  • The server is happy to compile and serve novel query shapes on demand.
  • Schema stability is enforced by other means (schema review, deprecation tooling, resolver cost limits).

When gate mode pays off

  • The operator wants exact field-level usage knowledge, which requires a closed query set.
  • Safe breaking changes are a high-priority concern and the operator wants to prove, not estimate, that a field is unused before removing it.
  • Directive-based field lifecycle (e.g. @draft as in concepts/draft-schema-field) is wanted; gate mode is what gives @draft teeth.
  • Arbitrary-GraphQL-query DoS (deep recursive queries, expensive resolvers) is a concern.
  • The organisation already owns the full client lifecycle (web + mobile bundles) and can reasonably require a persist step in the build pipeline.

Mechanics — what the server must implement

Either mode:

  1. Normalisation function — stable canonical form for query text, so equivalent queries hash identically.
  2. Hashing function — the post does not name Zalando's choice; Apollo uses SHA-256 by convention.
  3. Persisted-queries store — key-value lookup from ID → query text. No replication, versioning, or GC semantics are specified by the Zalando post.
  4. Persist endpoint — accepts a query, returns an ID.
  5. Execute endpoint — accepts {id, variables}, looks up the query, runs it.

Gate mode additionally:

  1. Reject-unknown-ID policy — at the execute endpoint.
  2. Persist-time validation rules — e.g. the draftRule in concepts/draft-schema-field that refuses to persist queries touching @draft fields.
  3. Component allowlist validator — see concepts/component-scoped-field-access.
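A persist-time validation rule might look roughly like the following. This is a crude stand-in, not the draftRule itself: DRAFT_FIELDS, DraftFieldError, and check_no_draft_fields are hypothetical, and the check scans name tokens instead of validating the parsed query against the schema, so it over-rejects (a variable or alias sharing a draft field's name also trips it).

```python
import re

# Hypothetical set of field names currently marked @draft in the schema.
DRAFT_FIELDS = {"loyaltyPoints"}


class DraftFieldError(ValueError):
    """Raised when a query to be persisted touches a draft field."""


def check_no_draft_fields(query: str, draft_fields=DRAFT_FIELDS) -> None:
    """Refuse to persist a query that mentions any draft field name."""
    names = set(re.findall(r"[_A-Za-z][_0-9A-Za-z]*", query))
    touched = sorted(names & set(draft_fields))
    if touched:
        raise DraftFieldError("query touches draft fields: " + ", ".join(touched))
```

Running such a check inside the persist endpoint means a rejected query never receives an ID, and in gate mode a query without an ID cannot reach production.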

Seen in

Gotchas

  • Normalisation drift. If the persist-side and execute-side normalisation disagree by even one character, the same source query produces different IDs — and the bundle IDs no longer resolve. The post does not describe Zalando's defence here.
  • Fragment and variable handling. A query that inlines a fragment hashes differently from the same query using a named fragment. Variable renaming likewise.
  • Client-side query generation. If the client assembles queries dynamically (e.g., adds fields based on user state), the set of "queries the application might issue" is not decidable at build time. Persisted queries work cleanly only when the query set is statically enumerable.
  • Persist-time auth. In gate mode, the persist endpoint becomes the schema-governance choke point. Who can add entries? The Zalando post is silent here; in a shared-ownership monorepo this is load-bearing.
  • Cache eviction / GC. When a UI bundle that referenced ID X is no longer deployed anywhere, is X garbage-collected? Retained forever? Versioned? Unspecified in the Zalando post.
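
To make the normalisation-drift gotcha concrete, the sketch below assumes the build computes IDs locally with one normaliser while the server keys its store with another. All names are hypothetical, and unresolved_ids is one possible build-time check, not a defence described in the Zalando post.

```python
import hashlib
import re


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def normalise_build(query: str) -> str:
    """Build-side normaliser: collapse whitespace runs to one space."""
    return re.sub(r"\s+", " ", query).strip()


def normalise_server(query: str) -> str:
    """Server-side normaliser that has drifted: strips all whitespace."""
    return re.sub(r"\s+", "", query)


def unresolved_ids(bundle_ids, store):
    """Return the bundle IDs the store cannot resolve, in bundle order."""
    return [query_id for query_id in bundle_ids if query_id not in store]


query = "query { product { name } }"
bundle_id = sha256_hex(normalise_build(query))
store = {sha256_hex(normalise_server(query)): query}

# Same source query, two different IDs: the bundle's ID no longer resolves.
assert unresolved_ids([bundle_id], store) == [bundle_id]
```

Failing the build whenever unresolved_ids returns a non-empty list would surface drift before a bundle ships, at the cost of giving the build pipeline read access to the store.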