PATTERN Cited by 1 source
Disable GraphQL in production¶
Problem¶
A GraphQL endpoint that accepts arbitrary, well-formed queries is by design open to any query shape a client cares to send. That openness is the ergonomic payoff of GraphQL at development time — clients can evolve queries without coordinating with the server. At production altitude the same openness becomes a liability:
- The set of queries running against the graph is not knowable. Field-level usage, per-query p99, and "can we safely break this field?" are all approximate at best.
- Arbitrary-query DoS (deep recursion, expensive resolvers) is an inherent attack surface.
- Schema evolution lacks a proof obligation — "nothing breaks" has to be inferred from sampled telemetry.
Zalando's post names this framing and takes the counter-intuitive move: "to disable GraphQL in production." The production endpoint does not execute raw GraphQL. It executes only pre-registered queries referenced by ID (Source: sources/2022-02-16-zalando-graphql-persisted-queries-and-schema-stability).
Solution¶
Split the GraphQL surface into two regimes:
- Build time. Developers write full GraphQL against a dev-mode endpoint. Codegen, batching, introspection, IDE support, schema exploration — all work normally.
- Production time. The runtime endpoint accepts only `{"id": "<hash>", "variables": {…}}`. The `query` key is not accepted. Unknown IDs are rejected.
The bridge is a build-time persist step: when UI code is merged to main, the build pipeline extracts every query from the source, sends each to a persist endpoint, and receives a stable ID per query. The UI bundle ships with the IDs in place of the query text.
The result is that the production query set is the persisted-queries DB — a closed, finite, versioned catalogue rather than an open accept-anything surface.
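The persist step can be sketched roughly as follows. This is a minimal illustration and not Zalando's implementation: the in-memory `PERSISTED` store, the `persist` and `build_manifest` names, and the choice of a truncated SHA-256 as the ID are all assumptions, and query extraction from UI source is taken as already done.

```python
import hashlib

# Hypothetical in-memory stand-in for the persisted-queries DB.
PERSISTED: dict[str, str] = {}


def persist(query_text: str) -> str:
    """Register a query and return its ID.

    Assumption: the ID is a truncated SHA-256 of the query text, so the
    same text always maps to the same ID.
    """
    query_id = hashlib.sha256(query_text.encode("utf-8")).hexdigest()[:16]
    PERSISTED[query_id] = query_text
    return query_id


def build_manifest(source_queries: dict[str, str]) -> dict[str, str]:
    """Map each operation name to the ID that the UI bundle would ship."""
    return {name: persist(text) for name, text in source_queries.items()}


# Queries as they might look after extraction from UI source at build time.
manifest = build_manifest({
    "ProductCard": "query ProductCard($id: ID!) { product(id: $id) { name } }",
})
```

After this step the bundle only needs `manifest`; the query text lives server-side in the store.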
DEV ENDPOINT                          PROD ENDPOINT
{                                     {
  query: "query { product {… } }"       id: "a1b2c3",
  variables: {…}                        variables: {…}
}                                     }
        │                                     │
        │ accepts                             │ rejects unknown IDs
        │ arbitrary GraphQL                   │ executes only
        │ (for UI build & dev)                │ pre-registered queries
        ▼                                     ▼
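The production side of the diagram can be approximated by a small gate in front of the executor. The sketch below assumes a dict-backed store and a hypothetical `RejectedRequest` error; it only resolves the request to stored query text and leaves execution to whatever GraphQL engine sits behind it.

```python
from typing import Any


class RejectedRequest(Exception):
    """Raised when a request is outside the closed query set (hypothetical name)."""


def resolve_persisted(
    body: dict[str, Any], persisted: dict[str, str]
) -> tuple[str, dict[str, Any]]:
    """Return (query_text, variables) for a known ID, or raise RejectedRequest."""
    # Raw query text is refused outright, even alongside a valid ID.
    if "query" in body:
        raise RejectedRequest("raw query text is not accepted")

    query_id = body.get("id")
    if not isinstance(query_id, str) or query_id not in persisted:
        raise RejectedRequest("unknown query id")

    variables = body.get("variables") or {}
    return persisted[query_id], variables
```

Rejecting any body that carries a `query` key, rather than ignoring the key, keeps accidental dev-mode clients from silently working against production.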
Why this is not just bandwidth reduction¶
Automatic Persisted Queries has a well-known cache-mode variant (Apollo's default) where unknown hashes trigger a retry with full query text. That variant is a bandwidth optimisation — the endpoint is still open to arbitrary queries. The Zalando post is explicit that it "took a different approach": the endpoint refuses unknown hashes entirely (Source: sources/2022-02-16-zalando-graphql-persisted-queries-and-schema-stability).
The difference is contractual, not mechanical. Same persist step, same hashing, but the unknown-hash policy flips a bandwidth optimisation into a schema-stability regime.
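To make the policy flip concrete, the sketch below models both variants as one decision function that differs only in what happens on an unknown hash. The enum, function name, and return strings are illustrative assumptions; real cache-mode servers also verify that the supplied text matches the hash, which is omitted here.

```python
from enum import Enum
from typing import Optional


class UnknownHashPolicy(Enum):
    RETRY_WITH_QUERY = "cache-mode"  # bandwidth optimisation
    REJECT = "gate-mode"             # closed query set


def decide(
    query_hash: str,
    query_text: Optional[str],
    store: dict[str, str],
    policy: UnknownHashPolicy,
) -> str:
    """Return the action the endpoint takes for one request."""
    if query_hash in store:
        return "execute"
    if policy is UnknownHashPolicy.REJECT:
        # Gate mode: the store is only ever written by the build-time persist step.
        return "reject"
    if query_text is None:
        # Cache mode: tell the client to resend with the full query text.
        return "retry-with-query"
    # Cache mode: any client can grow the store at runtime.
    store[query_hash] = query_text
    return "execute"
```

In cache mode the store is writable from the request path, so it never closes; in gate mode the same lookup becomes an allowlist.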
What the closed query set buys¶
Three capabilities Zalando calls out, each decidable by static analysis of the persisted-queries DB:
- Per-query monitoring. Each query has a stable identity; SLOs are per-ID.
- Safe breaking changes. A schema change is safe precisely when no persisted query references the field being changed. This is a proof, not an estimate — see concepts/graphql-schema-usage-observability.
- Directive-based field lifecycle. `@draft` blocks persisting queries that reference unstable fields, and `@allowedFor` restricts persisting to named components. These directives have teeth because the endpoint cannot bypass them with a raw query. See patterns/directive-based-field-lifecycle.
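A rough sketch of the static checks this enables, under loud assumptions: the persisted-queries DB is modelled as a dict of ID to query text, and field references are detected by a naive name-token scan instead of a schema-aware GraphQL parser. The scan ignores parent types and also matches variable, argument, and directive names, so it over-reports usage. That errs on the safe side: a name that appears in no persisted query is certainly unreferenced. All function names are hypothetical.

```python
import re

# GraphQL names are letters, digits, and underscores, not starting with a digit.
NAME = re.compile(r"[_A-Za-z][_0-9A-Za-z]*")


def referenced_names(query_text: str) -> set[str]:
    """Every name token in the query (a superset of the fields it selects)."""
    return set(NAME.findall(query_text))


def queries_referencing(field: str, persisted: dict[str, str]) -> list[str]:
    """IDs of persisted queries that may reference the field."""
    return sorted(
        query_id
        for query_id, text in persisted.items()
        if field in referenced_names(text)
    )


def is_safe_to_break(field: str, persisted: dict[str, str]) -> bool:
    """True only when no persisted query mentions the field name at all."""
    return not queries_referencing(field, persisted)


def may_persist(query_text: str, draft_fields: set[str]) -> bool:
    """Persist-time gate: refuse queries that touch fields marked as draft."""
    return referenced_names(query_text).isdisjoint(draft_fields)
```

For example, with one persisted query selecting `price`, `queries_referencing("price", db)` names that query's ID, while a field that appears nowhere in the DB passes `is_safe_to_break`.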
When this pattern fits¶
- UI bundles are under your control. Web and mobile apps you ship can be forced through the persist step at build time.
- Query shapes are statically enumerable. Clients do not assemble queries from dynamic runtime state in ways that are not predictable ahead of time.
- Schema stability / breaking-change safety is a high-priority goal. Otherwise the added persist-step complexity buys less than cache-mode APQ.
When this pattern does not fit¶
- Third-party integrations. You need a public-facing GraphQL endpoint that external developers call with novel queries — Shopify Storefront, GitHub GraphQL API, etc. There, the API is the value proposition.
- Highly dynamic query shapes. Dashboards that compose queries based on user-selected filters at runtime.
- No UI build pipeline. Web components or micro-frontends that are authored outside your build infrastructure and can't be pushed through the persist step.
In these cases, Apollo-style cache mode plus field-level usage telemetry and schema-review discipline is the practical ceiling.
Operational considerations¶
- Persist endpoint is the governance choke point. Who can add entries? Unspecified in the Zalando post; non-trivial question in a shared-ownership monorepo with 150+ contributors.
- Normalisation must agree between persist and execute. A one-character disagreement means production bundles cease to resolve.
- Versioning / GC of the DB. When a bundle that referenced ID `X` is no longer deployed anywhere, is `X` kept, deprecated, or deleted?
- Emergency overrides. If a fix requires a new query and the persist pipeline is down, what is the escape hatch? Re-enabling raw-query execution temporarily? Unspecified.
- Dev / staging / prod parity. The persist step must run identically in each environment, or staging traffic will test different queries than prod runs.
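The normalisation point can be illustrated with a small sketch. It assumes IDs are content hashes computed from normalised query text by more than one component, which the Zalando post does not spell out. The whitespace-collapsing normaliser is deliberately naive (it would also alter whitespace inside string literals), and all names are hypothetical.

```python
import hashlib
from typing import Callable


def collapse_whitespace(text: str) -> str:
    """Naive normaliser: squeeze every whitespace run to a single space."""
    return " ".join(text.split())


def identity(text: str) -> str:
    """A component that forgets to normalise at all."""
    return text


def content_id(text: str, normalise: Callable[[str], str]) -> str:
    """Assumed ID scheme: truncated SHA-256 of the normalised text."""
    return hashlib.sha256(normalise(text).encode("utf-8")).hexdigest()[:16]


as_written = "query {\n  product { name }\n}"
as_minified = "query { product { name } }"

# Both sides normalise the same way: the IDs match.
same = content_id(as_written, collapse_whitespace) == content_id(
    as_minified, collapse_whitespace
)

# One side skips normalisation: the IDs diverge and lookups would miss.
diverged = content_id(as_written, identity) != content_id(as_minified, identity)
```

Sharing a single normaliser implementation between every component that computes IDs is one way to rule this failure out.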
Seen in¶
- Zalando UBFF — the canonical instance (sources/2022-02-16-zalando-graphql-persisted-queries-and-schema-stability, systems/zalando-graphql-ubff).
Related¶
- concepts/graphql-persisted-queries — the mechanism.
- concepts/graphql-schema-usage-observability — the capability this pattern unlocks.
- concepts/draft-schema-field — the `@draft` directive that piggybacks on this pattern.
- concepts/component-scoped-field-access — the `@component` / `@allowedFor` pair that also rides on this pattern.
- patterns/automatic-persisted-queries — the parent pattern class; this page is the gate-mode instance.
- patterns/directive-based-field-lifecycle — the directive-on-top lifecycle this pattern enables.
- systems/graphql — the ergonomic payoff preserved at build time.
- systems/zalando-graphql-ubff — where it's deployed.