
SYSTEM Cited by 4 sources

GraphQL

Definition

GraphQL is a query language and runtime for APIs, developed by Facebook (now Meta) and open-sourced in 2015. Clients declaratively specify the shape of the response they want; the server responds with exactly that shape. A single GraphQL endpoint can replace many per-resource REST endpoints, and a single query can traverse multiple domains.

Design principles (from the spec)

The June 2018 GraphQL specification lists several design principles; two of them shape how GraphQL gets deployed in production:

Hierarchical: GraphQL specification recommends the language to be structured in hierarchy to be well suited for Hierarchical Views in modern frontend applications.

Product-centric: The evolution of a GraphQL schema is directly influenced by the product/business features being developed by frontend engineers.

Both are quoted verbatim in Zalando's 2021 post on why the UBFF architecture works at their scale (Source: sources/2021-03-03-zalando-how-we-use-graphql-at-europes-largest-fashion-e-commerce-company).

Query shape

Canonical example from Zalando's post — fetching a product name:

query {
  product(id: "...") {
    name
  }
}

The client asks for the name field, and only that field comes back. To get another field that the schema already exposes, the client adds it to the query; the server doesn't change.
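The selection behaviour can be illustrated with a minimal sketch. This is not how a GraphQL executor is implemented; it only mimics the "return exactly what was asked for" contract. The record contents and the select_fields helper are hypothetical.

```python
# Hypothetical in-memory record; every field except "name" is invented
# for illustration and does not come from Zalando's post.
PRODUCT = {"id": "p-1", "name": "Sneaker", "brand": "Acme", "price": 59.99}


def select_fields(record: dict, selection: list[str]) -> dict:
    """Return only the fields the client asked for, in the order requested."""
    unknown = [field for field in selection if field not in record]
    if unknown:
        # A real server would reject the query during validation.
        raise KeyError(f"unknown fields: {unknown}")
    return {field: record[field] for field in selection}


# Equivalent in spirit to: query { product(id: "...") { name } }
result = select_fields(PRODUCT, ["name"])
print(result)  # {'name': 'Sneaker'}
```

Asking for ["name", "brand"] instead returns both fields without any change to select_fields, which is the point of client-driven selection.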

Why organisations adopt it

  • Client-driven field selection eliminates over-fetching. Mobile clients get exactly what they need, reducing egress and render time.
  • Single evolution timeline. Adding a feature means adding fields + resolvers; no version-bumping of client binaries.
  • Schema-level deprecation tracking. Field-level usage telemetry lets you deprecate dead fields with confidence.
  • Federation (Apollo Federation, Netflix DGS, etc.) lets multiple services own their slice of one schema.
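As a rough illustration of the deprecation-tracking point, the sketch below counts which schema fields appear in observed queries and reports the ones never requested. The field names, the shape of the telemetry (one list of "Type.field" strings per query), and the unused_fields helper are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical schema coordinates; "Product.legacySku" is invented.
SCHEMA_FIELDS = {"Product.name", "Product.brand", "Product.legacySku"}


def unused_fields(schema_fields: set[str], observed_queries: list[list[str]]) -> list[str]:
    """Return schema fields that no observed query selected, sorted by name."""
    usage = Counter()
    for fields in observed_queries:
        # Count each field once per query, even if selected several times.
        usage.update(set(fields))
    return sorted(field for field in schema_fields if usage[field] == 0)


# Assumed telemetry: the fields selected by each query seen in a time window.
observed = [["Product.name"], ["Product.name", "Product.brand"]]
candidates = unused_fields(SCHEMA_FIELDS, observed)
```

Fields in candidates are only candidates for deprecation; how long a field must go unused before removal is a policy decision the sketch does not address.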

Architectural patterns built on GraphQL

  • GraphQL as unified API platform — one gateway in front of many backends; Twitter (1.5B fields/sec) and Netflix (1B daily requests) as canonical disclosures.
  • Unified GraphQL BFF — one service, one schema, replacing per-surface BFFs. Zalando's UBFF is the canonical instance.
  • Federated subgraph per domain — multiple team-owned subgraphs composed at a router. Yelp CHAOS + Apollo Federation is the canonical wiki instance.

Trade-offs at scale

  • Resolvers fan out expensively. A naive GraphQL query can trigger many backend calls; protect with DataLoader-style batching + depth/complexity limits.
  • Caching is harder than REST. HTTP caching doesn't map cleanly because queries are usually POSTed to a single URL and differ only in the request body, so a URL-keyed cache cannot tell them apart. Use a GraphQL-aware cache or a persisted-query layer.
  • Reference executor can be a bottleneck. Zalando built graphql-jit — a JIT-compiled executor — because the reference graphql-js interpreter was insufficient at their scale.
  • Single-service gateways are single points of failure. Federation distributes ownership at the cost of runtime composition complexity.
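The batching idea behind the first trade-off can be sketched with asyncio. This is not the DataLoader library itself, only a simplified imitation of its core trick: collect every key requested in the same event-loop tick, then make one backend call. BatchLoader, fetch_names, and the product ids are hypothetical, and the sketch omits the per-key caching and deduplication a production loader would have.

```python
import asyncio


class BatchLoader:
    """Coalesce load(key) calls made in one event-loop tick into one batch call."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn  # async: list of keys -> list of values, same order
        self._pending = []         # (key, future) pairs awaiting dispatch
        self._scheduled = False
        self._task = None          # keep a reference so the task is not collected

    def load(self, key):
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._pending.append((key, future))
        if not self._scheduled:
            # Defer dispatch so that other load() calls in this tick can join.
            self._scheduled = True
            loop.call_soon(self._start_dispatch)
        return future

    def _start_dispatch(self):
        self._task = asyncio.get_running_loop().create_task(self._dispatch())

    async def _dispatch(self):
        pending, self._pending = self._pending, []
        self._scheduled = False
        try:
            values = await self._batch_fn([key for key, _ in pending])
        except Exception as exc:
            for _, future in pending:
                if not future.done():
                    future.set_exception(exc)
            return
        for (_, future), value in zip(pending, values):
            if not future.done():
                future.set_result(value)


backend_calls = []  # records each batch so the fan-in is observable


async def fetch_names(ids):
    """Hypothetical backend call that resolves many product ids at once."""
    backend_calls.append(list(ids))
    return [f"product-{i}" for i in ids]


async def main():
    loader = BatchLoader(fetch_names)
    # Three resolver-style lookups issued in the same tick.
    return await asyncio.gather(*(loader.load(i) for i in [1, 2, 3]))


names = asyncio.run(main())
```

Three lookups produce a single entry in backend_calls instead of three. Depth and complexity limits are a separate guard applied to the query before execution and are not shown here.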

Error modeling

GraphQL's response envelope is fixed by the specification rather than by the schema; only the data member is schema-typed:

{
  "data":   { /* schema-typed */ },
  "errors": [{ "message": "...", "path": [...], "extensions": {...} }]
}

This creates a schema discoverability gap — the schema doesn't describe the shape of errors. Two mechanisms close the gap at different layers:

  • error.extensions: an open-ended metadata channel on each entry of response.errors. The widely adopted convention is extensions.code (e.g. NOT_FOUND, NOT_AUTHORIZED) so clients switch on a stable code rather than parsing message.
  • Schema-level Problem types: named after RFC 7807 to avoid colliding with GraphQL's reserved error, typically appearing in a union ...Result = Success | Problem returned from a mutation (see patterns/result-union-type-for-mutation-outcome).
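From the client side, the two mechanisms might be consumed as in the sketch below. The interpret helper, the field names, the sample payloads, and the rule that Problem type names end in "Problem" are assumptions for illustration; only extensions.code and the data/errors envelope come from the text above.

```python
def interpret(response: dict, field: str) -> tuple[str, str]:
    """Classify a response for one top-level field.

    Returns (kind, detail), where kind is "error", "problem", or "success".
    """
    errors = response.get("errors") or []
    if errors:
        # Switch on the stable code, never on the human-readable message.
        code = (errors[0].get("extensions") or {}).get("code", "UNKNOWN")
        return ("error", code)

    payload = (response.get("data") or {}).get(field) or {}
    typename = payload.get("__typename", "")
    # Assumed naming convention: union members for failures end in "Problem".
    if typename.endswith("Problem"):
        return ("problem", typename)
    return ("success", typename)


# Hypothetical responses.
not_found = {
    "data": None,
    "errors": [{
        "message": "product not found",
        "path": ["product"],
        "extensions": {"code": "NOT_FOUND"},
    }],
}
invalid_address = {
    "data": {"updateAddress": {"__typename": "InvalidAddressProblem"}},
}

error_outcome = interpret(not_found, "product")
problem_outcome = interpret(invalid_address, "updateAddress")
```

The client has to request __typename in its selection for the union branch to be distinguishable this way.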

Zalando's 2021 error-modeling post codifies when to use each: classify by who can act on the failure (the action-taker classification). Customer-actionable → schema Problem type; Developer- or operator-actionable → response.errors with a code. The two axes Zalando names for making the split are error propagation semantics (schema Problems don't propagate; Errors do) and bug-vs-domain-state encapsulation (bugs must stay out of the schema).
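On the server side, the action-taker rule could be applied roughly as follows. This is a sketch under stated assumptions, not Zalando's implementation: the ActionTaker enum, the Failure dataclass, the to_response helper, the generic "Problem" type name, and the "title" member (borrowed from RFC 7807's vocabulary) are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ActionTaker(Enum):
    CUSTOMER = "customer"
    DEVELOPER = "developer"
    OPERATOR = "operator"


@dataclass
class Failure:
    code: str
    message: str
    action_taker: ActionTaker


def to_response(field: str, failure: Failure) -> dict:
    """Route a failure to the schema or to response.errors by who can act on it."""
    if failure.action_taker is ActionTaker.CUSTOMER:
        # Domain state the customer can fix: return it as schema-typed data.
        return {"data": {field: {"__typename": "Problem", "title": failure.message}}}
    # Bugs and operational failures stay out of the schema.
    return {
        "data": {field: None},
        "errors": [{
            "message": failure.message,
            "path": [field],
            "extensions": {"code": failure.code},
        }],
    }


customer_case = to_response(
    "updateAddress",
    Failure("INVALID_ADDRESS", "Postcode does not match city", ActionTaker.CUSTOMER),
)
operator_case = to_response(
    "updateAddress",
    Failure("UPSTREAM_TIMEOUT", "address service timed out", ActionTaker.OPERATOR),
)
```

In the customer case no errors entry is emitted, so nothing propagates; in the operator case the field is nulled and the failure travels in response.errors with a code.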

See concepts/problem-vs-error-distinction for the naming discipline.

Notable implementations on the wiki

Seen in
