PATTERN Cited by 1 source

Cluster-wide aggregation guardrail

The pattern

A cluster-wide aggregation guardrail is a backend-managed cap on the per-request aggregation bucket count, enforced inside the shared search backend — not at the caller. It is the last-line defence against high-cardinality aggregation overload: even if every caller-side guardrail is bypassed, the cluster itself refuses to materialise more than K buckets in a single request.

On Elasticsearch, this is the search.max_buckets cluster setting. From Zalando's 2025-12-16 post-mortem:

"We introduced new runbooks on applying cluster-wide settings like search.max_buckets to limit the size of aggregations on the whole cluster at once." (Source: sources/2025-12-16-zalando-the-day-our-own-queries-dosed-us-inside-zalando-search.)

The Zalando theory-section framing names the guardrail's role directly:

"Elasticsearch enforces soft guardrails like search.max_buckets to prevent a single request from creating an unbounded number of aggregation buckets."
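As a minimal sketch of what such a runbook step might look like, the snippet below builds the request body for Elasticsearch's cluster settings endpoint (PUT /_cluster/settings) and sends it with the Python standard library. The function names, the unauthenticated HTTP call, and the example value 20000 are assumptions made for illustration; they are not taken from Zalando's runbooks.

```python
import json
import urllib.request


def max_buckets_settings_body(max_buckets):
    """Build the JSON body that sets search.max_buckets as a persistent cluster setting."""
    if isinstance(max_buckets, bool) or not isinstance(max_buckets, int) or max_buckets < 0:
        raise ValueError("max_buckets must be a non-negative integer")
    return {"persistent": {"search.max_buckets": max_buckets}}


def apply_max_buckets(base_url, max_buckets):
    """Send the update to the cluster at base_url (hypothetical, unauthenticated endpoint)."""
    body = json.dumps(max_buckets_settings_body(max_buckets)).encode("utf-8")
    request = urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Illustrative value only; a real cap should come from observed usage.
print(json.dumps(max_buckets_settings_body(20000)))
```

Because the setting is dynamic, the change takes effect without a restart, which is what makes it usable as an incident-time lever.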

Why it's the last line of defence, not the first

The pattern lives at the bottom of a layered defence stack:

┌───────────────────────────────────────────────────────┐
│ Per-caller dashboards + alerts       (visibility)    │
└───────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────┐
│ Client-side query cost limits        (first gate)    │
│   - dynamic threshold per caller                     │
│   - patterns/application-side-query-limit-*          │
└───────────────────────────────────────────────────────┘
              (query slipped past client gate)
┌───────────────────────────────────────────────────────┐
│ Cluster-side aggregation cap          (last gate)    │
│   - search.max_buckets                               │
│   - max_result_window                                │
│   - THIS pattern                                     │
└───────────────────────────────────────────────────────┘

Each layer has different cost / blast-radius characteristics:

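Layer   | Cost of rejection                                                | Blast radius of a miss
--------+------------------------------------------------------------------+-------------------------------------------
Client  | Low: caller gets clean error, no backend work                    | Small: one caller's bad query
Cluster | Medium: coordinator has already accepted, built partial state    | Large: query can saturate before rejection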

A cluster-wide guardrail is a blunt instrument (it applies the same cap to every caller, legitimate or not) but has the property that no caller can bypass it, which is exactly why it needs to exist.

Two complementary Elasticsearch knobs

The Zalando theory section names two cluster-side guardrails in the same breath:

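Setting            | Scope                        | Caps                                  | Incident-shape defended
-------------------+------------------------------+---------------------------------------+----------------------------------------------
search.max_buckets | Cluster-wide dynamic setting | Aggregation bucket count per request  | High-cardinality terms aggregations
max_result_window  | Index-level setting          | from + size (result-set pagination)   | "Scroll the universe" deep-pagination attacks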

Both are soft guardrails — they produce a clean rejection error to the caller rather than silent truncation, so callers know to rewrite the query.
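To act on a clean rejection, a caller has to recognise it. The sketch below assumes the error payload is a nested JSON object in which the failing cause carries "type": "too_many_buckets_exception" (the error type Elasticsearch uses for this limit). The helper name and the sample payload are illustrative and hand-written, not a captured response.

```python
def is_too_many_buckets(error_body):
    """Return True if any nested object in the payload has type too_many_buckets_exception."""
    if isinstance(error_body, dict):
        if error_body.get("type") == "too_many_buckets_exception":
            return True
        return any(is_too_many_buckets(value) for value in error_body.values())
    if isinstance(error_body, list):
        return any(is_too_many_buckets(item) for item in error_body)
    return False


# Hand-written example payload (assumed shape, trimmed to the fields the helper reads).
sample_error = {
    "error": {
        "type": "search_phase_execution_exception",
        "caused_by": {
            "type": "too_many_buckets_exception",
            "reason": "illustrative reason text",
        },
    },
}

if is_too_many_buckets(sample_error):
    print("aggregation rejected by the cluster-wide cap; rewrite the query")
```

Scanning the whole payload rather than one fixed path keeps the check tolerant of where the cause is nested.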

Tuning considerations

  • Set the initial cap well above legitimate usage. The purpose is to reject pathological queries, not legitimate large aggregations. Start by recording a bucket_count metric for every aggregation response for a month, then set search.max_buckets above the 99.9th percentile of the observed legitimate values.
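  • Allow temporary exceptions via dynamic cluster setting changes. The setting is dynamic in Elasticsearch, so operators can raise or lower it per incident or per business event without a restart. Because the cap is cluster-wide, any such change applies to every caller at once; it cannot be relaxed for a single caller.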
  • Pair with the terms aggregation size parameter — the size on a terms aggregation caps the returned top-N, but does NOT cap the computed intermediate bucket count. A naive terms aggregation on a unique-ID field with size: 10 still computes all buckets before picking the top 10 — search.max_buckets is what catches that pre-size count.
  • Alert on rejected aggregations. A request rejected for exceeding max_buckets is a signal — either a bug in the caller, or a legitimate need the threshold should accommodate. Either way a human should see it.
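The first tuning step above can be sketched as a small calculation: take the recorded per-request bucket counts, find the 99.9th percentile, and add headroom. The nearest-rank percentile method, the function name, and the 2x headroom factor are assumptions chosen for illustration; the source only says to set the cap above the 99.9th percentile.

```python
import math
from fractions import Fraction


def suggest_max_buckets(observed_counts, percentile=99.9, headroom=2):
    """Suggest a cap: nearest-rank percentile of observed bucket counts times a headroom factor."""
    if not observed_counts:
        raise ValueError("need at least one observed bucket count")
    ordered = sorted(observed_counts)
    # Exact rational arithmetic avoids floating-point drift in the rank computation.
    fraction = Fraction(str(percentile)) / 100
    rank = max(1, math.ceil(fraction * len(ordered)))
    baseline = ordered[rank - 1]
    return math.ceil(baseline * headroom)


# Synthetic observations: 1000 requests with bucket counts 1..1000.
observations = list(range(1, 1001))
print(suggest_max_buckets(observations))
```

The result is a starting point for review, not a value to apply blindly; rejected-request alerts should drive later adjustments.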

Seen in

  • sources/2025-12-16-zalando-the-day-our-own-queries-dosed-us-inside-zalando-search — canonical wiki instance. Follow-up action after the self-inflicted-DoS incident. Zalando's Search & Browse team introduced cluster-wide runbooks for applying search.max_buckets as a bounded-blast-radius response lever: during a saturation incident, the operator tightens the cluster-wide cap to reject pathological aggregations that bypass the app-side limiter, trading a small legitimate-query rejection rate for cluster survival.