
CONCEPT Cited by 1 source

ElasticsearchDataSet (EDS)

Definition

ElasticsearchDataSet (EDS) is the Kubernetes custom resource, defined through a Custom Resource Definition (CRD), that Zalando's es-operator uses to declaratively describe an Elasticsearch cluster. The operator reconciles a StatefulSet of Elasticsearch pods (and their volumes) against the EDS spec; all scale-in, scale-out, and configuration changes happen by mutating the EDS, not by touching pods or the StatefulSet directly.

"Es-operator defines a Kubernetes custom resource, ElasticsearchDataSet (EDS), that describes the Elasticsearch cluster. It monitors changes to it and maintains a StatefulSet that consists of pods and volumes that implement the Elasticsearch nodes."

(Source: sources/2024-06-20-zalando-failing-to-auto-scale-elasticsearch-in-kubernetes)

What belongs in the spec

The canonical fields an EDS exposes (derived from the es-operator README and the post's narrative):

  • Replica count — the number of Elasticsearch nodes. The cronjob-based scaling in the Zalando Lounge post mutates exactly this field.
  • Resource requests / limits — CPU, memory, ephemeral-storage.
  • Pod template — container image (specific Elasticsearch version), volume mounts, env vars.
  • Scaling behaviour — thresholds for the operator's autoscaler, if enabled.
  • Persistent volume claims (PVC) — optional; if absent, ephemeral storage is used (the Zalando Lounge choice).
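To make the field list above concrete, here is a minimal sketch of an EDS-shaped manifest built as a plain Python dict. The API group/version (zalando.org/v1) and the field names (replicas, scaling, template) are assumptions patterned on the es-operator README and should be checked against the installed CRD; the resource values, mount path, and the helper name build_eds_manifest are purely illustrative.

```python
import json


def build_eds_manifest(name: str, replicas: int, image: str) -> dict:
    """Return a minimal EDS-shaped manifest (field names are assumptions)."""
    resources = {"cpu": "1", "memory": "4Gi", "ephemeral-storage": "10Gi"}
    return {
        "apiVersion": "zalando.org/v1",  # assumed group/version
        "kind": "ElasticsearchDataSet",
        "metadata": {"name": name},
        "spec": {
            # Replica count: the field the scaling CronJobs mutate.
            "replicas": replicas,
            # Scaling behaviour: operator autoscaler left disabled here.
            "scaling": {"enabled": False},
            # Pod template: image, resources, volume mounts.
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": "elasticsearch",
                            "image": image,
                            "resources": {
                                "requests": dict(resources),
                                "limits": dict(resources),
                            },
                            "volumeMounts": [
                                {
                                    "name": "data",
                                    "mountPath": "/usr/share/elasticsearch/data",
                                }
                            ],
                        }
                    ],
                    # No volume claim templates: data lives on ephemeral storage.
                    "volumes": [{"name": "data", "emptyDir": {}}],
                }
            },
        },
    }


if __name__ == "__main__":
    # The image tag is a placeholder, not a real version recommendation.
    print(json.dumps(build_eds_manifest("es-data", 3, "elasticsearch:example"), indent=2))
```

Leaving out any persistent volume claim section and mounting an emptyDir mirrors the ephemeral-storage choice described in the last bullet.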

Role in the scaling flow

The Lounge team's scale-in / scale-out flow:

  1. Kubernetes CronJob fires (nightly scale-down, morning scale-up, experimental workload scale-down).
  2. CronJob runs kubectl patch against the EDS to set its replicas field.
  3. es-operator observes the EDS change via a watch.
  4. Operator computes the diff between current StatefulSet state and desired EDS state.
  5. Operator executes the change: on scale-in, drain the highest-ordinal pod and remove it; on scale-out, extend the StatefulSet.
  6. Operator writes status back to the EDS resource.
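Steps 2, 4, and 5 of the flow above can be sketched in a few lines. This is an illustrative model, not es-operator code: replicas_patch builds a JSON merge-patch body of the kind a CronJob could pass to kubectl patch with --type merge, and plan_scaling (a hypothetical helper) derives the ordered pod operations from the current and desired replica counts, assuming StatefulSet ordinals run from 0 to current - 1.

```python
import json
from typing import List, Tuple


def replicas_patch(replicas: int) -> str:
    """Build a JSON merge-patch body that changes only spec.replicas."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return json.dumps({"spec": {"replicas": replicas}})


def plan_scaling(current: int, desired: int) -> List[Tuple[str, int]]:
    """Return ordered (action, pod ordinal) steps to move current -> desired.

    Scale-in drains and removes the highest ordinal first;
    scale-out appends new ordinals at the end.
    """
    if desired < current:
        return [("drain_and_remove", o) for o in range(current - 1, desired - 1, -1)]
    return [("add", o) for o in range(current, desired)]


if __name__ == "__main__":
    print(replicas_patch(3))
    print(plan_scaling(5, 3))  # scale-in: ordinals 4, then 3
    print(plan_scaling(3, 5))  # scale-out: ordinals 3, then 4
```

Because the plan is computed only from the two replica counts, a second EDS update arriving mid-drain changes the desired value and therefore the plan, which is why the operator needs a reliable way to abandon the steps already in progress.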

Having two EDS updates in flight at once (a stuck scale-in followed by an incoming scale-out) is the scenario that exposed es-operator's ctx-cancellation bug: the intended abort-on-spec-change contract failed to hold inside one retry loop in the drain code.
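The abort-on-spec-change contract can be illustrated with a small Python analogue. This is not the operator's drain code (es-operator is written in Go and uses context cancellation); here a threading.Event stands in for the cancelled context, and drain_with_retries, try_drain, and the return strings are hypothetical names. The point of the sketch is that the retry loop itself must check for cancellation, both before each attempt and during the backoff wait, otherwise a stuck drain keeps retrying after the spec has changed.

```python
import threading
from typing import Callable


def drain_with_retries(
    try_drain: Callable[[], bool],
    cancelled: threading.Event,
    max_attempts: int,
    backoff_seconds: float,
) -> str:
    """Retry a drain attempt, aborting as soon as `cancelled` is set."""
    for _ in range(max_attempts):
        # Check before every attempt: a newer spec supersedes this operation.
        if cancelled.is_set():
            return "aborted"
        if try_drain():
            return "drained"
        # Event.wait acts as a cancellable sleep: it returns True
        # immediately if the event is set during the backoff.
        if cancelled.wait(backoff_seconds):
            return "aborted"
    return "gave_up"


if __name__ == "__main__":
    spec_changed = threading.Event()
    attempts = []

    def stuck_drain() -> bool:
        attempts.append(1)
        if len(attempts) == 2:
            spec_changed.set()  # simulate an incoming scale-out
        return False

    print(drain_with_retries(stuck_drain, spec_changed, 10, 0.0))
```

A loop that only sleeps and retries, without either cancellation check, is the failure shape described above: the outer operation is cancelled but the inner loop never notices.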

Relation to StatefulSet

EDS is a higher-level abstraction than StatefulSet. The operator's reconciler maps EDS fields into a StatefulSet underneath; users of EDS don't manage StatefulSets directly. But the StatefulSet semantics (notably highest-ordinal scale-in) still leak through because the operator delegates pod lifecycle to the StatefulSet controller — the EDS can't override the StatefulSet ordinal-based removal order without re-implementing pod lifecycle itself (cf. patterns/custom-operator-over-statefulset, which is the "replace StatefulSets entirely" architectural choice PlanetScale's Vitess Operator made).

Seen in
