CONCEPT Cited by 1 source

ElasticsearchDataSet (EDS)¶

Definition¶

ElasticsearchDataSet (EDS) is the Kubernetes custom resource (defined by a CRD) that Zalando's es-operator uses to declaratively describe an Elasticsearch cluster. The operator reconciles a StatefulSet of Elasticsearch pods (and their volumes) against the EDS spec; all scale-in, scale-out, and configuration changes are made by mutating the EDS, not by touching pods or the StatefulSet directly.

"Es-operator defines a Kubernetes custom resource, ElasticsearchDataSet (EDS), that describes the Elasticsearch cluster. It monitors changes to it and maintains a StatefulSet that consists of pods and volumes that implement the Elasticsearch nodes."

(Source: sources/2024-06-20-zalando-failing-to-auto-scale-elasticsearch-in-kubernetes)

What belongs in the spec¶

The canonical fields an EDS exposes (derived from the es-operator README and the post's narrative):

Replica count — the number of Elasticsearch nodes. The cronjob-based scaling in the Zalando Lounge post mutates exactly this field.
Resource requests / limits — CPU, memory, ephemeral-storage.
Pod template — container image (specific Elasticsearch version), volume mounts, env vars.
Scaling behaviour — thresholds for the operator's autoscaler, if enabled.
Persistent volume claims (PVC) — optional; if absent, ephemeral storage is used (the Zalando Lounge choice).
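
To make the field list concrete, here is a minimal sketch of an EDS manifest built as a Python dict, plus the patch body a scaling job might send. The field names (`replicas`, `scaling`, `template`, `volumeClaimTemplates`) and the `zalando.org/v1` API group follow a reading of the es-operator README and should be checked against the CRD actually deployed; the resource name, image string, and resource sizes are hypothetical placeholders.

```python
import json


def build_eds(name: str, replicas: int, image: str) -> dict:
    """Build an illustrative EDS manifest (field names are assumptions)."""
    return {
        "apiVersion": "zalando.org/v1",
        "kind": "ElasticsearchDataSet",
        "metadata": {"name": name},
        "spec": {
            # Replica count: the field the cron-based scaling mutates.
            "replicas": replicas,
            # Scaling behaviour: the operator's autoscaler, disabled here.
            "scaling": {"enabled": False},
            # Pod template: image, resource requests and limits.
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": "elasticsearch",
                            "image": image,
                            "resources": {
                                "requests": {"cpu": "1", "memory": "4Gi"},
                                "limits": {"memory": "4Gi"},
                            },
                        }
                    ]
                }
            },
            # No volumeClaimTemplates key: pods use ephemeral storage.
        },
    }


def replicas_patch(replicas: int) -> str:
    """JSON merge patch body that changes only spec.replicas."""
    return json.dumps({"spec": {"replicas": replicas}})


eds = build_eds("es-data", 3, "registry.example/elasticsearch:TAG")
print(eds["spec"]["replicas"])
print(replicas_patch(2))
```

A body like the one from `replicas_patch` could be passed to `kubectl patch` with `--type merge`, which leaves every other field of the spec untouched.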

Role in the scaling flow¶

The Lounge team's scale-in / scale-out flow:

Kubernetes CronJob fires (nightly scale-down, morning scale-up, experimental workload scale-down).
The CronJob runs kubectl patch against the EDS to change its replicas field.
systems/es-operator observes the EDS change via a watch.
Operator computes the diff between current StatefulSet state and desired EDS state.
Operator executes the change: on scale-in, drain the highest-ordinal pod and remove it; on scale-out, extend the StatefulSet.
Operator writes status back to the EDS resource.
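
The diff-and-execute steps of the flow above can be sketched as a pure planning function. This is an illustration of the described behaviour, not es-operator's code (which is written in Go); the `Action` type, the `plan_reconcile` name, and the `<name>-<ordinal>` pod naming are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Action:
    kind: str  # "drain", "remove", or "add"
    pod: str


def plan_reconcile(name: str, current: int, desired: int) -> List[Action]:
    """Plan the steps needed to move from `current` to `desired` replicas."""
    if current < 0 or desired < 0:
        raise ValueError("replica counts must be non-negative")
    actions: List[Action] = []
    if desired < current:
        # Scale-in: drain, then remove, starting at the highest ordinal.
        for ordinal in range(current - 1, desired - 1, -1):
            pod = f"{name}-{ordinal}"
            actions.append(Action("drain", pod))
            actions.append(Action("remove", pod))
    else:
        # Scale-out: extend the set with the next ordinals.
        for ordinal in range(current, desired):
            actions.append(Action("add", f"{name}-{ordinal}"))
    return actions


for action in plan_reconcile("es-data", 3, 1):
    print(action.kind, action.pod)
```

Keeping the plan separate from its execution makes it easy to recompute when the EDS spec changes again before the previous plan has finished.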

The scenario that exposed es-operator's ctx-cancellation bug was two EDS updates in flight at once: a stuck scale-in followed by an incoming scale-out. The intended abort-on-spec-change contract failed to hold inside one retry loop in the drain code.
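
The contract that failed can be illustrated with a retry loop that checks for cancellation on every iteration. This is a sketch only: the operator itself is written in Go and uses a context for this, whereas the example below substitutes a `threading.Event`, and the function name, parameters, and return values are hypothetical.

```python
import threading
from typing import Callable


def drain_with_retries(
    is_drained: Callable[[], bool],
    cancelled: threading.Event,
    max_attempts: int = 5,
    backoff_seconds: float = 0.01,
) -> str:
    """Retry a drain check, aborting as soon as `cancelled` is set."""
    for _ in range(max_attempts):
        # Check before each attempt: a newer spec may have arrived.
        if cancelled.is_set():
            return "aborted"
        if is_drained():
            return "drained"
        # Sleep for the backoff, but wake immediately on cancellation.
        if cancelled.wait(backoff_seconds):
            return "aborted"
    return "gave_up"


superseded = threading.Event()
superseded.set()  # simulate a scale-out arriving during a stuck scale-in
print(drain_with_retries(lambda: False, superseded))
```

A loop that only sleeps and retries, without either cancellation check, keeps working on the stale scale-in until its attempts run out, which is the failure mode described above.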

Relation to StatefulSet¶

EDS is a higher-level abstraction than StatefulSet. The operator's reconciler maps EDS fields onto an underlying StatefulSet, so EDS users don't manage StatefulSets directly. StatefulSet semantics (notably highest-ordinal scale-in) still leak through, because the operator delegates pod lifecycle to the StatefulSet controller: it can't override the ordinal-based removal order without re-implementing pod lifecycle itself (cf. patterns/custom-operator-over-statefulset, the "replace StatefulSets entirely" architectural choice PlanetScale's Vitess Operator made).

Seen in¶

sources/2024-06-20-zalando-failing-to-auto-scale-elasticsearch-in-kubernetes — canonical wiki instance. Lounge mutates the replicas field via cron-triggered scale-in / scale-out; es-operator reconciles the StatefulSet underneath; two-updates-in-flight scenario exposed the retry-loop ctx-cancellation bug.

systems/es-operator — the operator consuming this CRD.
systems/elasticsearch — the workload described.
systems/kubernetes — the API substrate.
concepts/kubernetes-operator-pattern — the abstract pattern.
concepts/statefulset-highest-ordinal-scale-in — the StatefulSet semantic that leaks through.
concepts/operator-reconcile-abort-on-spec-change — the EDS-watch responsiveness contract.