CONCEPT Cited by 1 source
ElasticsearchDataSet (EDS)
Definition
ElasticsearchDataSet (EDS) is the Kubernetes custom resource that Zalando's es-operator uses to declaratively describe an Elasticsearch cluster. The operator reconciles a StatefulSet of Elasticsearch pods (and their volumes) against the EDS spec; every scale-in, scale-out, and configuration change is made by mutating the EDS, not by touching pods or the StatefulSet directly.
"Es-operator defines a Kubernetes custom resource, ElasticsearchDataSet (EDS), that describes the Elasticsearch cluster. It monitors changes to it and maintains a StatefulSet that consists of pods and volumes that implement the Elasticsearch nodes."
(Source: sources/2024-06-20-zalando-failing-to-auto-scale-elasticsearch-in-kubernetes)
What belongs in the spec
The canonical fields an EDS exposes (derived from the es-operator README and the post's narrative):
- Replica count — the number of Elasticsearch nodes. The cronjob-based scaling in the Zalando Lounge post mutates exactly this field.
- Resource requests / limits — CPU, memory, `ephemeral-storage`.
- Pod template — container image (specific Elasticsearch version), volume mounts, env vars.
- Scaling behaviour — thresholds for the operator's autoscaler, if enabled.
- Persistent volume claims (PVC) — optional; if absent, ephemeral storage is used (the Zalando Lounge choice).
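As a rough illustration of how the fields listed above fit together, the following sketch builds an EDS-shaped manifest as a plain Python dict. The `apiVersion`, the exact key names under `spec`, the resource quantities, and the helper name `minimal_eds` are assumptions modelled on StatefulSet conventions, not a copy of the operator's schema; check the es-operator README before relying on any of them.

```python
def minimal_eds(name: str, replicas: int, image: str) -> dict:
    """Build an EDS-shaped manifest as a plain dict.

    Key names and the apiVersion are assumptions for illustration only.
    """
    return {
        "apiVersion": "zalando.org/v1",  # assumed group/version
        "kind": "ElasticsearchDataSet",
        "metadata": {"name": name},
        "spec": {
            # Replica count: the field the cron-based scaling mutates.
            "replicas": replicas,
            # Scaling behaviour: the operator's autoscaler is left off here.
            "scaling": {"enabled": False},
            # Pod template: image, resources, mounts, env vars.
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": "elasticsearch",
                            "image": image,
                            "resources": {
                                # Illustrative quantities, not recommendations.
                                "requests": {
                                    "cpu": "1",
                                    "memory": "4Gi",
                                    "ephemeral-storage": "10Gi",
                                },
                                "limits": {
                                    "memory": "4Gi",
                                    "ephemeral-storage": "10Gi",
                                },
                            },
                        }
                    ]
                }
            },
            # No volume claim section: data lives on ephemeral storage,
            # which is the choice the Lounge team made.
        },
    }
```

Passing the image in as a parameter keeps the sketch from asserting any particular Elasticsearch version.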
Role in the scaling flow
The Lounge team's scale-in / scale-out flow:
- Kubernetes CronJob fires (nightly scale-down, morning scale-up, experimental workload scale-down).
- CronJob `kubectl patch`-es the EDS's `replicas` field.
- systems/es-operator observes the EDS change via a watch.
- Operator computes the diff between current StatefulSet state and desired EDS state.
- Operator executes the change: on scale-in, drain the highest-ordinal pod and remove it; on scale-out, extend the StatefulSet.
- Operator writes status back to the EDS resource.
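The first half of the flow above can be sketched in a few lines: the body a CronJob might send to change only the replica count, and the comparison the operator makes between current and desired size. The function names `replicas_patch` and `diff_replicas` are hypothetical, and the real operator's diff covers more than a replica count. Custom resources do not support strategic merge patches, so a `kubectl patch` against an EDS would need `--type merge` (or `--type json`).

```python
import json


def replicas_patch(replicas: int) -> str:
    """Return a JSON merge patch body that touches only spec.replicas."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return json.dumps({"spec": {"replicas": replicas}})


def diff_replicas(current: int, desired: int) -> tuple[str, int]:
    """Classify the change between the StatefulSet's size and the EDS spec.

    Returns the direction and the number of nodes to add or remove.
    """
    if desired > current:
        return ("scale-out", desired - current)
    if desired < current:
        return ("scale-in", current - desired)
    return ("noop", 0)
```

For example, `diff_replicas(6, 4)` classifies a nightly scale-down from six nodes to four as a scale-in of two nodes.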
Having two EDS updates in flight at once (a stuck scale-in and an incoming scale-out) is the scenario that exposed es-operator's ctx-cancellation bug: the intended abort-on-spec-change contract failed to hold inside one retry loop in the drain code.
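The contract at stake can be illustrated with a small analogue. This is not es-operator's code (the operator is written in Go and uses a context for cancellation); it is a sketch in which a `threading.Event` stands in for the cancelled context, and `drain_with_retries` and `try_drain` are hypothetical names. The point is that a retry loop has to check for cancellation on every iteration, including while it waits between attempts, or a stuck scale-in will keep retrying after a newer EDS spec has arrived.

```python
import threading
from typing import Callable


def drain_with_retries(
    try_drain: Callable[[], bool],
    cancelled: threading.Event,
    max_attempts: int,
    wait_seconds: float = 0.0,
) -> str:
    """Retry a drain step, aborting as soon as `cancelled` is set."""
    for _ in range(max_attempts):
        # Check before each attempt: a newer spec supersedes this operation.
        if cancelled.is_set():
            return "aborted"
        if try_drain():
            return "drained"
        # Wait between attempts, but wake up early on cancellation.
        # Event.wait returns True when the event is set.
        if cancelled.wait(wait_seconds):
            return "aborted"
    return "gave up"
```

A loop that sleeps unconditionally and never consults the cancellation signal would run all `max_attempts` iterations regardless of spec changes, which is the shape of failure described above.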
Relation to StatefulSet
EDS is a higher-level abstraction than StatefulSet. The operator's reconciler maps EDS fields onto an underlying StatefulSet, so users of EDS don't manage StatefulSets directly. The StatefulSet semantics (notably highest-ordinal scale-in) still leak through, because the operator delegates pod lifecycle to the StatefulSet controller: the EDS can't override the ordinal-based removal order without re-implementing pod lifecycle itself (cf. patterns/custom-operator-over-statefulset, the "replace StatefulSets entirely" architectural choice that PlanetScale's Vitess Operator made).
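To make the leaking semantic concrete, this sketch lists which pods a StatefulSet removes when it shrinks. StatefulSet pods are named `<statefulset-name>-<ordinal>` with ordinals starting at 0 by default, and they are terminated in reverse ordinal order. The helper name is hypothetical, and the example name `es-data` is an assumption rather than a name from the source.

```python
def pods_removed_on_scale_in(statefulset_name: str, current: int, desired: int) -> list[str]:
    """List the pods removed when shrinking from `current` to `desired` replicas.

    Removal order is highest ordinal first; the caller cannot pick other pods.
    """
    if not 0 <= desired <= current:
        raise ValueError("expected 0 <= desired <= current")
    return [
        f"{statefulset_name}-{ordinal}"
        for ordinal in range(current - 1, desired - 1, -1)
    ]


# Shrinking from 5 to 3 replicas always removes ordinals 4 and 3, in that order.
print(pods_removed_on_scale_in("es-data", 5, 3))  # ['es-data-4', 'es-data-3']
```

Nothing in the function's inputs describes pod load or shard placement, which mirrors the constraint: the removal set is fully determined by the two replica counts.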
Seen in
- sources/2024-06-20-zalando-failing-to-auto-scale-elasticsearch-in-kubernetes — canonical wiki instance. Lounge mutates the `replicas` field via cron-triggered scale-in / scale-out; es-operator reconciles the StatefulSet underneath; two-updates-in-flight scenario exposed the retry-loop ctx-cancellation bug.
Related
- systems/es-operator — the operator consuming this CRD.
- systems/elasticsearch — the workload described.
- systems/kubernetes — the API substrate.
- concepts/kubernetes-operator-pattern — the abstract pattern.
- concepts/statefulset-highest-ordinal-scale-in — the StatefulSet semantic that leaks through.
- concepts/operator-reconcile-abort-on-spec-change — the EDS-watch responsiveness contract.