
SYSTEM Cited by 1 source

Databricks Endpoint Discovery Service (EDS)

Databricks' in-house xDS control plane for Kubernetes service discovery and load balancing. A lightweight server that watches the Kubernetes API for Services and EndpointSlices, maintains a live topology view (zone, readiness, shard labels per pod), and streams that topology as xDS / EDS responses to two kinds of clients:

  1. Armeria RPC clients embedded in Scala services (internal service-to-service traffic).
  2. Envoy ingress gateways via standard xDS, programming ClusterLoadAssignment resources (public-facing traffic).

The key design point is a single source of truth: internal and external traffic route off the same endpoint state, with the same health and zone metadata, and neither CoreDNS nor kube-proxy sits on the critical path.

What it replaces

  • CoreDNS for intra-cluster service resolution (kept for compatibility, off critical path).
  • kube-proxy's per-connection L4 pod selection, which causes traffic skew on long-lived HTTP/2 / gRPC connections.
  • Kubernetes ClusterIP → pod-IP NAT shim in the kernel.
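
The skew from per-connection pod selection can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of kube-proxy itself: the pod names, client counts, and the deterministic round-robin pick are all hypothetical (kube-proxy's actual choice is typically random), but the effect is the same whenever a few long-lived connections each carry many requests.

```python
from collections import Counter
from itertools import cycle


def per_connection_balancing(num_clients, requests_per_client, pods):
    """L4 style: a pod is chosen once when the connection opens, and every
    request multiplexed on that connection lands on the same pod."""
    load = Counter({pod: 0 for pod in pods})
    pick = cycle(pods)  # assumption: round-robin stands in for the real picker
    for _ in range(num_clients):
        pod = next(pick)  # decided at connect time only
        load[pod] += requests_per_client
    return load


def per_request_balancing(num_clients, requests_per_client, pods):
    """Client-side style: the client knows every endpoint and spreads
    individual requests across all of them."""
    load = Counter({pod: 0 for pod in pods})
    for _ in range(num_clients):
        pick = cycle(pods)
        for _ in range(requests_per_client):
            load[next(pick)] += 1
    return load


pods = ["pod-a", "pod-b", "pod-c", "pod-d"]  # hypothetical pod names
print(per_connection_balancing(2, 1000, pods))
print(per_request_balancing(2, 1000, pods))
```

With two long-lived connections and four pods, the per-connection variant leaves two pods idle while the other two take 1,000 requests each; the per-request variant gives every pod 500.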

What it emits

  • xDS responses to subscribing clients (LDS/CDS/EDS equivalents; the source post focuses on EDS).
  • Endpoint metadata: zone, readiness, shard label, pod health as observed via EndpointSlices.
  • For Envoy consumers: ClusterLoadAssignment resources so ingress routing matches internal routing.
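
The projection from EndpointSlices to a ClusterLoadAssignment can be sketched as below. This is a minimal illustration, not Databricks' implementation: it works on plain dicts shaped like the Kubernetes `discovery.k8s.io/v1` EndpointSlice and the Envoy `ClusterLoadAssignment` message, and the function name `project_to_cla`, the `shard_by_pod` lookup, and the choice to carry the shard under the `envoy.lb` metadata key are assumptions made for the example.

```python
from collections import defaultdict


def project_to_cla(service, slices, shard_by_pod=None):
    """Group EndpointSlice endpoints by zone into a ClusterLoadAssignment-shaped dict.

    shard_by_pod is a hypothetical pod-name -> shard-label map; shard labels
    are not part of EndpointSlice and would have to come from pod labels.
    """
    shard_by_pod = shard_by_pod or {}
    by_zone = defaultdict(list)
    for sl in slices:
        ports = [p["port"] for p in sl.get("ports", []) if p.get("port") is not None]
        for ep in sl.get("endpoints", []):
            # Kubernetes treats an unset `ready` condition as ready.
            ready = ep.get("conditions", {}).get("ready") is not False
            pod = (ep.get("targetRef") or {}).get("name", "")
            zone = ep.get("zone") or ""
            for addr in ep.get("addresses", []):
                for port in ports:
                    lb_endpoint = {
                        "endpoint": {
                            "address": {
                                "socket_address": {"address": addr, "port_value": port}
                            }
                        },
                        "health_status": "HEALTHY" if ready else "UNHEALTHY",
                    }
                    if pod in shard_by_pod:
                        # Assumption: shard exposed as endpoint metadata.
                        lb_endpoint["metadata"] = {
                            "filter_metadata": {"envoy.lb": {"shard": shard_by_pod[pod]}}
                        }
                    by_zone[zone].append(lb_endpoint)
    return {
        "cluster_name": service,
        "endpoints": [
            {"locality": {"zone": zone}, "lb_endpoints": by_zone[zone]}
            for zone in sorted(by_zone)
        ],
    }
```

Because both consumer types would read the same projected structure, a pod marked not-ready in its EndpointSlice shows up as unhealthy to the RPC clients and to the ingress gateways at the same time.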

Design notes

  • Bypasses DNS entirely on the critical path. DNS caching and lack of metadata were the explicit motivation; EDS pushes updates rather than relying on TTLs.
  • Read-only projection of Kubernetes state — it doesn't own the truth, it reprojects EndpointSlices into a streaming, topology-aware feed.
  • Horizontal scaling is implicit: clients subscribe only to the services they depend on, so the control plane's work grows with the number of watched objects and projected updates (O(watch + projection)), not with request volume (O(request)).
  • Same control plane for multiple consumer shapes (RPC client library + Envoy) — the xDS protocol is the compatibility layer.
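
The push-based, subscription-scoped behaviour described in these notes can be sketched in a few lines. This is a toy in-process model under stated assumptions, not the xDS wire protocol: `EndpointHub`, its method names, and the callback signature are hypothetical, and a real control plane would stream versioned responses over gRPC instead of invoking callbacks.

```python
from collections import defaultdict


class EndpointHub:
    """Keeps one versioned endpoint snapshot per service and pushes changes
    only to the clients subscribed to that service."""

    def __init__(self):
        self._snapshots = {}                    # service -> (version, endpoints)
        self._subscribers = defaultdict(list)   # service -> list of callbacks

    def subscribe(self, service, on_update):
        """Register interest in one service; replay the current snapshot if any."""
        self._subscribers[service].append(on_update)
        if service in self._snapshots:
            version, endpoints = self._snapshots[service]
            on_update(service, version, endpoints)

    def on_watch_event(self, service, endpoints):
        """Called when the Kubernetes watch reports a new endpoint set."""
        endpoints = tuple(sorted(endpoints))
        version, current = self._snapshots.get(service, (0, None))
        if endpoints == current:
            return  # nothing changed, so nothing is pushed
        version += 1
        self._snapshots[service] = (version, endpoints)
        for on_update in self._subscribers.get(service, ()):
            on_update(service, version, endpoints)


hub = EndpointHub()
hub.subscribe("orders", lambda svc, ver, eps: print(svc, ver, eps))
hub.on_watch_event("orders", ["10.0.0.2:8080", "10.0.0.1:8080"])  # pushed to the subscriber
hub.on_watch_event("billing", ["10.0.1.1:8080"])                  # no subscriber, no push
```

Work in this model happens only when the watched state changes or a client subscribes; serving a request never touches the hub, which is the property the scaling note relies on.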

