CONCEPT Cited by 1 source
Leader election
Leader election is a coordination primitive in distributed systems in which multiple instances (or nodes) agree on exactly one of them to perform a particular role or action at any given time. This prevents duplicate work and provides single-writer semantics.
Definition
A leader election mechanism must guarantee:
- At most one leader at any time (safety)
- Eventually some leader is elected when one is needed (liveness)
- Other nodes can determine who the current leader is (leader detection)
Common implementations: ZooKeeper ephemeral nodes, etcd leases, DynamoDB conditional writes, Raft consensus.
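As a rough illustration of the lease-style approach shared by several of these implementations, here is a minimal in-memory sketch. The `LeaseStore` class and its methods are hypothetical stand-ins for a store that supports atomic conditional writes; they are not the API of ZooKeeper, etcd, or DynamoDB. Time is passed in explicitly so the behaviour is deterministic.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: str
    expires_at: float


class LeaseStore:
    """Hypothetical in-memory stand-in for a store with atomic conditional writes."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._lease: Optional[Lease] = None

    def try_acquire(self, node_id: str, ttl: float, now: float) -> bool:
        """Take or renew the lease if it is free, expired, or already held by node_id."""
        with self._lock:
            lease = self._lease
            if lease is None or lease.expires_at <= now or lease.holder == node_id:
                self._lease = Lease(node_id, now + ttl)
                return True
            return False

    def current_leader(self, now: float) -> Optional[str]:
        """Leader detection: return the holder of the unexpired lease, if any."""
        with self._lock:
            if self._lease is not None and self._lease.expires_at > now:
                return self._lease.holder
            return None


store = LeaseStore()
a_wins = store.try_acquire("node-a", ttl=10.0, now=0.0)        # lease is free
b_wins = store.try_acquire("node-b", ttl=10.0, now=1.0)        # node-a still holds it
leader_early = store.current_leader(now=5.0)
b_wins_later = store.try_acquire("node-b", ttl=10.0, now=11.0)  # node-a's lease expired
leader_late = store.current_leader(now=12.0)
```

In this sketch the expiring lease gives liveness (a crashed leader is eventually replaced), while the conditional write gives safety only as long as nodes agree reasonably well on the current time.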
Netflix Data Canary instance
During orchestrator deployments, multiple instances of the Data Canary Orchestrator might be running simultaneously. Leader election ensures only one experiment is triggered per version announcement. Without it, each instance would independently detect the new version and trigger redundant (or conflicting) experiments.
"During orchestrator deployments, multiple instances might be running simultaneously. We implemented safeguards to ensure only one experiment is triggered per version announcement."
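The source does not describe how these safeguards are built. One plausible way to get "one experiment per version announcement" is an atomic claim keyed by the version, sketched below. `TriggerLog`, `on_version_announced`, and the version string `"v42"` are hypothetical names for illustration, not part of the Netflix system.

```python
import threading
from typing import Dict, List, Tuple


class TriggerLog:
    """Hypothetical shared record of which instance claimed each version."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._claimed: Dict[str, str] = {}

    def claim(self, version: str, instance_id: str) -> bool:
        """Atomically claim a version; only the first caller gets True."""
        with self._lock:
            if version in self._claimed:
                return False
            self._claimed[version] = instance_id
            return True


def on_version_announced(
    log: TriggerLog, version: str, instance_id: str, started: List[Tuple[str, str]]
) -> None:
    # Every instance sees the announcement, but only the claimant triggers.
    if log.claim(version, instance_id):
        started.append((version, instance_id))


log = TriggerLog()
started: List[Tuple[str, str]] = []

# Five orchestrator instances detect the same announcement concurrently.
threads = [
    threading.Thread(
        target=on_version_announced,
        args=(log, "v42", f"orchestrator-{i}", started),
    )
    for i in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever instance wins the claim acts as the leader for that announcement; the others observe the existing claim and do nothing.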
General applicability
Leader election appears whenever:
- A scheduled job must run exactly once across a fleet
- A coordination action (triggering an experiment, initiating failover) must not be duplicated
- A system-of-record writer must be singleton to avoid split-brain
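For the singleton-writer case, a commonly used complement to leader election is a fencing token: each new leader receives a larger token, and the store rejects writes that carry an older one. The sketch below assumes a hypothetical `FencedStore` and hand-assigned token values; it illustrates the idea only, not any particular system's API.

```python
from typing import Optional


class FencedStore:
    """Hypothetical system of record that rejects writes from stale leaders."""

    def __init__(self) -> None:
        self.highest_token = 0
        self.value: Optional[str] = None

    def write(self, token: int, value: str) -> bool:
        """Accept the write only if the token is at least the highest seen so far."""
        if token < self.highest_token:
            return False  # writer was superseded by a newer leader
        self.highest_token = token
        self.value = value
        return True


store = FencedStore()
old_leader_token = 1  # issued when the first leader was elected
new_leader_token = 2  # issued after a later election

first = store.write(old_leader_token, "from old leader")
second = store.write(new_leader_token, "from new leader")
# The old leader was paused and still believes it is in charge.
stale = store.write(old_leader_token, "late write from old leader")
```

Here the late write from the superseded leader is rejected, so the record keeps the new leader's value even though both nodes briefly believed they were the leader.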