
CONCEPT Cited by 1 source

Hot-code swapping

Definition

Hot-code swapping (live-code reload) is the ability to replace the compiled code of a running process — without restarting it — so new requests execute the new code while in-flight requests finish on the old code. It's distinct from config reload (no code change) and from blue-green deployment (whole process replaced).

The canonical wiki instance is Meta's Sigma anti-abuse rule engine, where every commit to the policy repository is expected to reach the fleet "in minutes" without dropping requests (sources/2015-06-26-meta-fighting-spam-with-haskell).

Three enabling conditions (Sigma)

Changing running code is a hard problem in general. Sigma sidesteps the general case by relying on three domain-specific properties:

  1. Requests are short-lived. No need to migrate a running request from old to new code mid-execution — the old request is allowed to complete on the old code; only new requests land on the new code.
  2. Code for persistent state is never hot-swapped. Only stateless policy code is swapped; data-layer invariants hold across the transition.
  3. The garbage collector detects when the old code is no longer in use. GHC's runtime has a reachability-based garbage collector, so loaded object code can be treated like any other managed memory: the GC can observe when no live thread still holds a reference into it, and at that point the code is safe to unload. Meta contributed the specific GC change enabling this detection upstream to GHC.
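
The first and third conditions can be illustrated with a minimal sketch. It is written in Python rather than Haskell and is not Sigma's code: `PolicyVersion`, `load`, and the `unloaded` list are hypothetical names, and `weakref.finalize` stands in for the GHC runtime noticing that old object code is unreachable.

```python
import gc
import weakref


class PolicyVersion:
    """Hypothetical stand-in for one loaded version of policy code."""

    def __init__(self, name, verdict):
        self.name = name
        self.verdict = verdict

    def classify(self, request):
        return f"{self.verdict}:{request}"


unloaded = []  # names of versions the runtime has reclaimed


def load(name, verdict):
    version = PolicyVersion(name, verdict)
    # The callback fires only once nothing references the version any more,
    # which is the point at which unloading it is safe.
    weakref.finalize(version, unloaded.append, name)
    return version


current = load("v1", "allow")

# An in-flight request keeps a reference to the version it started on.
in_flight = current

# Swap: new requests will see v2, but v1 is still reachable via in_flight.
current = load("v2", "deny")
gc.collect()
still_loaded_during_request = "v1" not in unloaded

old_result = in_flight.classify("req-1")  # old request finishes on old code
new_result = current.classify("req-2")    # new request runs on new code

# The old request completes and drops its reference; v1 becomes unreachable.
del in_flight
gc.collect()
```

Nothing in the sketch tells the runtime that `v1` is finished. The version is reclaimed as a side effect of the last request letting go of it.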

Mechanism

  • Loading. Sigma uses GHC's built-in runtime linker to load freshly compiled policy object code. Meta's post notes that "in principle, we could use the system dynamic linker" — the choice is pragmatic.
  • Routing. Requests that arrive after the swap are served by the new code; requests already in flight continue on the old code until they finish.
  • Unloading. The GC detects that no live request is still referencing the old code and triggers its removal from the process's address space.
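
The load, route, and unload steps above can be sketched as follows. This is a Python analogue under stated assumptions, not Sigma's implementation: `PolicyHost`, its methods, and the `pinned` / `resume` hooks are hypothetical, and `exec` on a fresh module object stands in for GHC's runtime linker loading object code.

```python
import threading
import types


class PolicyHost:
    """Hypothetical host serving requests against swappable policy code.

    Old modules are never unloaded explicitly: once no request holds one,
    Python's memory management reclaims it.
    """

    def __init__(self, source):
        self._lock = threading.Lock()
        self._version = 0
        self._module = None
        self.swap(source)

    def swap(self, source):
        # Loading: build a fresh module object from the new policy source.
        # Assumes a single deployer thread calls swap().
        version = self._version + 1
        module = types.ModuleType(f"policy_v{version}")
        exec(compile(source, module.__name__, "exec"), module.__dict__)
        with self._lock:
            self._version, self._module = version, module

    def handle(self, request, pinned=None, resume=None):
        # Routing: a request reads the current module exactly once, at entry.
        with self._lock:
            module = self._module
        if pinned is not None:
            pinned.set()    # demo hook: signal that this request has started
        if resume is not None:
            resume.wait()   # demo hook: keep this request in flight
        return module.classify(request)


V1 = "def classify(request):\n    return 'allow:' + request\n"
V2 = "def classify(request):\n    return 'deny:' + request\n"

host = PolicyHost(V1)
results = {}
pinned, resume = threading.Event(), threading.Event()

worker = threading.Thread(
    target=lambda: results.update(in_flight=host.handle("req-1", pinned, resume))
)
worker.start()
pinned.wait()            # req-1 is now in flight on v1
host.swap(V2)            # deploy v2 while req-1 is still running
results["new"] = host.handle("req-2")
resume.set()             # let req-1 finish
worker.join()
```

Here `req-1` completes on the version it started with even though the swap happened mid-request, while `req-2` sees the new version. The lock only guards the pointer read and write, so a swap does not block on running requests.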

Why this matters

Hot-code swapping is the runtime property that makes Sigma's operational posture, "source code in the repository is the code running in Sigma" (patterns/rule-engine-with-continuous-policy-deploy), fast in practice. Without it, every policy change would require a rolling restart of a large fleet, so commit-to-production latency would be bounded below by how quickly the fleet can be restarted safely.

Other runtimes offer related mechanisms:

  • Erlang / OTP has hot-code reload as a language-level feature, with explicit support for in-flight process migration between old and new module versions. Sigma's mechanism is simpler by design (no in-flight migration; request-granular old→new cutover).
  • JVM HotSwap (JVMTI RedefineClasses, exposed to Java agents as Instrumentation.redefineClasses) provides a partial variant, subject to JVM constraints: method bodies can change, but method signatures, fields, and the class hierarchy cannot.
  • eBPF programs can be live-replaced in the kernel, for example by atomically updating the program behind an attachment link with the bpf() BPF_LINK_UPDATE command, with different semantics (no request concept; attachment-point swap).

The Sigma-style (GHC-style) mechanism's distinctive property is GC-assisted safe unload: the runtime doesn't need to be told "the old code is done" — reachability analysis discovers it.
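
For contrast, here is a minimal sketch of the explicit alternative, where the runtime does have to be told when a request is done with a version. It is a single-threaded Python illustration with hypothetical names (`CountedVersion`, `acquire`, `release`, `retire`), not something described in Meta's post.

```python
class CountedVersion:
    """Hypothetical code version tracked by an explicit in-use counter."""

    def __init__(self, name):
        self.name = name
        self.in_use = 0
        self.retired = False
        self.unloaded = False

    def acquire(self):
        # Every request must call this before running on the version.
        self.in_use += 1

    def release(self):
        # Every request must call this when it finishes, on every exit path.
        self.in_use -= 1
        self._maybe_unload()

    def retire(self):
        # Called by the deployer once a newer version is serving new requests.
        self.retired = True
        self._maybe_unload()

    def _maybe_unload(self):
        if self.retired and self.in_use == 0:
            self.unloaded = True


v1 = CountedVersion("v1")
v1.acquire()                      # a request starts on v1
v1.retire()                       # v2 is deployed; v1 has to wait
unloaded_while_busy = v1.unloaded
v1.release()                      # the request finishes
unloaded_after_drain = v1.unloaded
```

A missed `release` keeps the old code loaded indefinitely, and an extra one unloads it while still in use. Deriving "done" from reachability removes that bookkeeping from request code.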

Seen in
