CONCEPT Cited by 1 source

Split-instruction race window

Definition

A split-instruction race window is the class of correctness bugs that arises when a compiler or assembler expresses a logically single update to runtime-observable state as multiple machine opcodes, creating an interval between those opcodes (as short as one instruction) during which the state is partially updated and therefore inconsistent.

If any concurrent observer can read the state during that interval, the bug is reachable. The window is as narrow as "the shortest preemption boundary" — on modern CPUs this is one instruction — but narrowness does not make the bug rare at scale.

The three conditions

All three must hold for this failure class:

  1. The target of the update is observable by some concurrent actor. On modern runtimes the actor is usually the userspace scheduler, garbage collector, or signal handler. Most commonly the target is:

    • The stack pointer sp (read by stack unwinders).
    • The frame pointer / saved link register (read by tracebacks).
    • A pointer visible to a concurrent write barrier or GC scan.
    • Any shared variable whose individual-step updates are not atomic w.r.t. the observer.
  2. The update is compiled to multiple opcodes. Usually forced by ISA encoding limits — see concepts/immediate-encoding-limit. A fixed-length ISA like ARM64 cannot encode a wide immediate in one ADD and must decompose.

  3. The concurrent observer can fire between any two opcodes. Under cooperative scheduling this is rare, because the observer only runs at yield points and a yield point seldom falls inside the split sequence. Under async preemption (concepts/async-preemption-go) the observer can run at any instruction boundary, so the race window is fully reachable.
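The three conditions can be illustrated with a small simulation. This is a minimal sketch, not runtime code: the function names `boundaries` and `tornReads` and the sample values are hypothetical, and the cooperative case assumes no yield point lies inside the split sequence.

```go
package main

import "fmt"

// boundaries returns the simulated sp at every instruction boundary:
// before the first opcode, between each pair, and after the last.
func boundaries(sp uint64, adds []uint64) []uint64 {
	out := []uint64{sp}
	for _, a := range adds {
		sp += a
		out = append(out, sp)
	}
	return out
}

// tornReads returns the sp values an observer could see that match
// neither the old value nor the fully updated one.
// Assumption: with asyncPreempt == false the observer only runs at
// yield points, and none lies inside the sequence.
func tornReads(sp uint64, adds []uint64, asyncPreempt bool) []uint64 {
	b := boundaries(sp, adds)
	if !asyncPreempt || len(b) < 3 {
		return nil // no interior boundary is reachable
	}
	old, final := b[0], b[len(b)-1]
	var torn []uint64
	for _, v := range b[1 : len(b)-1] {
		if v != old && v != final {
			torn = append(torn, v)
		}
	}
	return torn
}

func main() {
	const sp = 0x1000 // hypothetical starting stack pointer
	split := []uint64{0x10000, 0x10}
	single := []uint64{0x10010}

	fmt.Printf("split, cooperative: %#x\n", tornReads(sp, split, false))
	fmt.Printf("split, async:       %#x\n", tornReads(sp, split, true))
	fmt.Printf("single, async:      %#x\n", tornReads(sp, single, true))
}
```

Only the combination of a multi-opcode update and an observer that can run at any boundary yields a torn read; removing either condition closes the window.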

Canonical wiki instance

sources/2025-10-08-cloudflare-we-found-a-bug-in-gos-arm64-compiler: Go's arm64 compiler emitted the function epilogue's stack pointer adjustment as two ADD opcodes because the 24-bit offset exceeded the ISA's 12-bit ADD immediate. Between the two opcodes, sp pointed into the middle of the stack frame. Go's async preemption ran the GC stack scanner exactly in that window often enough — at Cloudflare's scale of 84 M req/s across 330 cities — to crash ~30 times per day across <10 % of data centers.

The fix eliminates the split by emitting an indivisible register-form ADD using a scratch register: patterns/preemption-safe-compiler-emit.
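The arithmetic behind the split can be sketched as follows. This assumes the arm64 ADD immediate form (a 12-bit unsigned immediate, optionally shifted left by 12) and models only the values written to sp. The helper names, the sample offset, and the order in which the two halves are applied are illustrative assumptions, not taken from the Go compiler.

```go
package main

import "fmt"

const (
	immBits = 12
	immMask = 1<<immBits - 1
)

// splitOffset decomposes off into the two 12-bit immediates of a
// two-ADD sequence, so that off == hi<<12 + lo.
// ok is false when off does not fit in 24 bits.
func splitOffset(off uint64) (hi, lo uint64, ok bool) {
	if off >= 1<<(2*immBits) {
		return 0, 0, false
	}
	return off >> immBits, off & immMask, true
}

// splitTrace lists the value of sp after each opcode of the split form.
// The first entry is the intermediate value an observer could see.
func splitTrace(sp, off uint64) []uint64 {
	hi, lo, ok := splitOffset(off)
	if !ok {
		return nil
	}
	mid := sp + hi<<immBits // sp now points into the middle of the frame
	return []uint64{mid, mid + lo}
}

// scratchTrace lists the value of sp after each opcode that writes sp
// when the offset is first built in a scratch register. Building the
// constant may itself take several opcodes, but none of them touch sp.
func scratchTrace(sp, off uint64) []uint64 {
	scratch := off
	return []uint64{sp + scratch} // one register-form ADD updates sp
}

func main() {
	const sp, off = 0x4000_0000, 0x10010 // hypothetical values
	fmt.Printf("split:   %#x\n", splitTrace(sp, off))
	fmt.Printf("scratch: %#x\n", scratchTrace(sp, off))
}
```

The scratch-register form moves the multi-step work onto a register no observer inspects, so sp only ever holds its old or its new value.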

Why narrowness doesn't make the bug rare at scale

A one-instruction window lasts ~1 ns on a ~1 GHz core: executed once per second, it occupies one part in 10⁹ of wall time. The relevant ratio, however, is preemptions landing in that window / all preemptions. With many goroutines, frequent GC cycles (each walking every stack), and many cores generating preemptions every second, the absolute count of hits can comfortably reach dozens per day across a fleet. Cloudflare's post frames this explicitly: "The sort of bug that can only really be quantified at a large scale."
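A back-of-the-envelope version of that ratio is sketched below. It assumes preemptions arrive at times independent of where the code is executing, so the chance of a hit equals the fraction of running time spent inside the window. Every input in `main` is a hypothetical placeholder, not a figure from Cloudflare's post.

```go
package main

import "fmt"

const secondsPerDay = 86400

// hitProbability estimates the chance that a single preemption lands
// inside the window: the fraction of running time spent there.
// windowEntriesPerSecond is how often the split sequence executes;
// windowSeconds is how long sp stays inconsistent each time.
func hitProbability(windowEntriesPerSecond, windowSeconds float64) float64 {
	p := windowEntriesPerSecond * windowSeconds
	if p > 1 {
		p = 1
	}
	return p
}

// expectedHitsPerDay scales that probability by the number of
// preemptions issued per day across the whole fleet.
func expectedHitsPerDay(windowEntriesPerSecond, windowSeconds, preemptionsPerSecond float64) float64 {
	p := hitProbability(windowEntriesPerSecond, windowSeconds)
	return preemptionsPerSecond * secondsPerDay * p
}

func main() {
	// Hypothetical inputs, chosen only to show the shape of the estimate.
	entries := 1000.0    // affected epilogues executed per second
	window := 1e-9       // seconds per window (one instruction at ~1 GHz)
	preemptions := 100.0 // fleet-wide preemptions per second

	fmt.Printf("per-preemption probability: %g\n", hitProbability(entries, window))
	fmt.Printf("expected hits per day:      %g\n", expectedHitsPerDay(entries, window, preemptions))
}
```

The per-preemption probability stays tiny, but the expected count grows linearly with the fleet-wide preemption rate, which is why the bug only becomes measurable at scale.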

Distinguishing features from other race classes

  • Not a data race on a shared variable — the update is local (to one goroutine's stack), but the observer is external (the runtime scheduler). Classical data-race tools (-race flag, ThreadSanitizer) would not find this — the race is between user code and the runtime, at an instruction boundary that user code has no name for.
  • Not a volatile / memory-ordering issue: any ordering model sees the updates in order; the problem is that the observer reads the state mid-sequence and acts on the intermediate value as if it were a complete update.
  • Not an ABI violation in the conventional sense — the calling convention is respected at function entry and exit; it's only during the epilogue transition that state is inconsistent.

The Go runtime goes to some length to ensure write barriers (the GC-support code emitted around pointer writes) are preemption-safe, which is an analogous shape: the compiler-emitted sequence "write pointer + record it for the GC" must not be observable in the middle. Any compiler bug that breaks that atomicity is the same class of defect as this arm64 stack-pointer bug.

Seen in
